Getting started

This dataset is kept under version control in a git repository hosted on GitHub, and the documentation is published with GitHub Pages. The repository is https://github.com/philip-brohan/Auto-transcription-benchmark-2-Fake-data.

If you are familiar with GitHub, you already know what to do (fork or clone the repository). If you’d prefer not to bother with that, you can download the whole thing as a zip file.

As well as downloading the scripts, some setup is necessary to run them successfully:

These scripts need to know where to put their output files. They rely on an environment variable, SCRATCH, which should be set to a directory with plenty of free disc space.
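As a minimal sketch, assuming a bash-like shell, SCRATCH could be set as shown here. The directory name is a placeholder chosen for illustration, not a path the scripts require.

```shell
# Hypothetical location: substitute any directory with plenty of free space.
export SCRATCH="${HOME}/scratch/atb2"

# Create the directory if it does not already exist.
mkdir -p "${SCRATCH}"

# Report the setting only if the directory is writable.
test -w "${SCRATCH}" && echo "SCRATCH is ${SCRATCH}"
```

To make the setting persist across sessions, the export line can be added to a shell startup file such as ~/.bashrc.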

These scripts will only work in a Python environment with the appropriate Python version and libraries available. I use conda to manage the required Python environment, which is specified in a YAML file:

name: atb2
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.7
  - matplotlib=3.2.*
# Tool to make the docs
  - sphinx=3.1.*
# Optional, code formatter
  - black

Install Anaconda or Miniconda, create and activate the environment specified in that YAML file, and all the scripts in this repository should run successfully.
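The create-and-activate step might look like the sketch below. It assumes the YAML above has been saved locally as environment.yml (a hypothetical file name; use whatever the file is actually called) and that it is run in an interactive shell where conda has already been initialised. The environment name atb2 comes from the name field of the YAML.

```shell
# Hypothetical file name: point this at the YAML environment specification.
ENV_FILE="environment.yml"

# Only proceed if conda is installed and the specification file exists.
if command -v conda >/dev/null 2>&1 && [ -f "${ENV_FILE}" ]; then
  # Build the environment described in the file.
  conda env create -f "${ENV_FILE}"
  # Switch to it; the name matches the "name:" entry in the YAML.
  conda activate atb2
else
  echo "conda or ${ENV_FILE} not found" >&2
fi
```

Once the environment is active, running python --version should report a 3.7 release, matching the pin in the dependency list.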