Getting started
This dataset is kept under version control in a git repository hosted on GitHub (the documentation is made with GitHub Pages). The repository is https://github.com/philip-brohan/Auto-transcription-benchmark-2-Fake-data.
If you are familiar with GitHub, you already know what to do (fork or clone the repository). If you’d prefer not to bother with that, you can download the whole thing as a zip file.
As well as downloading the scripts, some setup is necessary to run them successfully:
These scripts need to know where to put their output files. They rely on an environment variable SCRATCH - set this variable to a directory with plenty of free disc space.
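As a minimal sketch of how a script might pick up this variable, the helper below reads SCRATCH and creates an output directory beneath it. The function name `scratch_dir` and the subdirectory name `atb2_example` are hypothetical, chosen for illustration; the scripts in the repository may organise their output differently.

```python
import os
from pathlib import Path


def scratch_dir(subdir="atb2_example"):
    """Return an output directory under $SCRATCH, creating it if needed.

    `subdir` is a hypothetical name used only for this illustration.
    """
    root = os.environ.get("SCRATCH")
    if not root:
        # Fail early with a clear message rather than writing somewhere unexpected.
        raise RuntimeError(
            "SCRATCH is not set; point it at a directory with plenty of free space"
        )
    out = Path(root) / subdir
    out.mkdir(parents=True, exist_ok=True)
    return out
```

Calling `scratch_dir()` before writing any files gives an immediate, readable error if the variable has not been set.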
These scripts will only work in a python environment with the appropriate python version and libraries available. I use conda to manage the required python environment - which is specified in a yaml file:
name: atb2
channels:
- defaults
- conda-forge
dependencies:
- python=3.7
- matplotlib=3.2.*
# Tool to make the docs
- sphinx=3.1.*
# Optional, code formatter
- black
Install anaconda or miniconda, create and activate the environment in that yaml file, and all the scripts in this repository should run successfully.
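Once the environment is activated, a quick check along the following lines can confirm that the interpreter and libraries match the pins in the yaml file. This is only a sketch: the function name `check_environment` is hypothetical, and it assumes each library exposes a `__version__` attribute (matplotlib and sphinx both do).

```python
import importlib
import sys

# Pins copied from the yaml file; black is optional and unpinned, so it is not checked.
EXPECTED_PYTHON = (3, 7)
EXPECTED_LIBS = {"matplotlib": "3.2.", "sphinx": "3.1."}


def check_environment(expected_python=EXPECTED_PYTHON, expected_libs=EXPECTED_LIBS):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    found_python = tuple(sys.version_info[:2])
    if found_python != tuple(expected_python):
        problems.append(
            "python {}.{} found, expected {}.{}".format(*found_python, *expected_python)
        )
    for name, prefix in expected_libs.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            problems.append("{} is not installed".format(name))
            continue
        version = getattr(module, "__version__", "")
        if not version.startswith(prefix):
            problems.append(
                "{} {} found, expected {}*".format(name, version or "unknown", prefix)
            )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Run inside the activated environment, the script prints nothing when every pin is satisfied and one line per mismatch otherwise.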