How to reproduce and extend this work
This project is designed to be easy to reproduce and extend. Everything involved is kept under version control in a git repository hosted on GitHub (the documentation is made with GitHub Pages): https://github.com/philip-brohan/Ship_logs_at_NARA. That repository contains everything you need to reproduce or extend this work.
As well as downloading the software, some setup is necessary to run it successfully:
These scripts need to know where to put their output files. They rely on one environment variable:
SCRATCH: set this variable to a directory with plenty of free disc space.
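As an illustration of how a script might use this variable, here is a minimal sketch that resolves an output directory under SCRATCH and fails early with a clear message if the variable is unset. The helper name scratch_dir and the sub-directory argument are hypothetical, chosen for this example; they are not taken from the repository.

```python
import os
from pathlib import Path


def scratch_dir(subdir=""):
    """Return a directory under $SCRATCH, creating it if it does not exist.

    Hypothetical helper: the scripts in the repository may organise
    their output differently.
    """
    root = os.environ.get("SCRATCH")
    if not root:
        raise RuntimeError(
            "Set the SCRATCH environment variable to a directory "
            "with plenty of free disc space."
        )
    path = Path(root) / subdir
    path.mkdir(parents=True, exist_ok=True)
    return path
```

A script could then write its output with something like scratch_dir("some_subdirectory") / "output.file", where both names are placeholders.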
These scripts will only work in an environment with the appropriate software and libraries available. I use conda to manage the required environment, which is specified in a yaml file:
    name: NARA
    channels:
      - default
      - conda-forge
    dependencies:
      - python=3.7
      - awscli
      # Optional, code formatter
      - black
      # Optional - documentation generator
      - sphinx=3.*.*
Install anaconda or miniconda, create and activate the environment in that yaml file, and all the scripts in this repository should run successfully.
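Once the environment is activated, a quick sanity check can confirm that it looks as expected. The sketch below is not part of the repository: it reports the running Python version and whether the command-line tools normally installed by the listed packages (aws from awscli, black, and sphinx-build from sphinx) can be found on the PATH. The function name check_environment is an assumption made for this example.

```python
import shutil
import sys

# Package name -> command it normally installs. This mapping is an
# assumption for illustration; black and sphinx are optional packages.
EXPECTED_TOOLS = {"awscli": "aws", "black": "black", "sphinx": "sphinx-build"}


def check_environment():
    """Report the Python version and which expected tools are on the PATH."""
    report = {"python": "{}.{}".format(*sys.version_info[:2])}
    for package, command in EXPECTED_TOOLS.items():
        report[package] = shutil.which(command) is not None
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(name, value)
```

In the environment described above, the python entry should read 3.7; a False against black or sphinx only matters if you want to reformat the code or rebuild the documentation.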