Setting up the software environment
To run the scripts in this repository, you will need to set up an environment with the necessary compute resources.
First, you will need to assign some disc space for the output files. The scripts in this repository will write a lot of data to disc, and you will need to have a few hundred gigabytes of free space available. They rely on an environment variable SCRATCH.
Set the SCRATCH environment variable to a directory with plenty of free disc space.
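Before running anything large, it can help to confirm that SCRATCH is set and points somewhere with enough room. The following is a minimal sketch, not part of the repository: the function name scratch_free_gb and the 200 GB default threshold are assumptions chosen to match "a few hundred gigabytes", so adjust them to suit your runs.

```python
import os
import shutil
from pathlib import Path


def scratch_free_gb(min_free_gb: float = 200.0) -> float:
    """Return the free space (in GB) on the disc holding $SCRATCH.

    Raises RuntimeError if SCRATCH is unset, is not a directory,
    or has less than min_free_gb of free space. The threshold is an
    assumed value, not one taken from the repository.
    """
    scratch = os.environ.get("SCRATCH")
    if not scratch:
        raise RuntimeError("The SCRATCH environment variable is not set")

    path = Path(scratch)
    if not path.is_dir():
        raise RuntimeError(f"SCRATCH={scratch} is not an existing directory")

    # disk_usage reports bytes for the filesystem containing the path
    free_gb = shutil.disk_usage(path).free / 1e9
    if free_gb < min_free_gb:
        raise RuntimeError(
            f"Only {free_gb:.0f} GB free under {scratch}; "
            f"wanted at least {min_free_gb:.0f} GB"
        )
    return free_gb
```

Calling scratch_free_gb() from a Python prompt in the shell where SCRATCH is set either returns the free space or raises an error saying what is wrong.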
Then, you will need to install some software. The software is all open-source, and should be available for most operating systems (but it’s only been tested on Linux-x86). By far the easiest way to do this is to use conda:
Install anaconda or miniconda.
When you have conda installed, you can create an environment with all the necessary software by using the YML file in this repository:
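Edit the YML file DCVAE-Climate.yml to set the PYTHONPATH environment variable to the directory you have installed the code in, and the LD_LIBRARY_PATH variable to the lib directory of the DCVAE-Climate conda environment on your system.
Create and activate the DCVAE-Climate environment specified in the YML file DCVAE-Climate.yml (with a standard conda installation, conda env create -f DCVAE-Climate.yml followed by conda activate DCVAE-Climate).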
name: DCVAE-Climate
# Updated 2025-02-06 to use only the conda-forge channel and to update cdsapi to use the new climate data store.
# Also updates various software packages to more recent versions.
channels:
  - conda-forge
dependencies:
  # Basics
  - python=3.11
  - libwebp>=1.3.2 # Earlier versions have a security vulnerability
  - iris=3
  - pandas=2
  - cmocean=4
  - parallel
  - zarr=2
  # Older matplotlib - 3.10 screws up pcolorfast and background colour (why?)
  - matplotlib=3.9.2
  # GPU support - need these for tensorflow to pick up GPU versions
  - cuda
  - cudnn
  # Get data from ERA5
  # You'll need to register, see https://cds.climate.copernicus.eu/api-how-to
  - cdsapi>=0.7.2
  - nco=5 # Need ncks to fix broken CDS files
  # Optional, code formatter
  - black
  # Optional - documentation generator
  - sphinx
  # Some packages are only available via pip
  - pip
  - pip:
      - tensorflow
      # For bilinear interpolation
      - tensorflow-addons
      # For input space search
      - tensorflow-probability
      # Unused, but required by tfp?
      - tf-keras
      # For efficient data IO
      - tensorstore
# Tell python to look for modules in the root directory of the project
# (A hack, needs to be edited for every installation, but makes code
# management much easier.)
# Replace with the path to your project directory root.
variables:
  PYTHONPATH: /home/users/philip.brohan/Projects/DCVAE_Climate
  # Fix really weird error in scipy optimize (loaded by iris)
  # This should be CONDA_ENVS_PATH/DCVAE-Climate/lib
  # but you can't do variable expansion here - so set it for each system.
  LD_LIBRARY_PATH: /data/users/philip.brohan/conda/environments/DCVAE-Climate/lib
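
Once the environment has been created and activated, a quick check that the main packages from the YML file can be found may save time later. The sketch below is an illustration rather than part of the repository: the helper missing_modules and the list EXPECTED_MODULES are hypothetical names, and the list holds the usual import names for the packages above (for example tensorflow_probability for tensorflow-probability). It only looks the modules up and does not import them, so it says nothing about whether the GPU is visible.

```python
import importlib.util

# Import names assumed to correspond to packages in DCVAE-Climate.yml
EXPECTED_MODULES = [
    "iris",
    "pandas",
    "cmocean",
    "zarr",
    "matplotlib",
    "cdsapi",
    "tensorflow",
    "tensorflow_addons",
    "tensorflow_probability",
    "tf_keras",
    "tensorstore",
]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    # find_spec returns None when a top-level module is not installed
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Running missing_modules(EXPECTED_MODULES) inside the activated environment should return an empty list. Any names it does return point to packages that did not install.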