Getting climate data into TensorFlow

To train and use ML models on climate data, we need an input pipeline that gets the data out of its archive, converts it into tf.Tensor format, and organises it into a tf.data.Dataset which can be passed to the TensorFlow model as a data source.
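As a rough illustration of the last two stages, the sketch below converts an in-memory array into a tf.Tensor and wraps it in a tf.data.Dataset. The random array is only a stand-in for data extracted from an archive, and the grid size, batch size, and variable names are assumptions made for the example.

```python
import numpy as np
import tensorflow as tf

# Stand-in for fields extracted from a climate archive (an assumption):
# 10 time steps on a hypothetical 4x8 latitude-longitude grid.
fields = np.random.default_rng(0).normal(size=(10, 4, 8)).astype(np.float32)

# Convert the array into a tf.Tensor.
tensor = tf.convert_to_tensor(fields, dtype=tf.float32)

# Organise it into a tf.data.Dataset with one element per time step,
# then shuffle and batch so it can be passed to a model as a data source.
dataset = tf.data.Dataset.from_tensor_slices(tensor)
dataset = dataset.shuffle(buffer_size=10).batch(5)

for batch in dataset:
    print(batch.shape)
```

A dataset built this way can be passed directly to a Keras model's fit method, but it holds all the data in memory, so it only suits small samples.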

It’s most efficient to do this in two steps. First, extract the data, convert it, and store it on a fast disc in native TensorFlow format. Then those native files can be read repeatedly during model training, so the extraction and conversion are done only once.
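The two steps might look something like the sketch below. It assumes "native TensorFlow format" means one serialised tensor per file, written with tf.io.serialize_tensor; the temporary directory, the ".tfd" file extension, the 4x8 field shape, and the load_tensor helper are all hypothetical choices for illustration.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Stand-in for a directory on a fast disc (an assumption).
out_dir = tempfile.mkdtemp()

# Step 1: convert each field to a tensor and store it as a serialised file.
# Random data stands in for fields extracted from the archive.
rng = np.random.default_rng(0)
for i in range(6):
    field = rng.normal(size=(4, 8)).astype(np.float32)
    tensor = tf.convert_to_tensor(field, dtype=tf.float32)
    file_name = os.path.join(out_dir, f"field_{i:03d}.tfd")
    tf.io.write_file(file_name, tf.io.serialize_tensor(tensor))


# Step 2: build a dataset that reads the stored files back on demand.
def load_tensor(file_name):
    serialised = tf.io.read_file(file_name)
    tensor = tf.io.parse_tensor(serialised, out_type=tf.float32)
    # parse_tensor loses the static shape, so restore it explicitly.
    return tf.ensure_shape(tensor, (4, 8))


file_names = sorted(tf.io.gfile.glob(os.path.join(out_dir, "*.tfd")))
dataset = tf.data.Dataset.from_tensor_slices(file_names)
dataset = dataset.map(load_tensor, num_parallel_calls=tf.data.AUTOTUNE).batch(3)
```

Because the dataset is built from file names rather than from the data itself, only the batches currently in use need to be in memory, and the stored files can be reused across training runs.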