Meteorographica.data.twcr

This package retrieves and loads data from the Twentieth Century Reanalysis (20CR).

It retrieves the data from the 20CR portal at NERSC.

At the moment, only version ‘2c’ of 20CR is supported for public use.

Only hourly data is supported (no daily or monthly averages) for 5 surface variables:

  • Mean-sea-level pressure: ‘prmsl’
  • 2m air temperature: ‘air.2m’
  • Precipitation rate: ‘prate’
  • 10m zonal wind: ‘uwnd.10m’
  • 10m meridional wind: ‘vwnd.10m’

Retrieved data is stored in the directory $SCRATCH/20CR, so the ‘SCRATCH’ environment variable must be set. Data is retrieved in 1-year batches.
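Because everything depends on ‘SCRATCH’ being set, it can help to check for it before fetching anything. The following is a minimal sketch of such a check. The helper name local_20cr_dir is hypothetical and not part of Meteorographica; it only mirrors the $SCRATCH/20CR layout described above.

```python
import os

def local_20cr_dir(environ=None):
    """Return the directory where retrieved data would be kept ($SCRATCH/20CR).

    Hypothetical helper for illustration; not part of Meteorographica.
    """
    env = os.environ if environ is None else environ
    scratch = env.get("SCRATCH")
    if not scratch:
        raise RuntimeError("The SCRATCH environment variable must be set")
    return os.path.join(scratch, "20CR")

# Example with an explicit mapping, so the sketch runs on any machine.
print(local_20cr_dir({"SCRATCH": "/tmp/scratch"}))
```

Calling local_20cr_dir() with no argument reads the real environment and raises if ‘SCRATCH’ is missing.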

For example:

import Meteorographica.data.twcr as twcr
twcr.fetch('prate',1987,version='2c')

will retrieve precipitation rate data for the whole of 1987, and

pr=twcr.load('prate',1987,3,12,15.25,version='2c')

will then load the precipitation rates at quarter past 3pm on March 12 1987 from the retrieved dataset as an iris.cube.Cube. Note that as 20CR only provides data at 6-hourly or 3-hourly intervals, the value for 15.25 will be interpolated between the outputs. Also, as 20CR is an ensemble dataset, the result will include all 56 ensemble members.
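Since the hour is passed as a float, a clock time such as quarter past 3pm has to be converted to 15.25 first. A small sketch of that conversion follows; load_args is a hypothetical helper, not a function of this package.

```python
import datetime

def load_args(when):
    """Split a datetime into (year, month, day, fractional hour).

    Hypothetical helper for illustration; not part of Meteorographica.
    """
    hour = when.hour + when.minute / 60.0 + when.second / 3600.0
    return when.year, when.month, when.day, hour

year, month, day, hour = load_args(datetime.datetime(1987, 3, 12, 15, 15))
print(year, month, day, hour)  # 1987 3 12 15.25
```

The four values are in the same order as the positional arguments of load() after the variable name.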

Observations files are also available. They can be fetched with:

twcr.fetch_observations(1987,version='2c')

There is one observations file for each 6-hourly assimilation run. Load all the observations available to the assimilation run for 12 noon on March 12 1987 (as a pandas.DataFrame) with:

o=twcr.load_observations_1file(1987,3,12,12,version='2c')

or just load all the observations valid between 6am and 6pm that day with:

import datetime
o=twcr.load_observations(datetime.datetime(1987,3,12,6),
                         datetime.datetime(1987,3,12,18),
                         version='2c')
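With one observations file per 6-hourly assimilation run, a period maps onto a list of run times. The sketch below enumerates them. It assumes runs at 00, 06, 12 and 18 hours and treats both ends of the period as inclusive; neither detail is confirmed by this documentation, and assimilation_runs is a hypothetical helper.

```python
import datetime

def assimilation_runs(start, end, step_hours=6):
    """List assumed assimilation-run times between start and end (inclusive).

    Hypothetical helper for illustration; not part of Meteorographica.
    """
    # Round start down to the nearest run time ...
    run = start.replace(minute=0, second=0, microsecond=0)
    run -= datetime.timedelta(hours=run.hour % step_hours)
    # ... then step forward if that falls before the requested start.
    if run < start:
        run += datetime.timedelta(hours=step_hours)
    runs = []
    while run <= end:
        runs.append(run)
        run += datetime.timedelta(hours=step_hours)
    return runs

for run in assimilation_runs(datetime.datetime(1987, 3, 12, 6),
                             datetime.datetime(1987, 3, 12, 18)):
    print(run)
```

For the period used above this gives the runs at 06:00, 12:00 and 18:00 on March 12 1987.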

Meteorographica.data.twcr.fetch(variable, year, month=1, day=1, version='none')

Get data for one variable, from the 20CR archive at NERSC.

Data will be stored locally in directory $SCRATCH/20CR, to be retrieved by load(). If the local file that would be produced already exists, this function does nothing.

For 20CR version 2c, the data is retrieved in calendar year blocks, and the ‘month’ and ‘day’ arguments are ignored.

Parameters:
  • variable (str) – Variable to fetch (e.g. ‘prmsl’).
  • year (int) – Year to get data for.
  • month (int, optional) – Month to get data for (1-12).
  • day (int, optional) – Day to get data for (1-31).
  • version (str) – 20CR version to retrieve data for.
Raises:

StandardError – If variable is not a supported value.
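The "do nothing if the local file already exists" behaviour is a common caching pattern. A minimal sketch of it follows, with a stand-in download function. The names fetch_if_missing and fake_download and the file name are hypothetical, and this is not the package's actual implementation.

```python
import os
import tempfile

def fetch_if_missing(local_path, download):
    """Call download(local_path) only when local_path does not exist yet.

    Returns True if a download happened, False if the file was already there.
    """
    if os.path.exists(local_path):
        return False
    parent = os.path.dirname(local_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    download(local_path)
    return True

def fake_download(path):
    # Stand-in for a real transfer: just writes a placeholder file.
    with open(path, "w") as handle:
        handle.write("placeholder")

with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, "20CR", "prate_1987.nc")  # hypothetical name
    print(fetch_if_missing(target, fake_download))  # True: file was created
    print(fetch_if_missing(target, fake_download))  # False: nothing to do
```

Repeated calls are therefore cheap, which makes it safe to call a fetch step at the top of every script.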


Meteorographica.data.twcr.fetch_observations(year, month=None, day=None, version='none')

Get observations from the 20CR archive at NERSC.

Data will be stored locally in directory $SCRATCH/20CR, to be retrieved by load_observations(). If the local files that would be produced already exist, this function does nothing.

For 20CR version 2c, the data is retrieved in calendar year blocks, and the ‘month’ and ‘day’ arguments are ignored.

Parameters:
  • year (int) – Year to get data for.
  • month (int, optional) – Month to get data for (1-12).
  • day (int, optional) – Day to get data for (1-31).
  • version (str) – 20CR version to retrieve data for.
Raises:

StandardError – If version is not a supported value.


Meteorographica.data.twcr.load(variable, year, month, day, hour, version)

Load requested data from disc, interpolating if necessary.

Data must be available in directory $SCRATCH/20CR, previously retrieved by fetch().

Parameters:
  • variable (str) – Variable to load (e.g. ‘prmsl’).
  • year (int) – Year to get data for.
  • month (int) – Month to get data for (1-12).
  • day (int) – Day to get data for (1-31).
  • hour (float) – Hour to get data for (0-23.99). This is not an integer: give minutes and seconds as fractions of an hour (e.g. 15.25 for 15:15).
  • version (str) – 20CR version to load data from.
Returns:

Global field of variable at time.

Return type:

iris.cube.Cube

Note that 20CR data is only output every 6 hours (prmsl) or 3 hours, so if hour%3!=0, the result may be linearly interpolated in time. If you want data after 18:00 on the last day of a month, you will need to fetch the next month’s data too, as it will be used in the interpolation.

Raises:

StandardError – Version number not supported, or data not on disc - see fetch()
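The linear interpolation in time mentioned in the note can be sketched in a few lines. This illustrates the idea on plain numbers rather than iris cubes and is not the package's own code; interpolate_in_time and the sample values are made up for the example.

```python
import math

def interpolate_in_time(hour, value_at, step_hours=3):
    """Linearly interpolate value_at(h) between the output times around hour.

    value_at is any callable returning the value at an output hour.
    Hypothetical helper for illustration; not part of Meteorographica.
    """
    before = math.floor(hour / step_hours) * step_hours
    if before == hour:
        return value_at(before)  # exactly on an output time
    after = before + step_hours
    weight_after = (hour - before) / step_hours
    return (1 - weight_after) * value_at(before) + weight_after * value_at(after)

outputs = {15: 2.0, 18: 8.0}  # made-up values at two 3-hourly output times
print(interpolate_in_time(16.5, outputs.get))  # 5.0
```

For hours after the last output time of a day, the later output time comes out as 24, i.e. 00:00 on the following day, which is why data beyond the requested day can be needed.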

Meteorographica.data.twcr.load_observations(start, end, version='none')

Load observations from disc, for the selected period.

Data must be available in directory $SCRATCH/20CR, previously retrieved by fetch_observations().

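Parameters:
  • start (datetime.datetime) – Start of the period to load observations for.
  • end (datetime.datetime) – End of the period to load observations for.
  • version (str) – 20CR version to load data from.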
Returns:

Dataframe of observations.

Return type:

pandas.DataFrame

Raises:

StandardError – Version number not supported, or data not on disc - see fetch_observations()


Meteorographica.data.twcr.load_observations_1file(year, month, day, hour, version='none')

Load observations from disc, that were used in the assimilation run at the time specified.

Data must be available in directory $SCRATCH/20CR, previously retrieved by fetch_observations().

Parameters:
  • year (int) – Year of assimilation run.
  • month (int) – Month of assimilation run (1-12).
  • day (int) – Day of assimilation run (1-31).
  • hour (int) – Hour of assimilation run (0-23).
  • version (str) – 20CR version to load data from.
Returns:

Dataframe of observations.

Return type:

pandas.DataFrame

Raises:

StandardError – Version number not supported, or data not on disc - see fetch_observations()


Meteorographica.data.twcr.load_observations_fortime(v_time, version='none')

Load observations from disc, that contribute to fields at a given time.

Data must be available in directory $SCRATCH/20CR, previously retrieved by fetch_observations().

At the times when assimilation takes place, all the observations used at that time are provided by load_observations_1file() - this function serves the same function, but for intermediate times, where fields are obtained by interpolation. It gets all the observations from each field used in the interpolation, and assigns a weight to each one - the same as the weight used in interpolating the fields.

Parameters:
  • v_time (datetime.datetime) – Get observations associated with this time.
  • version (str) – 20CR version to load data from.
Returns:

Same as from load_observations(), except with an added column ‘weight’ giving the weight of each observation at the given time.

Return type:

pandas.DataFrame

Raises:

StandardError – Version number not supported, or data not on disc - see fetch_observations()
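To make the weighting concrete, the sketch below computes which assimilation runs bracket a given time and the interpolation weight of each. It assumes 6-hourly runs at 00, 06, 12 and 18 hours and linear weights. run_weights is a hypothetical helper, and the real function returns a pandas.DataFrame rather than a dictionary.

```python
import datetime

def run_weights(v_time, step_hours=6):
    """Map each bracketing run time to its linear interpolation weight at v_time.

    Hypothetical helper for illustration; not part of Meteorographica.
    """
    earlier = v_time.replace(minute=0, second=0, microsecond=0)
    earlier -= datetime.timedelta(hours=earlier.hour % step_hours)
    if earlier == v_time:
        return {earlier: 1.0}  # exactly at a run time: a single run, full weight
    step = datetime.timedelta(hours=step_hours)
    weight_later = (v_time - earlier) / step  # timedelta / timedelta gives a float
    return {earlier: 1.0 - weight_later, earlier + step: weight_later}

for run, weight in run_weights(datetime.datetime(1987, 3, 12, 15)).items():
    print(run, weight)
```

At 15:00 the runs at 12:00 and 18:00 each get weight 0.5; every observation from a run would then carry that run's weight in the ‘weight’ column.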