Weather observations from marine expeditions

Surface weather as reconstructed by the 20th Century Reanalysis, and as observed by the Imperial Trans-Antarctic Expedition.

On the left: wind and temperature fields from 20CR, with the routes of the expedition ships Endurance (red) and Aurora (blue). Grey fog marks areas where the reanalysis is very uncertain.

On the right: timeseries of observations rescued from the expedition records (red: observations from Endurance, blue: observations from Aurora). The grey bands show co-located values from 20CR; the width of each band shows the reanalysis ensemble spread (light grey: reanalysis not using the expedition's observations; dark grey: reanalysis after including the expedition's observations).
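As a rough illustration of how such a band can be built, the sketch below takes the ensemble members' values at the ship's position for one time step and returns the band limits. It is only a minimal sketch: the function names and sample values are hypothetical, and the choice of mean plus or minus one standard deviation is an assumption, since the caption does not say which spread measure the figure uses.

```python
from statistics import mean, stdev


def ensemble_band(members):
    """Return (low, high) limits of the band for one time step.

    members: values from each ensemble member, co-located with the ship.
    Assumption: the band is mean +/- one sample standard deviation; the
    figure itself may use a different spread measure.
    """
    centre = mean(members)
    spread = stdev(members)
    return centre - spread, centre + spread


def observation_in_band(obs, members):
    """True if a ship observation falls inside the ensemble band."""
    low, high = ensemble_band(members)
    return low <= obs <= high


# Hypothetical air temperatures (degrees C) from three ensemble members.
members = [1.0, 2.0, 3.0]
low, high = ensemble_band(members)
print(f"band: {low:.1f} to {high:.1f}")
print("observation 2.5 inside band:", observation_in_band(2.5, members))
```

A narrower band after the expedition's observations are included would indicate that those observations have constrained the reanalysis at that place and time.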


This is not so much a project as a collection of related projects. One approach to data rescue is problem-driven: identify a research question, then start a project to work on that question, rescuing the data required by the research in the process. We’ve done several such projects, looking at Arctic climate in the 19th Century and the impact of various Antarctic expeditions, and trying to fill major holes in ICOADS and 20CR. This repository collects all the rescued data resulting from this work.

The transcription here was done by individual specialists, who typed numbers into spreadsheets from carefully selected archive sources. The sources are mostly published reports, but there are some original manuscripts. This worked very well: by selecting the source documents carefully we can be sure to get good-quality data, and because the transcribers work closely with the people post-processing and using the data, it’s easy to make sure the transcriptions are accurate and in a useful format. The downside is that it’s all small-scale work: this process won’t work for big data rescue projects.

Costs and efficiency

We can’t estimate costs and speed for this dataset, as the work is too variable: it’s a set of very different pieces of work, run intermittently over several years.

Date run: 2007 - present
Elapsed time: N/A (intermittent project)
Observations rescued: 426,813
Financial cost (per ob.): N/A (very variable)
Effort required (per ob.): N/A (very variable)

How are these numbers estimated?

Lessons learned

  • For small-scale projects, this artisanal approach is the way to go.
  • Expensive: because transcription is only one part of the work, the overall cost per observation is very high.
  • An effective and efficient way of doing science projects, but not usable as a large-scale transcription approach.