How to use this dataset

This dataset is kept under version control in a git repository hosted on GitHub, at https://github.com/philip-brohan/OCR-weatherrescue. The documentation is built with GitHub Pages.

If you are familiar with GitHub, you already know what to do. If you’d prefer not to bother with that, you can download the whole dataset as a zip file.

The dataset is in two parts: a set of 81 document images, each a photograph of a table of numbers; and a set of 81 CSV files containing those same numbers in computer-readable format. The challenge is to write software that converts the former to the latter.
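As a rough starting point, the Python sketch below shows one way to pair each image with its CSV file and to score an attempted transcription against the reference numbers. It is only an illustration: the directory names, the assumption that an image and its CSV share a file stem, and the helper names `pair_images_with_csvs`, `read_cells` and `cell_accuracy` are all hypothetical and not taken from the repository. Check the actual layout before relying on it.

```python
import csv
from pathlib import Path


def pair_images_with_csvs(image_dir, csv_dir,
                          image_suffixes=(".jpg", ".jpeg", ".png")):
    """Pair each image with the CSV that shares its file stem.

    Assumption: matching files have the same name apart from the
    extension. Images without a matching CSV are skipped.
    """
    csv_by_stem = {p.stem: p for p in Path(csv_dir).glob("*.csv")}
    pairs = []
    for image in sorted(Path(image_dir).iterdir()):
        if image.suffix.lower() in image_suffixes and image.stem in csv_by_stem:
            pairs.append((image, csv_by_stem[image.stem]))
    return pairs


def read_cells(csv_path):
    """Read a CSV file into a list of rows of whitespace-stripped strings."""
    with open(csv_path, newline="") as handle:
        return [[cell.strip() for cell in row] for row in csv.reader(handle)]


def cell_accuracy(predicted, truth):
    """Fraction of reference cells reproduced exactly at the same row and column.

    Cells are compared as strings, so "1.0" and "1.00" count as different.
    """
    total = sum(len(row) for row in truth)
    if total == 0:
        return 0.0
    correct = 0
    for r, truth_row in enumerate(truth):
        for c, truth_cell in enumerate(truth_row):
            in_range = r < len(predicted) and c < len(predicted[r])
            if in_range and predicted[r][c] == truth_cell:
                correct += 1
    return correct / total
```

With a local copy of the dataset, `pair_images_with_csvs("images", "csvs")` (both directory names are placeholders) would return the image and CSV paths to work through. Your own software would turn each image into a table of strings, and `cell_accuracy(your_table, read_cells(csv_path))` gives a simple per-document score. Exact string matching is a deliberately strict choice; a numeric comparison may suit some uses better.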

If you try this, please let us know by raising an issue on the GitHub repository. You are not obliged to do this, but it would help.