Working with HadUK-Grid
A weakness of this ML-reanalysis approach is that we have been using samples from a preexisting reanalysis to train the VAE. This makes the process somewhat circular: it would be more powerful if we could train the VAE on some other source of data.
One possibility is to train the VAE on observations alone, and the easy place to start is with an existing observational dataset. Here we use HadUK-Grid, specifically the daily maximum air temperature.
The process is the same as with ERA5 global T2m data, except that the data format is slightly different and we are only using 20 dimensions in the latent space (an arbitrary decision, but there should be fewer degrees of freedom in the UK near-surface temperatures than in the global ones).
Assimilation is done exactly as with ERA5 global T2m data.
In this case the DA provides a way of making a gridded field from the observations, but we already have such a method (that’s how the HadUK-Grid fields were made in the first place). That does not make the DA useless, however: it can make gridded fields (with uncertainty estimates) from many fewer observations, so it will be useful for extending the gridded fields back in time.
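As a rough illustration of that last point, here is a toy sketch in plain Python of how fitting a latent vector to observations can give a gridded field with an uncertainty estimate, and how that uncertainty grows when fewer observations are available. It is not the code used for the HadUK-Grid work. The decoder is a random linear map standing in for the trained VAE, the grid size, observation error, step size and observation counts are made-up values, and the ensemble is built with a generic perturbed-observation scheme (each member starts from a random draw of the latent prior and fits a noisy copy of the observations). Only the 20-dimensional latent space comes from the text; every function name is hypothetical.

```python
import random
import statistics

LATENT_DIM = 20  # latent-space size quoted in the text
N_GRID = 40      # toy grid size (assumption; the real grid is far larger)
OBS_ERROR = 0.5  # assumed observation-error standard deviation


def make_toy_decoder(rng):
    """Random linear map standing in for a trained VAE decoder (hypothetical)."""
    scale = LATENT_DIM ** -0.5
    return [[rng.gauss(0.0, scale) for _ in range(LATENT_DIM)] for _ in range(N_GRID)]


def decode(weights, z):
    """Map a latent vector to a field on the toy grid."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in weights]


def fit_member(weights, obs_idx, obs_val, rng, steps=200, lr=0.03):
    """Fit one ensemble member by gradient descent in latent space."""
    z_prior = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]  # draw from N(0, I)
    target = [v + rng.gauss(0.0, OBS_ERROR) for v in obs_val]   # perturbed observations
    z = list(z_prior)
    for _ in range(steps):
        # Gradient of 0.5*|z - z_prior|^2 + 0.5*sum(misfit^2)/OBS_ERROR^2
        grad = [zi - pi for zi, pi in zip(z, z_prior)]
        for i, t in zip(obs_idx, target):
            misfit = (sum(w * zi for w, zi in zip(weights[i], z)) - t) / OBS_ERROR ** 2
            for j in range(LATENT_DIM):
                grad[j] += misfit * weights[i][j]
        z = [zi - lr * g for zi, g in zip(z, grad)]
    return decode(weights, z)


def mean_spread(weights, truth, obs_idx, rng, members=10):
    """Ensemble standard deviation, averaged over all grid points."""
    obs_val = [truth[i] for i in obs_idx]
    fields = [fit_member(weights, obs_idx, obs_val, rng) for _ in range(members)]
    spreads = [statistics.stdev([f[k] for f in fields]) for k in range(N_GRID)]
    return statistics.mean(spreads)


rng = random.Random(0)
decoder = make_toy_decoder(rng)
truth = decode(decoder, [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)])

spread_many = mean_spread(decoder, truth, list(range(30)), rng)  # dense network
spread_few = mean_spread(decoder, truth, list(range(4)), rng)    # sparse network
print(f"mean spread with 30 obs: {spread_many:.2f}, with 4 obs: {spread_few:.2f}")
```

With these settings the ensemble spread should be noticeably larger for the sparse network than for the dense one. That is the behaviour wanted when extending the gridded fields back to periods with few stations: a field is still produced, and its uncertainty reflects how little it is constrained.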