The first half of my project’s title, data assimilation (or DA), is an umbrella term covering many different statistical methods. The common thread is that they combine “theory” with data to obtain estimates. In this context, theory refers to what is known before any data are collected, and it usually takes the form of a mathematical model. One could obtain estimates from the model alone, but the philosophy underpinning DA is that using the model together with collected data gives better estimates (Wikle and Berliner 2007).
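To make the idea concrete, here is a minimal sketch in Python of the simplest possible case: a single quantity, a model estimate with Gaussian uncertainty, and one noisy observation. It is only an illustration of the general principle, not the algorithm used in this project; the function name `combine` and all of the numbers are made up for the example.

```python
def combine(prior_mean, prior_var, obs, obs_var):
    """Blend a model estimate with one noisy observation (scalar Gaussian case).

    Hypothetical helper for illustration only.
    """
    # Weight given to the observation: close to 1 when the model is
    # much more uncertain than the data, close to 0 in the opposite case.
    gain = prior_var / (prior_var + obs_var)
    # Updated estimate: the model value, nudged towards the observation.
    post_mean = prior_mean + gain * (obs - prior_mean)
    # Updated uncertainty: never larger than the model's own uncertainty.
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


# Made-up numbers: the model says 20 (variance 4), the instrument reads 23 (variance 1).
mean, var = combine(prior_mean=20.0, prior_var=4.0, obs=23.0, obs_var=1.0)
print(mean, var)
```

The combined estimate lands between the model value and the observation, closer to whichever is more certain, and its variance is smaller than either one's. That is the sense in which model plus data beats the model alone.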

These methods mainly originated in the geosciences, from applications such as weather forecasting. It’s easy to see why relying only on “theory” would be inadequate in weather prediction, considering the complex and changeable nature of the weather. By setting up instruments that collect data from the atmosphere, and combining these data with models, much better forecasts can be made (DARC 2022). Weather forecasts are still often inaccurate, though, and they should improve as research into DA progresses further. Thankfully, it’s a very busy area of research.

I actually looked at using DA for something very different: an internal water wave evolving in time. Just to clarify, internal waves occur inside the ocean, not on the surface, and they’re often a lot bigger than surface waves (Holloway et al. 1997). With the programming foundations kindly laid out for me by researchers from UWA, I was already well on my way to simulating these waves in Python as a “test subject” for using DA. These waves are governed by the so-called Korteweg-de Vries equations, which make up the second half of my project’s title.
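For readers who like to see the equation, a commonly quoted textbook form of the Korteweg-de Vries equation for internal waves is given below. The notation is an assumption made for illustration (η for the wave displacement, c for the linear wave speed, α and β for the nonlinear and dispersive coefficients), and the exact variant used in the project may differ.

```latex
\frac{\partial \eta}{\partial t}
  + c\,\frac{\partial \eta}{\partial x}
  + \alpha\,\eta\,\frac{\partial \eta}{\partial x}
  + \beta\,\frac{\partial^{3} \eta}{\partial x^{3}} = 0
```

The nonlinear term tends to steepen the wave while the third-derivative term spreads it out, and the interplay between the two shapes how the wave evolves.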

As far as the code went, getting the wave simulations to work was the easy part, thanks to having the groundwork already done for me. Writing an algorithm to implement DA on the waves turned out to be a lot harder, especially once I ran into computational hurdles. At one point, the algorithm would have taken multiple hours to run in full. Ultimately, I was able to trim this down to about 10 minutes with some decisive edits. In my attempts to handle these issues, I even tried parallel computing, although it ended up being unnecessary (for the first six weeks of the project, anyway). Dealing with these “computational struggles” was new to me, and it wasn’t a challenge I expected to face when I set out on this project.

DARC (2022), ‘What is data assimilation?’ [online]. Available at: https://research.reading.ac.uk/met-darc/aboutus/what-is-data-assimilation/ [Accessed 21 Feb. 2022].

Holloway, P. E., Pelinovsky, E., Talipova, T., and Barnes, B. (1997), ‘A Nonlinear Model of Internal Tide Transformation on the Australian North West Shelf’, Journal of Physical Oceanography 27(6), pp. 871–896.

Wikle, C. K. and Berliner, L. M. (2007), ‘A Bayesian Tutorial for Data Assimilation’, Physica D: Nonlinear Phenomena 230(1–2), pp. 1–16.

Michael Kaminski
University of Wollongong