AI and weather forecasting: NWP vs MLWP models

Thu 28 March 2024

Numerical weather prediction (NWP) models have been around for decades. They are constantly improving, mainly thanks to increases in available computing power. In recent years, however, the field of Machine Learning Weather Prediction (MLWP) has developed very rapidly. In this blog, we elaborate on both NWP and MLWP and explain the differences between these two ways of producing a weather forecast.

History of numerical weather prediction

The foundations of Numerical Weather Prediction (NWP) were laid in the first half of the 20th century with the development of numerical methods for solving the fundamental equations of atmospheric dynamics. Pioneers like Lewis Fry Richardson envisioned using numerical calculations to predict the weather. However, computational resources were severely limited, and practical implementations were not yet feasible.

In the 1970s and 1980s, NWP models transitioned from regional to global scales, enabling meteorologists to simulate weather patterns and phenomena across the entire Earth. These models incorporated more sophisticated parameterizations for processes like radiation, convection, and boundary layer dynamics, improving their predictive skill. These improvements were mainly possible due to an increase in available computing power. 

In the following years, the amount of available computing power continued to increase steadily. As a result, weather models became increasingly accurate and reliable. Currently, vast spaces are filled with supercomputers dedicated to simulating global weather as precisely as possible.

How NWP works

Before the model can start its computations, it needs initial conditions. These are obtained by assimilating observations into an analysis with an optimization technique, which yields the best possible estimate of the current state of the atmosphere.
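To give a feel for what such an optimization does, here is a minimal sketch for a single value, such as the temperature at one location. It is an illustration only: operational assimilation systems work with millions of values at once, and the function name, the temperatures and the variances below are all made up for this example.

```python
def make_analysis(background, observation, var_background, var_observation):
    """Combine a prior model value and an observation into one best estimate.

    Minimizes the cost function
        J(x) = (x - background)^2 / var_background
             + (x - observation)^2 / var_observation
    whose minimum has the closed form used below.
    """
    # The gain says how far to move from the background toward the observation:
    # close to 1 if the observation is trusted more, close to 0 if the background is.
    gain = var_background / (var_background + var_observation)
    return background + gain * (observation - background)


# Hypothetical numbers: the previous forecast says 12.0 degrees (variance 4.0),
# a more precise measurement says 10.0 degrees (variance 1.0).
t_analysis = make_analysis(12.0, 10.0, 4.0, 1.0)
print(round(t_analysis, 2))
```

Because the observation is trusted more than the background here, the analysis ends up closer to the measured value than to the earlier forecast.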

In the second step, computations are carried out using these initial values. The knowledge of atmospheric physics is mathematically encapsulated in the model. The weather model consists of differential equations through which the initial conditions are extrapolated over time.

By solving these differential equations for various vertical and horizontal levels over time on the model's grid, a weather forecast is eventually produced. 
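The idea of stepping differential equations forward on a grid can be sketched with a toy problem: a single quantity carried along by a constant wind in one dimension. This is a strong simplification and not how any operational model is implemented. The equation, the first-order upwind scheme and all names and numbers below are assumptions chosen for illustration.

```python
def step_advection(u, wind, dx, dt):
    """Advance a 1D field one time step with a first-order upwind scheme.

    Solves du/dt + wind * du/dx = 0 on a periodic grid, assuming wind >= 0.
    """
    courant = wind * dt / dx  # must stay between 0 and 1 for a stable solution
    if not 0 <= courant <= 1:
        raise ValueError("time step too large for this grid spacing")
    # u[i - 1] with i = 0 wraps around to the last cell (periodic boundary)
    return [u[i] - courant * (u[i] - u[i - 1]) for i in range(len(u))]


def run_forecast(initial, wind, dx, dt, n_steps):
    """Extrapolate the initial conditions n_steps forward in time."""
    state = list(initial)
    for _ in range(n_steps):
        state = step_advection(state, wind, dx, dt)
    return state


# Toy initial condition: one warm anomaly in cell 2 of a 10-cell grid
initial = [0.0] * 10
initial[2] = 1.0

# Wind of 10 m/s, 1 km cells, 100 s time steps: the anomaly moves one cell per step
result = run_forecast(initial, wind=10.0, dx=1000.0, dt=100.0, n_steps=3)
print(result.index(max(result)))
```

The check on the Courant number hints at why higher resolution is expensive: smaller grid cells force smaller time steps, so the amount of computation grows quickly.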

Different NWP models

There are numerous weather models nowadays. The most well-known ones are the model of the European Centre for Medium-Range Weather Forecasts (ECMWF) and the Global Forecast System (GFS). These weather models operate at horizontal resolutions of 9 by 9 km and 13 by 13 km respectively. Due to their relatively low resolution, these models can conduct global simulations.

Additionally, there are countless other weather models based on the principles within ECMWF and GFS, but with their own computation schemes and much higher grid resolutions. These models are often used to predict local weather phenomena such as showers and fog. The drawback of these models is that they can only compute over a limited domain (for example, only Western Europe).

Running these weather models often takes hours, sometimes as long as half a day, and requires immense supercomputers to solve all the differential equations within the model.

NWP vs Machine Learning Weather Prediction (MLWP)

NWP models lie at the basis of today’s weather forecasts. However, in the last few years, a series of breakthroughs in AI research has been translated to weather forecasting. Between 2012 and the early 2020s, new developments in computer vision led to specific niche applications of AI in meteorology. For example, in the United States, AI-based prediction of lightning density from satellite images now aids severe weather warnings. These applications are, however, limited in scope because they are task-focused. In other words, they do one thing quite well.

More interesting, however, are the developments of the last few years. Specifically, a new weather forecasting paradigm has emerged. It is named Machine Learning Weather Prediction, or MLWP. In this paradigm, research into so-called foundation models is translated to meteorology. Rather than being task-focused, foundation models learn general patterns from large quantities of data. In the case of models like ChatGPT, these patterns are the structure and relationships of words in text. In the case of MLWP, they are the states of the atmosphere and how these evolve over time given specific weather conditions.

To learn this, today’s MLWP models use analyses generated by NWP models as examples of how the atmosphere evolves. As a simple example, suppose that we are trying to model how the atmosphere evolves between now and an hour from now, that is, from time T to time T+1, as visualized in the image below. MLWP models take a representation of the atmosphere at time T, which combines various elements at various vertical levels, as model input. The model output is a prediction: the expected state at time T+1. In the early stages of training, this prediction is often quite wrong. It is compared with the true state at T+1, the ground truth, which yields a loss value that tells us how poorly the model performs at that moment. Using mathematical techniques, given the loss and the input, the so-called weights of the MLWP model are then slightly altered. This process is repeated many times until a working model emerges. It is not uncommon for this to take a few weeks on many powerful computers, and success is not guaranteed.
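The training loop described above can be sketched in a few lines. This is a toy version under heavy assumptions: the "atmosphere" is a single number, the model has only two weights, and the ground truth comes from a made-up rule rather than from NWP analyses. Real MLWP models are deep neural networks with a very large number of weights, and all names below are hypothetical.

```python
import random


def true_next_state(x):
    """Made-up rule that stands in for the ground truth at time T+1."""
    return 0.9 * x + 0.5


def train(pairs, learning_rate=0.05, epochs=500):
    """Fit prediction = w * x + b to (state at T, state at T+1) pairs."""
    w, b = 0.0, 0.0  # untrained weights: early predictions are quite wrong
    loss_history = []
    n = len(pairs)
    for _ in range(epochs):
        loss = grad_w = grad_b = 0.0
        for x_t, x_next in pairs:
            error = (w * x_t + b) - x_next  # prediction minus ground truth
            loss += error ** 2
            grad_w += 2 * error * x_t
            grad_b += 2 * error
        loss_history.append(loss / n)       # mean squared error for this pass
        w -= learning_rate * grad_w / n     # slightly alter the weights
        b -= learning_rate * grad_b / n
    return w, b, loss_history


rng = random.Random(0)  # fixed seed so the run is repeatable
states = [rng.uniform(-2.0, 2.0) for _ in range(200)]
pairs = [(x, true_next_state(x)) for x in states]

w, b, loss_history = train(pairs)
print(round(w, 3), round(b, 3))
```

The loss starts high and shrinks with every pass over the examples, while the two weights move toward the values of the hidden rule that generated the data.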

Training MLWP models does not involve using just one example. Typically, many years of atmospheric data are used, as there is a large variation in possible weather conditions that models need to account for. It is not uncommon to see that models are trained with 40 years of data, typically using the ECMWF Reanalysis v5 (ERA5) dataset.

Once a model is trained, generating a new worldwide weather forecast for the variables the model was trained with is fast. Starting from a new analysis at some time T, the model predicts the next time step T+1. That prediction is then used autoregressively to make the prediction for time step T+2, and so forth. These days, time steps of 6 hours are common: the model moves from T to T+6, then to T+12, and so forth. With the right hardware, making a two-week weather forecast takes just a few minutes.
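Autoregressive forecasting is simple to express in code. The sketch below assumes a trained model is available as a function that maps one state to the next. Here that function is a hypothetical stand-in working on a single number, whereas a real MLWP model works on full global fields.

```python
def rollout(model_step, analysis, lead_hours, step_hours=6):
    """Apply a one-step model repeatedly; return {lead time in hours: state}."""
    forecast = {}
    state = analysis
    for lead in range(step_hours, lead_hours + 1, step_hours):
        state = model_step(state)  # the previous prediction becomes the next input
        forecast[lead] = state
    return forecast


def toy_step(state):
    """Hypothetical stand-in for a trained model that predicts 6 hours ahead."""
    return 0.9 * state + 0.5


# A two-week forecast in 6-hour steps
two_weeks = rollout(toy_step, analysis=10.0, lead_hours=14 * 24)
print(len(two_weeks), max(two_weeks))
```

A two-week forecast in 6-hour steps needs 56 model calls, one after the other. Because every call builds on the previous prediction, small errors can accumulate with lead time.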


Strengths and weaknesses of MLWP

There are several advantages MLWP models have over NWP models. However, there are some disadvantages as well. For an in-depth comparison, read our previous article about Artificial Intelligence in weather forecasting. In short:

Compared to NWP, some strengths of MLWP are that
•    Generating MLWP forecasts is much faster than NWP forecasts
•    Some MLWP models perform better than NWP from the medium term onwards

Some weaknesses of MLWP are that
•    They often have a low resolution and a limited set of variables, whereas some NWP models have a high resolution
•    They are not physically consistent by design
•    They tend to underestimate extreme values

A new series on AI weather forecasting and MLWP

At Infoplaza, we embrace meteorological innovations to help improve the quality of our forecasts, directly benefiting the operations of our customers. We are thus actively looking into integrating these developments into our operations where possible. Additionally, our Weather Technology team is developing its own machine-learning-based forecasting methods. To guide our customers into this new paradigm, a new series of articles will appear in the next few months. It will discuss:

•    How MLWP works in the background, to aid non-experts in understanding these technologies intuitively.
•    The current generation of MLWP models, such as Pangu-Weather, the FourCastNet models, GraphCast, FuXi and FengWu, and how they perform compared to NWP models.
•    Early attempts to resolve difficulties encountered when developing these models, such as improved resolution and better handling of extremes.
•    Recent innovations, such as using observations directly instead of analyses, and the regionalization of MLWP models.

These articles will be written by Arjan Willemse and Christian Versloot. Both Arjan and Christian have worked with Infoplaza since 2021. Arjan works as an operational meteorologist and client success manager and is thus able to discuss the impact of AI based weather models from both a meteorological and customer perspective. Christian works in the Weather Technology team on data technology and developing machine learning solutions and is thus able to help readers understand the technology behind these developments.

Ben-Bouallegue, Z., Clare, M. C., Magnusson, L., Gascon, E., Maier-Gerber, M., Janousek, M., ... & Pappenberger, F. (2023). The rise of data-driven weather forecasting. arXiv preprint arXiv:2307.10128.

Bi, K., Xie, L., Zhang, H., Chen, X., Gu, X., & Tian, Q. (2022). Pangu-weather: A 3d high-resolution model for fast and accurate global weather forecast. arXiv preprint arXiv:2211.02556.

Infoplaza. (2024, March 14). Weather forecasting and AI. Infoplaza - Guiding you to the decision point. 

