New PhD student in the SAPHiR project: Killian PUJOL - heavy rainfall forecast with artificial intelligence

As part of his thesis, Killian PUJOL, a new PhD student from the Laboratoire d’Aérologie (LAERO) in Toulouse, has spent a month and a half working at the Laboratoire Sciences Pour l'Environnement (SPE) at the University of Corsica with Roberta BAGGIO, Jean-François MUZY and Jean-Baptiste FILIPPI.

His thesis, co-supervised by both laboratories, aims to improve heavy rainfall forecasting in Corsica using artificial intelligence algorithms.

After graduating in 2022 from the National School of Electrical Engineering, Electronics, Computer Science, Hydraulics and Telecommunications (ENSEEIHT), where he specialized in hydraulic engineering, Killian worked for a year in hydrology before applying for this thesis, which is part of the ANR SAPHiR project that aims to improve high-resolution weather prediction.

This month-and-a-half stay was organized to establish a common set of Machine Learning tools and knowledge between the SPE and LAERO for the continuation of Killian’s work.

His arrival in Corte was perfectly synchronized with the launch of a challenge proposed by Météo-France on kaggle.com, a data science competition platform.

The goal of the challenge was to compute a daily rainfall forecast based on a sample from MeteoNet, a dataset made by Météo-France for data scientists. The dataset was composed of several types of data: observations from ground stations and weather forecast data from Arome (high-resolution weather model) and Arpege (global weather model), covering the north-west part of France:

Figure 1: Ground station locations; Figure 2: Temperature prediction from Arome (05/01/2016, 00h)

Using a rather simple neural network model, Roberta, Jean-François and Killian made several predictions, looking for the best combination of inputs and using, as a first step, ground station and Arome data. To evaluate their results they used three known predictions as references: Arome rainfall predictions, Arome rainfall predictions optimized by a neural network (which pulls the Arome predictions down close to 0 mm) and a simple forecast that always predicts that it will not rain the next day.

When using all valid data from both the ground station dataset and the Arome dataset, the model shows improved predictions, with better results than the references presented above.

Figure 3: Relative error comparison between the neural network prediction (1st box), the Arome prediction (2nd box), the Arome prediction optimized by a neural network (3rd box) and the artificial prediction with only 0 mm accumulation of rain (4th box)

Figure 3 shows that, most of the time, the neural network prediction performs best. However, it struggles to predict the heaviest rainfall episodes. This reflects the fact that the cost function required for the challenge, the mean absolute percentage error (MAPE), strongly penalizes false negative predictions. As a result, the model performs better when predicting low levels of rain (i.e. precipitation between 0 and 15 mm).
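To make the scoring more concrete, here is a minimal sketch of how a MAPE-style score could be computed to compare a forecast with the "always 0 mm" reference. The exact formula used in the challenge is not given in this article: the `offset` added to the denominator (to avoid dividing by zero on dry days), the function name `mape` and the sample values are all assumptions made for illustration only, not data from the challenge.

```python
def mape(observed, predicted, offset=1.0):
    """Mean absolute percentage error, in percent.

    The offset in the denominator is an assumption: it keeps the score
    defined on dry days, when the observed rainfall is 0 mm.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("at least one value is required")
    total = sum(
        abs(obs - pred) / (obs + offset)
        for obs, pred in zip(observed, predicted)
    )
    return 100.0 * total / len(observed)


# Made-up daily rainfall accumulations in mm (illustrative values only).
observed = [0.0, 0.0, 2.0, 30.0]

# Reference forecast that always predicts a dry day.
zero_forecast = [0.0] * len(observed)

# Hypothetical model output for the same four days.
model_forecast = [0.0, 1.0, 2.0, 10.0]

print("always 0 mm :", round(mape(observed, zero_forecast), 1), "%")
print("model       :", round(mape(observed, model_forecast), 1), "%")
```

Because each error is divided by the observed accumulation, the same error in millimetres weighs much more on a dry or nearly dry day than on a day of heavy rain, which is why the choice of cost function has such an influence on the behaviour of the trained model.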

CHRISTOPHE PAOLI | Updated on 25/07/2023