Predicting solar flares has important applications in communications, satellite operational safety, and related areas, which makes it a central problem in space weather. Most recent Machine Learning (ML) work has focused on classifying flares, assigning labels such as M or X. Such discrete labels may discard physical information contained in the underlying X-ray flux, which is a continuous quantity. We use Convolutional Neural Networks (CNNs) to predict X-ray flux from a combination of Helioseismic and Magnetic Imager (HMI) and Extreme Ultraviolet (EUV) images of the Sun, using a curated ML dataset [1] built from Solar Dynamics Observatory (SDO) AIA and HMI instrument data. The inputs are chosen to represent the relevant layers of the solar atmosphere: line-of-sight magnetograms from HMI (photosphere) and the 94 Å (flaring regions), 171 Å (quiet Sun), 193 Å (coronal structures), and 304 Å (chromosphere) AIA wavelengths. The data are processed so that SDO images [1] and GOES X-ray fluxes are matched in time to within 30 minutes. A limb-brightening correction based on a geometrical function is applied to the relevant AIA wavelengths to avoid possible biases in the predictions. In addition, we compare the usefulness of original full-disk images and synoptic maps as inputs to the CNN. We evaluate the benefit of introducing soft constraints to better predict the extremes. We utilize the self-supervised framework known as Models Genesis [2], originally developed for medical imaging. In this approach, the CNN is pre-trained using an encoder-decoder architecture designed to reconstruct original images of the Sun after artificial transformations have been applied. These transformations include non-rigid deformations and local pixel shuffling. By initializing the CNN with weights from this reconstruction task, we enable it to learn robust features that improve performance on downstream tasks.
Next, the encoder head is attached to the CNN to make predictions about the X-ray flux. To compare our work with existing classification approaches, we post-process the outputs to associate the predicted X-ray flux with the flare indices. Our goal is to benchmark this approach against state-of-the-art approaches using skill scores such as the True Skill Statistic (TSS) for categorical predictions and the Brier Skill Score (BSS) for probabilistic predictions on a hold-out test set. Finally, using eXplainable AI (XAI) [3], one can obtain statistical estimates of which active regions and parts of the solar atmosphere contribute most to the prediction of flares.
[1] Galvez, R., et al. The Astrophysical Journal Supplement Series 242.1 (2019): 7.
[2] Zhou, Z., et al. Medical Image Analysis 67 (2021): 101840.
[3] Letzgus, S., et al. IEEE Signal Processing Magazine 39.4 (2022): 40-58.
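The post-processing and scoring steps mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard GOES letter-class thresholds (in W/m^2) and the usual textbook definitions of TSS and BSS with a climatological reference; the function names are hypothetical and this is not the authors' pipeline.

```python
import numpy as np

# Lower bounds of the GOES flare classes in W/m^2 (standard convention).
CLASS_BOUNDS = [("X", 1e-4), ("M", 1e-5), ("C", 1e-6), ("B", 1e-7)]


def flux_to_class(flux):
    """Map a peak X-ray flux (W/m^2) to its GOES letter class."""
    for label, lower in CLASS_BOUNDS:
        if flux >= lower:
            return label
    return "A"


def true_skill_statistic(y_true, y_pred):
    """TSS = TP / (TP + FN) - FP / (FP + TN) for binary labels."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    return tp / (tp + fn) - fp / (fp + tn)


def brier_skill_score(y_true, p_pred):
    """BSS = 1 - BS / BS_clim, where the reference forecast is the mean event rate."""
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    bs = np.mean((p_pred - y_true) ** 2)
    bs_clim = np.mean((y_true.mean() - y_true) ** 2)
    return 1.0 - bs / bs_clim
```

For a categorical comparison, a predicted flux can be thresholded, for example `flux >= 1e-5` for "M class or above", before calling `true_skill_statistic`.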
Arcetri
Moment closure problem: equation discovery and deep learning techniques applied to kinetic simulations
George Miloshevich, Guiseppe Arró, Emanuel Jess, Francesco Carella, Sophia Köhne, and 4 more authors
We will explore the pivotal role of High-Performance Computing (HPC) and Artificial Intelligence (AI) in advancing heliophysics and space weather forecasting, drawing on recent advancements at the Centre for mathematical Plasma-Astrophysics (CmPA) that exemplify the integration of these technologies to enhance our understanding and forecasting capabilities in heliophysics. Key projects such as the Virtual Space Weather Modelling Centre (VSWMC) will be showcased, illustrating its capacity to consolidate various models and observational data into a cohesive framework for space weather prediction. The talk will address the following three core themes. (1) Integration of HPC in heliophysics and space weather modelling: demonstrating how HPC and advanced numerical techniques accelerate simulation capabilities, enabling the consideration of more physical effects (such as heating, radiation, thermal conduction, and even time-dependence) and leading to more accurate predictions with more realistic global coronal models such as COCONUT, based on the object-oriented HPC platform CoolFluid. COCONUT can be coupled with (or integrated into) heliospheric models such as Icarus and the EUropean Heliospheric FORecasting Information Asset (EUHFORIA), providing critical insights into solar wind and CME propagation and their impact on Earth’s magnetosphere. More accurate regional modelling of magnetospheres can be achieved using the implicit particle-in-cell code (iPic3D) and the energy-conserving particle-in-cell code (ECsim). (2) Automatics of SpAce exPloration (ASAP): we will explore the utility of AI and ML algorithms for on-board analysis and prediction of space weather events within the context of the EU-funded ASAP project. (3) AI-based surrogate modelling: we will discuss how high-fidelity surrogate models can be trained to potentially replace expensive physics modules, further accelerating space weather products.
ESPM
Moment closure problem: equation discovery and deep learning techniques applied to kinetic plasma simulations
George Miloshevich, Guiseppe Arró, Emanuel Jess, Francesco Carella, Sophia Köhne, and 4 more authors
In 17th European Solar Physics Meeting, Turin, Italy, Sep 2024
Reduced order modelling (ROM) plays an important role in the description of different plasma environments such as the heliosphere, the solar wind and beyond. ROMs can be obtained via analytical closures; however, such approaches are limited when distribution functions are far from Maxwellian and/or in weaker guide fields. To push the envelope of ROMs in plasmas, we apply machine learning frameworks that seek to extract the relevant terms that need to be kept in the equations for moments (EoMs), identifying terms such as anisotropic pressure in the momentum equation. This is done systematically on several datasets generated via kinetic simulations: 1D Landau damping, 2D decaying turbulence, and 2D magnetic reconnection. The sparse/symbolic regression techniques used include wSINDy and PDE-Net. We show examples of successful identification of EoMs. These approaches are compared with a multi-layer perceptron and a fully connected convolutional neural network trained to reconstruct the pressure tensor and heat flux as a function of local lower-order moments. We show that the method is successful, provided the test data come from simulations with guide fields of comparable values to at least a few runs in the training dataset. Interestingly, the accuracy of the predicted pressure tensor increases as we add extra runs corresponding to stronger guide fields. These results are promising for the development of global surrogate models for space plasmas that capture Finite Larmor Radius (FLR) effects.
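The idea behind sparse regression for equation discovery can be sketched with plain sequentially thresholded least squares, the core of the original SINDy method. This is a simplified stand-in, not wSINDy (which works with a weak, integral formulation) and not PDE-Net; the candidate library, the threshold value, and the synthetic target below are all illustrative assumptions.

```python
import numpy as np


def stlsq(theta, target, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares.

    Finds a sparse coefficient vector xi such that theta @ xi ~ target,
    where each column of theta is one candidate term.
    """
    xi = np.linalg.lstsq(theta, target, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0            # prune negligible terms
        big = ~small
        if not big.any():
            break
        # Refit using only the surviving candidate terms.
        xi[big] = np.linalg.lstsq(theta[:, big], target, rcond=None)[0]
    return xi


# Synthetic toy data (hypothetical): the "true" law is target = 2*u - 0.5*u*v.
rng = np.random.default_rng(0)
u = rng.normal(size=200)
v = rng.normal(size=200)
names = ["1", "u", "v", "u*v", "u^2", "v^2"]
library = np.column_stack([np.ones_like(u), u, v, u * v, u**2, v**2])
target = 2.0 * u - 0.5 * u * v

xi = stlsq(library, target)
active_terms = [n for n, c in zip(names, xi) if c != 0.0]
```

On this noise-free toy problem only the `u` and `u*v` columns survive the thresholding. With simulation data, the columns would instead hold candidate terms built from the plasma moments and their derivatives.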
2023
XAIDA
Sampling and forecasting extreme heatwaves using analogs and neural networks
Sampling rare events such as extreme heatwaves whose return period is larger than the length of available observations requires developing and benchmarking new simulation methods. There is growing interest in applying deep learning alongside existing statistical approaches to better generate and predict rare events. Our goal is to benchmark a Stochastic Weather Generator (SWG) [1] based on analogs of circulation, soil moisture and temperature as a tool for sampling the tails of the distribution as well as forecasting heatwaves in France and Scandinavia using data from a General Circulation Model (GCM). The analog method has been successfully implemented in rare event algorithms for low-dimensional climate models [2]. The SWG is implemented using a Markov chain with hidden states (e.g. geopotential height at 500 hPa) and a Euclidean metric. When applying such methods to climate data, two challenges emerge: a large number of degrees of freedom and the difficulty of including slow drivers such as soil moisture alongside circulation patterns. Consequently, we discuss ways of adjusting the distance metric of the analog Markov chain and dimensionality reduction techniques such as EOFs and a variational autoencoder. By choosing the correct combination of weighted variables in the Euclidean metric, using analogs from only 100 years of data, and generating long synthetic sequences, we are able to correctly estimate return times of order 7000 years, which is validated against a 7200-year-long control run. The teleconnection patterns generated in this way also look reliable compared to the control run. Next, we compare SWG forecasts of heatwaves with a direct supervised approach based on a Convolutional Neural Network (CNN). Both the CNN and the SWG are trained and validated on exactly the same GCM runs, which allows us to conclude that the CNN performs better in both regions. One could consider the SWG as a baseline approach for the CNN on this task.
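A minimal sketch of an analog Markov chain is given below, assuming a weighted Euclidean metric over the state variables and a uniform random choice among the nearest analogs. The function name, the number of neighbors, and the weights are illustrative assumptions, not the settings used in the study.

```python
import numpy as np


def analog_chain(states, n_steps, start, n_neighbors=5, weights=None, seed=0):
    """Generate a synthetic sequence of time indices by resampling analogs.

    states  : (T, d) array of historical state vectors (e.g. reduced fields)
    weights : optional (d,) weights applied inside the Euclidean metric
    """
    rng = np.random.default_rng(seed)
    states = np.asarray(states, dtype=float)
    if weights is None:
        w = np.ones(states.shape[1])
    else:
        w = np.asarray(weights, dtype=float)
    candidates = states[:-1]       # the last state has no successor
    idx = start
    path = [idx]
    for _ in range(n_steps):
        # Weighted squared Euclidean distance to every candidate analog.
        d2 = np.sum(w * (candidates - states[idx]) ** 2, axis=1)
        nearest = np.argsort(d2)[:n_neighbors]
        # Jump to the successor of a randomly chosen analog.
        idx = int(rng.choice(nearest)) + 1
        path.append(idx)
    return np.array(path)
```

The returned indices can be used to read off any observable, such as temperature, along the synthetic trajectory. With `n_neighbors=1` the chain simply replays the historical record, which is a convenient sanity check.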
CI
Stochastic weather generator and deep learning approach for predicting and sampling extreme European heatwaves
We perform probabilistic prediction of 14-day heatwaves with the aim of comparing several data-driven approaches. The methods involve a climate emulator, a deep-learning-based dimensionality reduction technique, and the combined approach. Since the study is methodological, it relies on data generated by an intermediate-complexity climate model, which allows longer runs and includes relevant soil-atmosphere interactions. A climate emulator of temperature can be designed by resampling analogs of circulation. This corresponds to a Markov chain with hidden states (e.g. geopotential height at 500 hPa), equipped with a nearest-analog-neighbor metric, which was chosen to be Euclidean in the study [1]. The analog method has been successfully implemented in rare event algorithms for low-dimensional climate models [2]. When applying such methods to general circulation models (GCMs), two challenges emerge: a large number of degrees of freedom and the difficulty of including slow drivers such as soil moisture alongside circulation patterns. Consequently, we discuss ways of adjusting the distance metric of the analog Markov chain. The method that targets both issues is dimensionality reduction, which is pursued by training a variational autoencoder [4]. Variational autoencoders approximate the probability distribution of the data and are able to draw new unseen samples. In doing so, they project the state of the system onto a low-dimensional latent space via a nonlinear transformation (a neural network). This is achieved by minimizing the reconstruction loss together with a regularization penalty that promotes a Gaussian distribution on the latent space. We explore the possibility of computing nearest analog neighbors with a Euclidean metric on this latent space.
This approach can be considered complementary to the one offered by end-to-end supervised training of a neural network [3]; in addition, it provides a generator of the dynamics and requires far fewer resources than running the GCM.
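For reference, the loss described above can be written in the standard variational autoencoder form. The notation is an assumption introduced here for illustration: $q_\phi$ is the encoder with mean $\mu_\phi$ and variance $\sigma_\phi^2$, $p_\theta$ is the decoder, and $\beta$ is a regularization weight, with $\beta = 1$ corresponding to the usual VAE. The last line is one possible latent-space analog distance.

```latex
\begin{align}
\mathcal{L}(\theta,\phi;x) &=
  \mathbb{E}_{q_\phi(z\mid x)}\!\left[-\log p_\theta(x\mid z)\right]
  + \beta\, D_{\mathrm{KL}}\!\left(q_\phi(z\mid x)\,\middle\|\,\mathcal{N}(0,I)\right), \\
D_{\mathrm{KL}} &= \frac{1}{2}\sum_{j=1}^{k}
  \left(\mu_{\phi,j}^2(x) + \sigma_{\phi,j}^2(x) - \log\sigma_{\phi,j}^2(x) - 1\right), \\
d(x,x') &= \left\lVert \mu_\phi(x) - \mu_\phi(x') \right\rVert_2 .
\end{align}
```

The second line is the closed form of the penalty for a diagonal Gaussian encoder with $k$ latent dimensions. Using the encoder means in the distance is one natural choice, not necessarily the one adopted in the study.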
EGU
Probabilistic forecasting of heat waves with deep learning
George Miloshevich, Valerian Jacques-Dumas, Pierre Borgnat, Patrice Abry, and Freddy Bouchet
In EGU General Assembly Conference Abstracts, May 2022
One of the big challenges today is to appropriately describe heat waves, which are relevant due to their impact on human society. Common characteristics in mid-latitudes involve meanders of the westerly flow and concomitant large anticyclonic anomalies of the geopotential field. These anomalies form the so-called teleconnection patterns, and thus it is natural to ask how robust such structures are in various models and how much data we require to make statistically significant inferences. In addition, it is natural to ask which precursor phenomena would improve the forecasting of heat waves. In particular, what long-term effect does soil moisture have, and how does its quantitative contribution to predictability compare with that of the teleconnection patterns? To answer these questions, we perform various types of regression on a climate model. We construct composite maps of the geopotential height at 500 hPa and estimate return times of heatwaves of different severity. Of particular interest to us is the committor function, which is essentially the probability that a heat wave occurs given the current state of the system. Committor functions can be efficiently computed using the analogue method, which involves learning a Markov chain that produces synthetic trajectories from the real trajectories. Alternatively, they can be estimated using a machine learning approach. Finally, we compare the composite maps in the real dynamics to the ones generated by the Markov chain and observe how well the rare events are sampled, for instance to allow extending the return time plots.
CI
Predicting probability of heat waves using a CNN
George Miloshevich, Patrice Abry, Pierre Borgnat, and Freddy Bouchet
In RAS Specialist Discussion Meeting: "Future Solar and Heliospheric Assets for Space Weather Prediction: Instruments, Modelling and Machine-Learning", May 2022
Deep Neural Networks are rapidly gaining a foothold in Earth Sciences and elsewhere, e.g. in surrogate modeling. These developments serve the interests of both weather prediction and climate modeling. Among the various difficulties is reducing uncertainties in future climatologies and in meteorological forecasting of extreme events such as heat waves and droughts. By nature, data are scarce for rare events, and so their study is a major challenge. A convolutional neural network is trained on climate model data to predict heat waves. It is constructed with the goal of combining global information coming from teleconnections of the geopotential with the much more localized signal of soil moisture, which generally correlates with heat waves. The main question is how much predictive information can be extracted from geophysical fields using this set-up and how this depends on the size of the dataset. Furthermore, the issue of the smoothness of the prediction along a given trajectory arises and is addressed with transfer learning.
2021
APS
New ways for dynamical prediction of extreme heat waves: rare event simulations and stochastic process-based machine learning.
Freddy Bouchet, Francesco Ragone, Dario Lucente, George Miloshevich, and Corentin Herbert
In Bulletin of the American Physical Society, Apr 2021
In the climate system, extreme events or transitions between climate attractors are of primary importance for understanding the impact of climate change. Recent extreme heat waves with huge impact are striking examples. However, they cannot be studied with conventional approaches, because they are too rare and realistic models are too complex. We will discuss several new algorithms and theoretical approaches, based on large deviation theory, rare event simulations, and machine learning for stochastic processes, which we have specifically designed for the prediction of the committor function (the probability that the extreme event occurs). We will discuss results for the study of midlatitude extreme heat waves and demonstrate the performance of these tools. Using the best available climate models, our approach sheds new light on the fluid mechanics processes which lead to extreme heat waves. We will describe quasi-stationary patterns of turbulent Rossby waves that lead to global teleconnection patterns in connection with heat waves, and analyze their dynamics. We stress the relevance of these patterns for recently observed extreme heat waves with huge impact and the prediction potential of our approach.
EGU
Predicting extreme events using dynamics based machine learning.
Dario Lucente, George Miloshevich, Corentin Herbert, and Freddy Bouchet
In EGU General Assembly Conference Abstracts, Apr 2021
Many phenomena in the climate system lie in the gray zone between weather and climate: they are not amenable to deterministic forecast, but they still depend on the initial condition. A natural example is medium-range forecasting, which is inherently probabilistic because it lies beyond the predictability time of the atmosphere. Similarly, one may ask for the probability of occurrence of an El Niño event several months ahead of time, or the probability of occurrence of a heat wave a few weeks in advance, based on the observed atmospheric circulation. In this talk, we introduce a quantity which corresponds precisely to this type of prediction problem: the committor function is the probability for an event to occur in the future, as a function of the current state of the system. In the first part of this presentation, we explain the main mathematical properties of this probabilistic concept, and compute it in the case of a low-dimensional stochastic model for El Niño, the Jin and Timmermann model. This example allows us to show that the ability to predict the probability of occurrence of the event of interest may differ strongly depending on the initial state: in some regions of phase space, the committor function is smooth (intrinsic probabilistic predictability) and in some other regions, it depends sensitively on the initial condition (intrinsic probabilistic unpredictability). We stress that this predictability concept is markedly different from the deterministic unpredictability arising because of chaotic dynamics and exponential sensitivity to initial conditions. The second part of the talk is about how to efficiently compute the committor function from data through several data-driven approaches, such as direct estimates, kernel-based methods and neural networks. We discuss two examples: a) the computation of the committor function for the Jin and Timmermann model, b) the computation of the committor function for extreme heat waves.
Both systems are highly nonlinear but, given their very different dimensionality, their levels of complexity differ profoundly. This allows us to explore and discuss the performance and limits of the different methods proposed. Finally, we propose a method for learning effective dynamics by introducing a Markov chain on the data. Using the Markov chain, we are able to quickly and easily compute many interesting quantities of the original system, including the committor function. The goal is to overcome some of the limitations of the methods introduced previously and to develop a robust algorithm that remains useful even when data are scarce.
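Once a finite-state Markov chain is available, a committor can be obtained by solving a small linear system. The sketch below uses the classical first-passage definition (probability of reaching a target set B before a set A) on a toy random walk; the transition matrix and sets are hypothetical, and the finite-horizon committor relevant for heat waves would need a time-dependent variant.

```python
import numpy as np


def committor(P, set_a, set_b):
    """Return q with q[i] = probability of hitting set_b before set_a from state i.

    P is a row-stochastic transition matrix. On the interior states the
    committor satisfies q = P q, with q = 0 on set_a and q = 1 on set_b.
    """
    n = P.shape[0]
    a_states = sorted(set_a)
    b_states = sorted(set_b)
    q = np.zeros(n)
    q[b_states] = 1.0
    boundary = set(a_states) | set(b_states)
    interior = [i for i in range(n) if i not in boundary]
    if interior:
        # (I - P_II) q_I = P_IB 1
        lhs = np.eye(len(interior)) - P[np.ix_(interior, interior)]
        rhs = P[np.ix_(interior, b_states)].sum(axis=1)
        q[interior] = np.linalg.solve(lhs, rhs)
    return q


# Toy example: symmetric random walk on states 0..4 with absorbing ends.
P = np.zeros((5, 5))
for i in range(1, 4):
    P[i, i - 1] = P[i, i + 1] = 0.5
P[0, 0] = P[4, 4] = 1.0

q = committor(P, set_a={0}, set_b={4})
```

For this walk the result grows linearly from 0 at state 0 to 1 at state 4, the familiar gambler's-ruin solution, which makes it an easy check before applying the same solver to a chain learned from data.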
EGU
Drivers of midlatitude extreme heat waves revealed by analogues and machine learning
George Miloshevich, Dario Lucente, Corentin Herbert, and Freddy Bouchet
In EGU General Assembly Conference Abstracts, Apr 2021
One of the big challenges today is to appropriately describe heat waves, which are relevant due to their impact on human society. Common characteristics in mid-latitudes involve meanders of the westerly flow and concomitant large anticyclonic anomalies of the geopotential field. These anomalies form the so-called teleconnection patterns, and thus it is natural to ask how robust such structures are in various models and how much data we require to make statistically significant inferences. In addition, it is natural to ask which precursor phenomena would improve the forecasting of heat waves. In particular, what long-term effect does soil moisture have, and how does its quantitative contribution to predictability compare with that of the teleconnection patterns? To answer these questions, we perform various types of regression on a climate model. We construct composite maps of the geopotential height at 500 hPa and estimate return times of heatwaves of different severity. Of particular interest to us is the committor function, which is essentially the probability that a heat wave occurs given the current state of the system. Committor functions can be efficiently computed using the analogue method, which involves learning a Markov chain that produces synthetic trajectories from the real trajectories. Alternatively, they can be estimated using a machine learning approach. Finally, we compare the composite maps in the real dynamics to the ones generated by the Markov chain and observe how well the rare events are sampled, for instance to allow extending the return time plots.
2020
UCA
Imbalanced collisionless Alfvén wave turbulence and the inverse cascade of the generalized cross-helicity
George Miloshevich, Thierry Passot, Pierre-Louis Sulem, and Dimitri Laveder
The direction of cascades in a two-dimensional model that takes electron inertia and the ion sound Larmor radius into account is studied, resulting in analytical expressions for the absolute equilibrium states of the energy and helicities. It is found that typically both the energy and the magnetic helicity have a direct cascade at scales shorter than the electron skin depth, while at large scales the helicity has an inverse cascade, as established earlier for reduced magnetohydrodynamics (MHD). It is also found that the introduction of gyro-effects allows for the existence of negative temperature (conjugate to energy) states and the condensation of energy to the large scales. Comparisons between two- and three-dimensional extended MHD models (MHD with two-fluid effects) show qualitative agreement between the two.
APS DPP
Analytical and numerical evidence of the cascade reversal due to electron inertia.
George Miloshevich, Santiago J Benavides, Philip J. Morrison, and Emanuele Tassi
In Bulletin of the American Physical Society, Nov 2018
Astrophysical plasmas exist over a large range of length scales throughout the universe. At sufficiently small scales, one must account for many two-fluid effects, such as the ion or electron skin depths, as well as Larmor radii. These effects arise when, for instance, ignoring the electron mass is no longer possible. We are interested in studying homogeneous turbulence in the context of such plasma models. In particular, we look at a 2D extended MHD model, where the effect of electron inertia may be non-negligible. This model has been applied to understanding collisionless reconnection in the past. Two-dimensional simulations are less computationally intensive and thus allow us to perform a parameter study of many runs, in which we look at the cascade of conserved quadratic quantities as we vary the effective electron skin depth. We find that the cascade directions depend strongly on whether the length scale is relevant in the system and, furthermore, that the transition in cascade directions happens in a critical way, as was previously observed in similar studies of different systems.
AAPPS
Relativistic Extended Magnetohydrodynamics: action formalism and physical properties.
There exists a wide class of systems that exhibit non-ideal effects such as Hall drift and electron inertia. The latter plays a role on characteristic length scales smaller than the electron skin depth. To gain relevant understanding, it is necessary to work with models such as extended MHD (XMHD) that capture these microscopic effects. XMHD is endowed with topological invariants: two helicities emerging from the Hamiltonian structure and useful for the Hamiltonian Energy-Casimir method [1]. In MHD turbulence, the inverse cascade of magnetic helicity is often invoked to explain dynamo action. However, we predict [2] analytically that the phenomenon is suppressed at the electron skin depth, i.e. it appears that the cascade reverses direction. The ongoing investigations focus on a simplified 2D case, which is more amenable to numerical analysis. The analytical results reveal behavior similar to the 3D cascade reversal, so we are confident that our 2D case study should be representative.
2016
APS DPP
On the cascade reversal at the electron skin depth.
Extended MHD, a 1-fluid model endowed with 2-fluid effects (electron inertia and Hall drift), possesses a Hamiltonian structure [1-4]. This formulation is described, as it unifies different classes of extended MHD models (including those that have mutually exclusive effects) [2]. The unification is further highlighted by showing that these models possess common topological invariants that are generalizations of the fluid/magnetic helicity [3]. They can be expressed naturally in a knot-theoretic framework via the Jones polynomial by exploiting techniques from Chern-Simons theory. It is also shown that extended MHD exhibits other commonalities, such as generalized Kelvin circulation theorems and the existence of two Lie-dragged 2-forms closely connected with generalizations of the fluid vorticity.
2015
APS
Common Hamiltonian structure and concomitant topological invariants for extended magnetohydrodynamics models.
Extended magnetohydrodynamics (XMHD) includes 2-fluid effects, such as electron inertia and the Hall drift, that are absent in ideal MHD. The Hamiltonian structure of the XMHD models (Hall MHD, inertial MHD [3] and full XMHD) is presented [1]. The existence of elegant variable transformations that map every XMHD model to a common noncanonical Poisson bracket is highlighted [2]. The bracket is used to derive the existence of two unique helicities (Casimir invariants) for these models, each of which exhibits close similarities with the magnetic and fluid helicities [1,2]; this is highly significant, as the latter are important topological invariants. The Lagrangian origins of the helicities and variable transforms, and avenues for future work, are outlined.