PASTA is a joint research team between Inria Research Center at Université de Lorraine, CNRS and Université de Lorraine, located at Institut Élie Cartan de Lorraine.
PASTA aims to construct and develop new methods and techniques by promoting and interweaving stochastic modeling and statistical tools to integrate, analyze and enhance real data.
The specificity and the identity of PASTA are:
The leading direction of our research is to develop the topic of data enriched spatio-temporal stochastic models, through a mathematical perspective. Specifically, we jointly leverage major tools of probability and statistics: data analysis and the analytical study of stochastic processes. We aim at exploring the three different aspects, namely: shape, time and environment, of the same phenomenon. These mathematical methodologies will be intended for solving real-life problems through inter-disciplinary and industrial partnerships.
Our research program develops three interwoven axes:
In particular, we are interested in stochastic dynamical systems evolving in intricate configuration spaces. These configuration spaces could be spatial positions, graphs, physical spaces with singularities, spaces of measures, spaces of chemical compounds, and so on.
When facing a new modeling question, we have to construct the appropriate class of models among
what we call the meta-models.
Meta-models and then models are selected according to the properties to be simulated or inferred
as well as the objectives to be reached.
Among other examples of such meta-models which we regularly use, let us mention
Markov processes (diffusion, jump, branching processes), Gibbs measures, and random graphs.
On these topics, the team has an intensive research experience from different perspectives.
Finding the balance between usability, interpretability and realism is our first guide. This is the keystone in modeling, and the main difference with black-box approaches in machine learning. Our second guide is to study the related mathematical issues in modeling, simulation and inference. Models are sources of interesting open mathematical questions. We are eager to expand the “capacity” of the models by exploring their mathematical properties, providing simulation algorithms or proposing more efficient ones, as well as new inference procedures with statistical guarantees.
To study and apply the class of stochastic models we have to handle the following questions:
Our main application domains are: economy, geophysics, medicine, astronomy and digital humanities.
We aim at providing new tools regarding the modeling, simulation and inference of spatio-temporal stochastic processes and other dynamical random systems living in large state spaces. As such, there are many application domains which we consider.
In particular, we have partnerships with practitioners in: cosmology, geophysics, healthcare systems, insurance, and telecom networks.
We detail below our actions in the most representative application domains.
Geophysics is a domain which requires the application of a broad range of mathematical tools related to probability and statistics while more and more data are collected. There are several directions in which we develop our methodology in relation with practitioners in the field.
On such topics, we hold long-standing interdisciplinary collaborations with INRAE Grenoble, the RING Team (GeoRessources, Université de Lorraine), and IMAR (Institute of Mathematics of the Romanian Academy) in Bucharest.
We have longstanding and continuous cooperation with astronomers and cosmologists in France, Spain and Estonia. In particular, we are interested in using spatial statistics tools to detect galaxies and other star patterns, such as filaments. Such developments require us to design specific point processes giving appropriate morpho-statistical distributions, as well as specific inference algorithms that are based on Monte Carlo simulations and are able to handle large volumes of data.
Graphs are essential to model complex systems such as the relations between agents, the spatial distribution of points that are connected such as stars, the connections in telecommunication networks, and so on. We develop various directions of the study of random graphs that are motivated by a large class of applications:
We have longstanding collaborations on these topics with Agence de Biomédecine (ABM), Le Foyer (insurance company, Luxembourg), INRAE (Avignon), Dyogene (Inria Paris), Lip 6, UTC, LORIA (computer science laboratory, Nancy), University of Buenos Aires, Northwestern University and LAAS (CNRS, Toulouse).
Digital Humanities represents an interdisciplinary field of research.
We are interested in developing
suitable, automatic tools to help experts to study the ideas
contained in antique texts. Together with historians of antiquity, we consider
one of the founding texts of political sciences, the Politics of
Aristotle. To this end, we combine techniques
from the history of antiquity, machine learning, and statistics. This also
raises technological challenges in developing suitable tools
to load and manipulate the data.
This research is supported by the Inria Exploratory Research Action Apollon and involves collaboration with researchers from Archimède (Universities of Strasbourg and Haute-Alsace), IRIMAS and CRESAT (Université de Haute-Alsace) and the University of Pavia.
Madalina Deaconu was plenary speaker at the 14th International Conference on Monte Carlo Methods and Applications held in Paris in June 2023.
Our survey article 16, published in Probability Surveys, is one of the first works to introduce a large spectrum of stochastic models that can be used to analyse fragmentation phenomena. It also gives a simple and efficient numerical algorithm to simulate these processes.
We have a strong interest in the fragmentation equation for understanding snow or rock avalanches. Our point of view is to explore the probabilistic representations of transport equations in this framework as well as the possibilities they offer. We developed a new stochastic process that represents the typical evolution of the mass of a rock or of a snow aggregate subject to successive random breakages.
In a survey article 16, we present various probabilistic representations of the fragmentation equation, and show how they are connected. We focus on the stochastic process which represents the evolution of the mass of a typical particle subject to a fragmentation process. These probabilistic representations range from Markov chains to Stochastic Differential Equations with jumps. In particular, we show how these representations lead to easy numerical simulations.
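To give a flavour of how simple such simulations can be, here is a minimal sketch of a Markov jump representation of the mass of a typical (tagged) particle. It is not the algorithm of the survey 16: the breakage rate `m ** alpha` and the uniform binary splitting are assumptions made purely for illustration, and the function name and parameters are hypothetical. At each breakage the particle of mass m splits into U·m and (1 − U)·m with U uniform, and the tagged fragment is chosen with probability proportional to its mass.

```python
import random


def simulate_tagged_mass(m0=1.0, alpha=1.0, t_max=5.0, seed=0):
    """Simulate the mass of a tagged particle undergoing random breakages.

    Illustrative assumptions (not taken from the report):
      * a particle of mass m breaks at rate m ** alpha;
      * each breakage is binary with a uniform split U, 1 - U;
      * the tagged fragment is picked by size-biased sampling.
    Returns the list of (time, mass) pairs at the jump times up to t_max.
    """
    rng = random.Random(seed)
    t, m = 0.0, m0
    path = [(t, m)]
    while True:
        rate = m ** alpha
        t += rng.expovariate(rate)  # exponential waiting time before breakage
        if t > t_max:
            break
        u = rng.random()
        # Size-biased choice: keep the piece of relative size u with probability u.
        frac = u if rng.random() < u else 1.0 - u
        m *= frac
        path.append((t, m))
    return path


if __name__ == "__main__":
    for time, mass in simulate_tagged_mass(seed=42):
        print(f"t = {time:.3f}, mass = {mass:.5f}")
```

With `alpha > 0` small particles break more slowly, so only finitely many breakages occur before `t_max`; other choices of rate or splitting law would only change the two lines that draw the waiting time and the fraction.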
Further, with Gaetano Agazzotti (former intern in the team), we have studied the evolution of the moments of a fragmentation equation from an analytic viewpoint and obtained their asymptotic behavior, expanding on his Master's thesis 37.
We have also explored, using machine learning techniques, the median size of rocks in blasting operations. In particular, we have used ensemble techniques to combine various approaches to predict the median size 36.
The numerical approximation of stochastic differential equations (SDEs), and in particular the design of new methodologies to approximate hitting times of SDEs, is a challenging problem that matters for a large class of practical issues arising in geophysics, finance, insurance, biology, etc.
With Samuel Herrmann (University of Burgundy) we made important progress on this topic by developing new methods. One main result concerns a new technique for the path approximation of one-dimensional stochastic processes 15. Our method applies to the Brownian motion and to some families of stochastic differential equations whose distributions could be represented as a function of a time-changed Brownian motion (usually known as
We also develop new techniques for the path approximation of Bessel processes of arbitrary dimension, as such a process represents the norm of a multi-dimensional Brownian motion 14. Our approach jointly constructs the sequences of exit times and corresponding exit positions from some well-chosen domains, the construction of these domains being an important step. We construct the algorithm for any dimension and treat the integer-dimension case and the non-integer case separately, each situation requiring appropriate techniques. We prove the convergence of the scheme and control its efficiency with respect to the parameter ε. We complement the theoretical part with a series of numerical developments.
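The identity underlying this work, namely that a Bessel process of integer dimension d is the norm of a d-dimensional Brownian motion, can be illustrated with a few lines of code. The sketch below is only a naive baseline that samples the process on a fixed time grid; it is not the exit-time algorithm of 14 and does not cover non-integer dimensions. The function name and its parameters are hypothetical.

```python
import math
import random


def bessel_on_grid(dim=3, x0=1.0, t_max=1.0, n_steps=100, seed=0):
    """Sample a Bessel process of integer dimension `dim` on a regular grid.

    The process is obtained as the Euclidean norm of a `dim`-dimensional
    Brownian motion started at (x0, 0, ..., 0), with x0 >= 0. Brownian
    increments over a step dt are exact Gaussians with variance dt, so the
    values at the grid times have the correct joint distribution.
    """
    rng = random.Random(seed)
    dt = t_max / n_steps
    sd = math.sqrt(dt)
    pos = [x0] + [0.0] * (dim - 1)
    radii = [x0]
    for _ in range(n_steps):
        pos = [c + rng.gauss(0.0, sd) for c in pos]
        radii.append(math.sqrt(sum(c * c for c in pos)))
    return radii


if __name__ == "__main__":
    path = bessel_on_grid(dim=3, x0=1.0, seed=1)
    print(f"final radius: {path[-1]:.4f}, minimum on grid: {min(path):.4f}")
```

A grid-based sample says nothing about what happens between grid times, which is precisely the limitation that approaches based on exit times and exit positions are designed to address.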
Together with Samuel Herrmann (University of Burgundy) and Cristina Zucca (University of Torino) we pursued our work on the exact simulation of the hitting times of multi-dimensional diffusions. A one-week workshop meeting was held in Torino in November.
In collaboration with Benoit Nieto (École Centrale Lyon) we consider multi-regime CKLS (Chan–Karolyi–Longstaff–Sanders) dynamics (including the Cox-Ingersoll-Ross model) and we study parameter estimation from high-frequency observations, extending the published work 20 on the threshold Vasicek model. The model fits well the behavior of financial markets during crisis periods.
In a collaboration with Paolo Pigato (University Tor Vergata, Roma), we study new estimators, based on low-frequency observations, for the parameters of multi-regime threshold models that exhibit mean-reversion features.
Together with Alexis Anagnostakis (LJK Grenoble), we are extending our respective results on high-frequency approximation of the local time of sticky-oscillating-skew diffusion processes. The purpose is to estimate the parameters of stickiness and/or skewness and to model some critical behaviors in financial markets related to crisis periods. Inspired by a previous work 43, we extend the results obtained during the PhD thesis of Alexis Anagnostakis 38 in the context of sticky Brownian motion to more general estimators of local time and to oscillating-skew-sticky Brownian motion. This work is in its final stage of editing. Our main goal is now to obtain rates of convergence for sticky diffusions and thus extend the results in 43.
We are continuing our work on an expansion of the maximum likelihood estimator using formal series expansions 41. The aim of this work is to understand the lack of Gaussianity in the non-asymptotic regime.
In 19, we apply this expansion to the estimator of the skewness parameter of a skew Brownian motion, whose asymptotic mixed normality is also proved with a rate of convergence of order
With Géraldine Pichot (Serena, Inria Paris), Giovanni Michele Porta and Elisa Baioni (Politecnico di Milano), we have provided an extension of a Monte Carlo method that allows for the simulation of a diffusion process in a one-dimensional discontinuous medium. Using the method of images, the extension consists in finding an approximation of the fundamental solution associated with the process that is suitable for fast simulation. Our method may be applied to situations in which both the solution and its gradient are discontinuous at some point. In particular, we may consider the case of the Fourier equation with discontinuous coefficients 26, 27.
Together with Alexis Anagnostakis and Pierre Etoré (LJK Grenoble) we are addressing several questions about the non-uniqueness of solutions to stochastic differential equations whose diffusion coefficient admits jumps and can become negative. We tackle a conjecture that has been open since the 1980s. We have obtained a partial answer and we are investigating the link with sticky-skew diffusions.
Studying geological fluids mixing systems allows us to understand the interaction between water sources. The Hug model is an interaction point process model that can be used to estimate the number and the chemical composition of the water sources involved in a geological fluids mixing system from the chemical composition of samples 44. In 24, we construct priors for the parameters of the Hug model using the ABC shadows algorithm 45. The long term perspective of this work is to integrate geological expertise within fully unsupervised models.
This work is a collaboration with Didier Gemmerlé (IECL, Université de Lorraine) and Antonin Richard (GeoRessources, Université de Lorraine).
Marked point processes and Bayesian inference are powerful tools for analysing spatial data. With respect to a previous work 40, we proposed a new inhomogeneous point process with superposed interaction. The results indicate a correct fit of the model and allow the study of the significance of the parameter at the corresponding prefixed interaction ranges 23. This work is a collaboration with Didier Gemmerlé (IECL, Université de Lorraine).
With Jenny Sorce (Cristal, Lille) and Elmo Tempel (Tartu Observatory, University of Tartu), we develop in 21, 25 a new algorithm based on an object point process model that can reduce biases and uncertainties in the measurement of peculiar velocities of galaxies. The algorithm uses simulated annealing to maximize the probability density of the point process model, resulting in bias-minimized catalogs. We conducted tests on synthetic catalogs mimicking the second and third distance modulus catalogs of the Cosmicflows project from which peculiar velocity catalogs are derived. By reducing the local peculiar velocity variance in catalogs by an order of magnitude, the algorithm permits the recovery of the expected one, while preserving the small-scale velocity correlation. The expected clustering was also retrieved. The resulting bias-minimized catalogs allowed for the recovery of expected statistical properties and the reconstruction of large-scale structures that matched well with existing redshift surveys of local galaxies. These bias-minimized catalogs can be used for various cosmological studies and simulations of the local Universe.
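The optimization principle mentioned above, simulated annealing applied to a probability density, can be sketched on a toy problem. The example below is not the algorithm of 21, 25: the one-dimensional bimodal `log_density`, the Gaussian proposal and the geometric cooling schedule are all assumptions chosen for illustration, standing in for the object point process density and its dedicated moves.

```python
import math
import random


def log_density(x):
    # Toy bimodal log-density (illustrative assumption): a local maximum
    # near x = -1.9 and a higher, global maximum near x = 2.1.
    return -((x * x - 4.0) ** 2) / 4.0 + x


def simulated_annealing(x0, n_iter=5000, t0=1.0, cooling=0.999, step=0.5, seed=0):
    """Maximize log_density by simulated annealing.

    At temperature `temp`, a Metropolis step targets the tempered density
    p(x) ** (1 / temp); the temperature is decreased geometrically so that
    the chain concentrates on the maximizers of p.
    Returns the best state visited and its log-density.
    """
    rng = random.Random(seed)
    x, fx = x0, log_density(x0)
    best, fbest = x, fx
    temp = t0
    for _ in range(n_iter):
        y = x + rng.gauss(0.0, step)  # symmetric random-walk proposal
        fy = log_density(y)
        if fy >= fx or rng.random() < math.exp((fy - fx) / temp):
            x, fx = y, fy
            if fx > fbest:
                best, fbest = x, fx
        temp *= cooling
    return best, fbest


if __name__ == "__main__":
    x_best, f_best = simulated_annealing(x0=-2.0, seed=3)
    print(f"best state: {x_best:.3f}, log-density: {f_best:.3f}")
```

The early, high-temperature iterations let the chain cross the barrier between the two modes, while the late, low-temperature iterations behave almost like a greedy ascent.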
Faults are crucial subsurface features that significantly influence the mechanical behavior and hydraulic properties of rock masses. Interpreting them from seismic data may lead to various scenarios due to uncertainties arising from limited seismic bandwidth and possible imaging errors. Only a few methods addressing fault uncertainties can produce curved and sub-seismic faults at once while quantitatively honoring seismic images and avoiding anchoring in a reference interpretation.
With Fabrice Taty-Moukati, François Bonneau, Guillaume Caumon (GeoRessources, Université de Lorraine) and Xinming Wu (Hefei, University of Science and Technology of China), we use a mathematical framework, namely the Candy model, of marked point processes with interactions to approximate fault networks in two dimensions with a set of line segments. The novelty of this approach lies in using the input image of fault probabilities computed by a Convolutional Neural Network (CNN). The Metropolis-Hastings algorithm is used to generate various scenarios of fault network configurations, thereby exploring the model space and reflecting the uncertainty. The empty space function produces a ranking of the generated fault networks against an existing interpretation by testing and quantifying their spatial variability. The approach is applied to two-dimensional sections of seismic data acquired in the Central North Sea.
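As an illustration of how a Metropolis-Hastings sampler explores configurations of an interacting point process, here is a minimal birth-death sketch. It deliberately replaces the Candy model of line segments by a much simpler stand-in, the Strauss point process on the unit square, and uses no seismic or CNN input; the function names and parameter values are hypothetical. The density with respect to the unit-rate Poisson process is taken proportional to beta^n · gamma^s, where n is the number of points and s the number of pairs closer than r.

```python
import math
import random


def n_close(u, points, r):
    """Number of points of `points` at distance less than r from u."""
    return sum(1 for p in points if math.dist(u, p) < r)


def strauss_mh(beta=50.0, gamma=0.5, r=0.1, n_iter=20000, seed=0):
    """Birth-death Metropolis-Hastings sampler for a Strauss process on [0, 1]^2.

    Each iteration proposes, with probability 1/2 each, either the birth of a
    uniformly placed point or the death of a uniformly chosen existing point.
    The chain starts from the empty configuration.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n_iter):
        n = len(points)
        if rng.random() < 0.5:
            # Birth: acceptance ratio beta * gamma^s(u) * |W| / (n + 1), |W| = 1.
            u = (rng.random(), rng.random())
            ratio = beta * gamma ** n_close(u, points, r) / (n + 1)
            if rng.random() < ratio:
                points.append(u)
        elif n > 0:
            # Death: inverse of the corresponding birth ratio.
            i = rng.randrange(n)
            rest = points[:i] + points[i + 1:]
            ratio = n / (beta * gamma ** n_close(points[i], rest, r))
            if rng.random() < ratio:
                points = rest
    return points


if __name__ == "__main__":
    config = strauss_mh(seed=1)
    print(f"{len(config)} points in the final configuration")
```

Running the sampler with different seeds, or keeping configurations at regular intervals along one long run, yields a collection of scenarios. Setting `gamma=0.0` gives a hard-core process in which no two points are closer than `r`.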
During geological exploration, the interpretation of faults can be ambiguous and uncertain because of disparate and often sparse observations such as fault traces on 2D seismic images or outcrops. With Amandine Fratani, Guillaume Caumon and Jéremie Guirad (GeoRessources, Université de Lorraine), we propose a hypergraph formalism to generalize the higher-order interactions between fault observations in the multiple-point association problem, by dealing with the likelihood of multiple-point fault data association. A machine learning approach is proposed to enhance or replace the expert geological rules in determining the likelihood of multiple-point fault data association. This involves training a supervised classifier using fault features extracted from known 3D geological models. By formulating the problem as a classification task, the model can determine the probability that fault observations belong to the same fault objects. We also develop a specific technique to prevent overfitting. Overall, this hypergraph-based approach with machine learning integration provides a more advanced and flexible method for interpreting and associating fault observations in geological exploration.
Fractures form systems of complex mechanical discontinuities that dramatically impact the physical behavior of rock masses. With François Bonneau and Guillaume Caumon (GeoRessources, Université de Lorraine), we use the mathematical framework of marked point processes to approximate fracture networks in two dimensions with a collection of straight-line segments. Whereas most fracture characterization and modeling focuses on first-order statistics (density and mark distribution) and assumes independent fractures, we focus on fracture interactions by providing stochastic mathematical models involving simple pairwise interactions between fractures to capture key aspects of fracture network geometry and organization. The model is calibrated by maximum likelihood, which we apply to real data from the Oman mountains.
With Lucian Beznea (IMAR, Bucharest) and Oana Lupaşcu-Stamate (Institute of Mathematical Statistics and Applied Mathematics, Bucharest) we are developing a stochastic approach for the two-dimensional Navier-Stokes equation in a bounded domain. More precisely, we consider the vorticity equation and construct a specific non-local branching process. This approach is new and, if successful, could lead to important advances, as it would also result in a new numerical algorithm.
In particular, we obtained several results on the construction of a duality (time reversal) process and on the development of a numerical algorithm based on a non-local branching process, in which the creation and disappearance of particles mimic the physics of the vorticity in the boundary layer.
We have made various advances in the analysis and optimization of stochastic matching models:
We have focused on the study and optimization of online algorithms on large random graphs, which are known to be prevalent in practice, for various applications such as job/housing allocation, online adverts, and so on.
In 13, in collaboration with Eustache Besançon (Telecom Paris), Laurent Decreusefond (Telecom Paris) and Laure Coutin (Université Paul Sabatier, Toulouse), we have shown universal bounds for the speed of convergence in the functional Central Limit Theorems for Lipschitz continuous functionals of Poisson random measures. As a by-product, we deduce similar bounds for Continuous Time Markov Chains (CTMCs), by using various tools of stochastic analysis, among which are Malliavin calculus for point processes and the Stein method in infinite dimension. These results allow us to characterize the accuracy (and thereby the confidence interval) of diffusion approximations of many practical processes appearing in epidemiology (Susceptible, Infectious, or Recovered [SIR] processes), biology of development (Moran process), and telecom networks (queueing processes and the Telegraph process).
Madalina Deaconu has been Deputy Head of Science of Inria Centre at Université de Lorraine since January 2022. She is also, at the national level, a member of the Evaluation Commission of Inria.
She is also a member of the Bureau du Comité de Projets and the Comité des Projets of Inria Centre at Université de Lorraine.
Nathan Gillot is the organizer of the PhD Student Seminar of Institut Élie Cartan de Lorraine (IECL).
He is also the PhD representative on the library committee of Institut Élie Cartan de Lorraine (IECL).
Antoine Lejay is a member of the board of AMIES.
He is also the Head of the Fédération Charles Hermite for 2023,
a research federation within CNRS and Université de Lorraine,
gathering three research laboratories: CRAN (control theory), IECL (mathematics) and LORIA
(computer science) with the goal of creating interdisciplinary projects.
He is also co-head of the COMIPERS, which is the local hiring committee for PhD and post-doctoral students at Centre Inria de l'Université de Lorraine.
Sara Mazzonetto is a member of the Committee of Equal Opportunities of IECL.
She was also a member of the Hiring Committee for an assistant professor position, Université Claude Bernard, Lyon, 2023.
Pascal Moyal is Head of the Probability and Statistics team at the Institut Élie Cartan de Lorraine (IECL) (2022-). As such, he is a member of the Laboratory Council of IECL.
He is also the Head of the Master 2 Ingénierie Mathématique et Sciences des Données at Université de Lorraine.
Sara Mazzonetto is an assistant professor; Pascal Moyal and Radu Stoica are professors. They have full teaching duties with lectures at all levels of the university. For them, we mention here only lectures at Master 1 and Master 2 levels as well as responsibilities.