The STACK team addresses challenges related to the management and advanced uses of the Cloud to IoT continuum (infrastructures spanning the Cloud, Fog, Edge, and IoT). More specifically, the team is interested in delivering appropriate system abstractions to operate and use massively geo-distributed infrastructures, from the lowest to the highest levels of abstraction (i.e., system to application development, respectively), and in addressing crosscutting dimensions such as energy or security. These infrastructures are critical for the emergence of new kinds of applications related to the digitalization of the industry and the public sector (a.k.a. the Industrial Internet, smart cities, e-medicine, etc.).
Initially proposed to interconnect computers worldwide, the Internet has significantly evolved to become, in two decades, a key element in almost all our activities. This (r)evolution mainly relies on the progress achieved in the computation and communication fields, which in turn has led to the well-known and widespread Cloud Computing paradigm. Nowadays most Internet exchanges occur between endpoints ranging from small-scale devices, such as smart-phones, to large-scale facilities, i.e., cloud computing platforms, in charge of hosting modern information systems.
With the emergence of the Internet of Things (IoT), stakeholders expect a new revolution that will push, once again, the limits of the Internet, in particular by favouring the convergence between physical and virtual worlds into an augmented world or cyber-physical world.
This convergence is about to be made possible thanks to the development of minimalist sensors as well as complex industrial physical machines that can be connected to the Internet through edge computing infrastructures. Edge computing is an extension of the cloud computing model that consists in deploying a federation (or cooperation) of smaller data centers at the edge of the network, thus closer to sensors, devices, machines, and end-users that produce and consume data 76, 74.
This new kind of digital infrastructure, which covers resources from the “center” to the extreme edge of the network, is expected to improve almost all aspects of daily life and the decision processes in various domains such as industry, transportation, health, training and education. The corresponding applications target the control and optimization of the business processes of most companies thanks to the intensive use of ICT systems and real-time data collected by geo-distributed physical devices (video, sensors, ...).
Among the obstacles to this new generation of Internet services is the development of a convenient and powerful software stack, i.e., a framework that allows operators and DevOps to manage the life-cycle of both the digital infrastructure and the applications deployed on top of it, throughout the Cloud to IoT continuum. This includes operations such as the initial configuration, but also all the reconfigurations that may be required in response to particular events (maintenance operations, equipment failures, application load variations, user mobility, energy shortages, etc.).
The existing software stacks that have been proposed to manage Cloud Computing platforms are not appropriate for handling the specifics of the next generation of digital infrastructures (in terms of scale, heterogeneity, dynamicity, security threats, and energy opportunities). For example, these infrastructures have to be operated remotely and as automatically as possible, since it will be impossible to have a human presence at every location. Because of the number of sites, it will be necessary to allow operations not on a single site but on sets of sites defined on the fly as needed. Moreover, the management mechanisms must be designed to cope with intermittent network access to the sites, that is, they must offer safety properties on the one hand and, on the other hand, enough autonomy for each site to remain as operational as possible in the event of network partitioning. Finally, existing interfaces (APIs) should be extended to turn location into a first-class citizen. In particular, locality aspects should be reified from the core system building blocks up to the high-level application programming interfaces.
The STACK activities cover the full Cloud to IoT continuum, including recent challenges related to the network dimension and urgent computing. This enlargement of STACK's core activities started with Assoc. Prof. Koutsiamanis, Assoc. Prof. Piamrat and Dr. Balouek, who joined the team in 2021, 2022 and 2023 respectively, and will be further strengthened by the arrival of Orange members expected in 2024.
Through the ongoing integration of Orange members, STACK consolidates its expertise in distributed systems, networks, cyber-physical systems, IoT, device management, and software programming, while combining significant skills in the design, practical development and evaluation of large-scale systems. More precisely, our research activities mainly rely on a set of scientific foundations detailed below.
We aim at strengthening the knowledge in these different areas through two kinds of contributions: first, scientific articles, as a regular project team, and second, concrete pieces of software that can be transferred to major open-source communities.
STACK activities have been focused on the management and programming of geo-distributed data centers with a work program defined around four research topics as depicted in Figure 1a.
The first two are related to the resource management mechanisms and the programming support that are mandatory to operate and use geo-distributed ICT resources (compute, storage, network). They are transversal to the three software layers that generally compose a software stack (System/Middleware/Application in Figure 1b) and nurture each other (i.e., the resource management mechanisms leverage abstractions/concepts proposed by the programming support axis, and reciprocally). The third and fourth research topics are related to the Energy and Security dimensions (both also crosscutting the three software layers). Although they could have been merged with the first two axes, we identified them as independent research directions due to their critical aspects with respect to the societal challenges they represent.
This scientific roadmap to address challenges related to the management and programming of geo-distributed infrastructures applies to the Cloud to IoT continuum and continues to have significant scientific and socio-economic impact. Hence, STACK organizes its activities around these four crosscutting lines that form a unique approach.
Additionally, our activities extend to the management of IoT devices with the ultimate goal of covering the entire Cloud-to-IoT continuum through a common software stack.
Our vision is to base this computing continuum software stack on control loops following the MAPE-K model, which can be seen as an infinite loop that monitors the infrastructure as well as the state of the applications in order to maintain, in an autonomous manner, the expected objectives (in terms of performance, robustness, etc.).
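To make the idea concrete, the following minimal Python sketch illustrates the skeleton of such an autonomic loop; the metric names, thresholds and scaling action are purely hypothetical placeholders, not the actual STACK controllers.

```python
import time

# Illustrative MAPE-K skeleton: metric names, thresholds and the scaling action
# are hypothetical; a real controller would plug in the probes and actuators of
# the targeted orchestrator (Kubernetes, OpenStack, IoT gateways, ...).
knowledge = {"cpu_threshold": 0.8, "history": []}

def monitor():
    """Collect raw observations from the infrastructure and the applications."""
    return {"cpu_load": 0.85, "replicas": 3}          # stubbed values

def analyze(observations, kb):
    """Compare observations against the objectives stored in the knowledge base."""
    kb["history"].append(observations)
    return observations["cpu_load"] > kb["cpu_threshold"]

def plan(observations):
    """Decide on a reconfiguration; here, a naive scale-out rule."""
    return {"action": "scale_out", "replicas": observations["replicas"] + 1}

def execute(decision):
    """Apply the decision through the orchestrator's API (stubbed here)."""
    print(f"applying {decision}")

for _ in range(3):            # a real controller loops indefinitely
    obs = monitor()
    if analyze(obs, knowledge):
        execute(plan(obs))
    time.sleep(1)             # monitoring period
```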
Although the MAPE-K model is widely adopted in Cloud orchestrators such as Kubernetes, delivering a MAPE-K software stack for the computing continuum faces multiple challenges. The first one is related to the diversity of resources to consider. KubeEdge 79, for instance, proposes extensions to integrate servers and IoT devices under the same framework. However, the supported operations are rather limited, as they only cover communication between software components running on servers at the edge and the connected devices. In other words, the IoT devices are not considered in the control loops. From our viewpoint, this weak integration is linked to an incomplete understanding of the needs that such a system must take into account (in particular on the IoT device management side). To favor the integration of both dimensions into a common system, we aim to identify major structural properties as well as management operations necessary at the operator and DevOps levels, and to implement them when needed. A second important challenge is related to the geo-distribution property (and thus the intermittent network connectivity) of this type of infrastructure. This increased complexity implies revising the way control loops are designed in order to handle frequent disconnections that can occur at any time (for instance due to the low energy level of IoT devices). Here, our approach is to combine autonomous loops with well-adapted formal methods to guarantee their verification or to synthesize correct-by-design decisions. Additionally, we study their performance models to be able to adapt Cloud-to-IoT infrastructures and their hosted applications automatically, safely, efficiently and in a timely manner according to different objectives (performance, energy, security, etc.). Finally, we are working towards partitioning a Cloud-to-IoT infrastructure into several areas and delivering the illusion of a single system through a federated approach: each area is managed by an independent controller, and collaborations between areas are done through dedicated middleware. The innovative aspect lies in the way this middleware is developed so that it is reliable in spite of increasing scale and faults (collaborations are triggered only on demand, without maintaining a global knowledge base of the entire infrastructure).
Some of the research questions we address in the medium term are:
All the aforementioned research questions are addressed through several application fields: telecommunications operators and smart buildings in the first place, through the privileged partnership with the Orange colleagues who are going to join this new team, but also health, in particular biomedical research, in order to allow the execution of currently emerging analyses in large-scale geo-distributed environments.
Industrial/Tactile Internet/Cyber-Physical applications highlight the importance of the computing continuum model. Hence, the use-cases of STACK activities are driven and nurtured by these application domains. Besides, it is noteworthy to mention that Telecom operators such as Orange have been among the first to advocate the deployment of Fog/Edge infrastructures. The initial reason is that a geo-distributed infrastructure enables them to virtualize a large part of their resources and thus reduce capital and operational costs. As an example, several researchers have been investigating, through the IOLab, the joint lab between Orange and Inria, how 5G networks can be managed. We highlight that while our expertise does partially include the network side, the main focus is rather on how we can deploy, locate and reconfigure the software components that are mandatory to operate the next generation of network/computing infrastructures. The main challenges are related to the high dynamicity of the infrastructure, the way of defining the Quality of Service of applications, and how it can be guaranteed. We expect our contributions will deliver advances in location-based services, optimized local content distribution (data-caching) and Mobile Edge Computing 2. In addition to bringing resources close to end-users, massively geo-distributed infrastructures should favor the development of more advanced networks as well as mobile services.
Supporting industrial actors and open-source communities in building an advanced software management stack is a key element to favor the advent of new kinds of information systems as well as web applications. Augmented reality, telemedicine and e-health services, smart-city, smart-factory, smart-transportation and remote security applications are under investigation. Although STACK does not intend to address directly the development of such applications, understanding their requirements is critical to identify how the next generation of ICT infrastructures should evolve and what the appropriate software abstractions for operators, developers and end-users are. STACK team members have been exchanging since 2015 with a number of industrial groups (notably Orange Labs and Airbus), a few medical institutes (public and private ones) and several telecommunication operators in order to identify both opportunities and challenges in each of these domains, described hereafter.
The Industrial Internet domain gathers applications related to the
convergence between the physical and the virtual world. This convergence
has been made possible by the development of small, lightweight
and cheap sensors as well as complex industrial physical machines that
can be connected to the Internet. It is expected to improve most
processes of daily life and decision processes in all societal
domains, affecting all corresponding actors, be they individuals and
user groups, large companies, SMEs or public institutions.
The corresponding applications cover: the improvement of business
processes of companies and the management of institutions (e.g., accounting, marketing, cloud manufacturing, etc.); the development of
large “smart” applications handling large amounts of geo-distributed
data and a large set of resources (video analytics, augmented
reality, etc.); the advent of future medical prevention and treatment
techniques thanks to the intensive use of ICT systems, etc.
We expect our contributions to favor the rise of efficient, correct
and sustainable massively geo-distributed infrastructures that are
mandatory to design and develop such applications.
The Internet of Skills is an extension of the Industrial Internet to
human activities. It can be seen as the ability to deliver physical
experiences remotely (i.e., via the Tactile Internet). Its main
supporters advocate that it will revolutionize the way we teach,
learn, and interact with pervasive resources. As most applications of
the Internet of Skills are related to real-time experiences, latency
may be even more critical than for the Industrial Internet, making
the locality of computations and resources a priority. In addition
to identifying how a Utility Computing infrastructure can cope with
this requirement, it is important to determine
how the quality of service of such applications should be defined and how
latency and bandwidth constraints can be guaranteed at the
infrastructure level.
The e-Health domain constitutes an important societal application domain of the two previous areas. The STACK team is investigating distribution, security and privacy issues in the fields of systems and personalized (a.k.a. precision) medicine. The overall goal in these fields is the development of medication and treatment methods that are tailored towards small groups or even individual patients.
We have been working on different projects since the beginning of STACK (e.g., PrivGen CominLabs). In general, we are applying and developing corresponding techniques for the medical domains of genomics, immunobiology and transplantology (see Section 10).
The STACK team continues to contribute to the e-Health domain by harnessing advanced architectures, applications and infrastructures for the Fog/Edge.
Telecom operators have been among the first to advocate the deployment
of massively geo-distributed infrastructure, in particular through
working groups such as the Mobile Edge Computing group at the European
Telecommunications Standards Institute.
The initial reason is that a geo-distributed infrastructure enables
Telecom operators to virtualize a large part of their resources and
thus reduces capital and operational costs. As an example, we are
investigating, through the I/O Lab, the joint lab between Orange and
Inria, how Centralized Radio Access Networks (a.k.a. C-RAN or
Cloud-RAN) can be supported for 5G networks. We highlight that our
expertise is not on the network side but rather on where and how we
can deploy, allocate and reconfigure software components, which are
mandatory to operate a C-RAN infrastructure, in order to guarantee
the quality of service expected by the end-users.
Finally, working with actors from the network community is a valuable
advantage for a distributed system research group such as STACK.
Indeed, achievements made within one of the two communities serve the
other.
Urgent Computing refers to a class of time-critical scientific applications that leverage distributed data sources to facilitate important decision-making in a timely manner. The overall goal of Urgent Computing is to predict the outcome of scenarios early enough to prevent critical situations or to mitigate their negative consequences. Motivating use cases refer to rapid-response scenarios across the Cloud-to-IoT continuum, such as natural disaster management, which require gathering the local state of each device, transforming it into global knowledge of the network, characterizing the observed phenomenon according to an applied model, and, finally, triggering appropriate actions. The STACK team investigates Urgent Computing through the realization of a fluid ecosystem where distributed computing resources and services are aggregated on demand to support delay-sensitive and data-driven workflows.
In addition to international travel, the environmental footprint of our research activities is linked to our intensive use of large-scale testbeds such as Grid'5000 (STACK members are often among the top 10 largest consumers). Although access to such facilities is critical to move forward in our research roadmap, it is important to recognize that they have a strong environmental impact, as described in the next paragraph.
The environmental impact of digital technology is a major scientific and societal challenge. Even though software may look like a virtual object, it is executed on very real hardware that contributes to the carbon footprint. This impact materializes during the manufacturing and destruction of hardware infrastructure (estimated at 45% of digital consumption in 2018 by the Shift Project) and during the software use phase via terminals, networks and data centers (estimated at 55%). STACK members have been studying various approaches for several years to reduce the energy footprint of digital infrastructures during the use phase. The work carried out revolves around two main axes: (i) reducing the energy footprint of infrastructures and (ii) adapting the software applications hosted by these infrastructures according to the energy available. More precisely, this second axis investigates possible improvements that could be made by the end-users of the software themselves. At scale, involving end-users in decision-making processes concerning energy consumption would lead to more frugal Cloud computing.
In 2023, the team has taken part in two Inria Challenges:
Last but not least, STACK started in 2022 a new platform project to design an innovative hardware infrastructure for the scientific study of the cross-cutting issues of computing infrastructures supporting artificial intelligence and their energy autonomy (see the SAMURAI project in Section 10).
Regarding scientific contributions, the team has produced major results on the management of large-scale infrastructures, in particular one publication in the prestigious journal ACM Computing Surveys 9. The team also continued to consolidate its presence in the network community (e.g., in Elsevier Computer Networks 11).
On the software side, the team has pursued its efforts on the development of the EnosLib library and the resulting artifacts to help researchers perform experiment campaigns with direct contributions from several research engineers of the team.
Finally, on the platform side, we continued our effort and took part in the different actions around SLICES and SLICES-FR; see Section 10.
The team has received the 3rd Best paper award at the 16th IEEE/ACM International Conference on Utility and Cloud Computing (UCC 2023) for its paper 24.
Finally, it is noteworthy to mention the new position of Adrien Lebre as the leader of the Cloud, Network and system program at Inria since December 2023 as well as the co-leader of the French Cloud PEPR action.
Enos workflow:
A typical experiment using Enos is a sequence of several phases:
- enos up: Enos reads the configuration file, gets machines from the resource provider and prepares the next phase.
- enos os: Enos deploys OpenStack on the machines. This phase relies heavily on Kolla deployment.
- enos init-os: Enos bootstraps the OpenStack installation (default quotas, security rules, ...).
- enos bench: Enos runs a list of benchmarks. Enos supports Rally and Shaker benchmarks.
- enos backup: Enos backs up the gathered metrics, logs and configuration files from the experiment.
EnOSlib is a library to help you with your distributed application experiments. The main parts of your experiment logic are made reusable by the following EnOSlib building blocks:
- Reusable infrastructure configuration: the provider abstraction allows you to run your experiment on different environments (locally with Vagrant, Grid’5000, Chameleon and more).
- Reusable software provisioning: in order to configure your nodes, EnOSlib exposes different APIs with different levels of expressivity.
- Reusable experiment facilities: tasks help you organize your experimentation workflow.
EnOSlib is designed for experimentation purposes: benchmarking in a controlled environment, academic validation, etc.
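As an illustration, the sketch below shows how a small experiment could be expressed with EnOSlib's Grid'5000 provider. The class and method names follow EnOSlib's provider/task API as we understand it and may differ between versions; the job name and cluster are placeholders.

```python
import enoslib as en

# Minimal EnOSlib experiment sketch on Grid'5000 (names are indicative).
en.init_logging()

# Reusable infrastructure configuration: describe the resources to reserve.
conf = (
    en.G5kConf.from_settings(job_name="stack-demo", walltime="00:30:00")
    .add_machine(roles=["control"], cluster="paravance", nodes=1)
    .add_machine(roles=["compute"], cluster="paravance", nodes=2)
)
provider = en.G5k(conf)
roles, networks = provider.init()        # reserve and provision the nodes

# Reusable software provisioning: run a command on all "compute" nodes.
results = en.run_command("nproc", roles=roles["compute"])
print(results)                           # inspect stdout/stderr per node

provider.destroy()                       # release the reservation
```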
OpenStack is the de facto open-source management system to operate and use Cloud Computing infrastructures. Started in 2012, the OpenStack foundation gathers 500 organizations, including groups such as Intel, AT&T, RedHat, etc. The software platform relies on tens of services with a 6-month development cycle. It is composed of more than 2 million lines of code, mainly in Python, just for the core services. While these aspects make the whole ecosystem evolve quite swiftly, they are also good signs of maturity for this community.
We created and animated, between 2016 and 2018, the Fog/Edge/Massively Distributed (FEMDC) Special Interest Group and have been contributing to the Performance working group since 2015. The former investigates how OpenStack can address Fog/Edge Computing use cases whereas the latter addresses scalability, reactivity and high-availability challenges. In addition to releasing white papers and guidelines 54, the major result from the academic viewpoint is the aforementioned EnOS solution, a holistic framework to conduct performance evaluations of OpenStack (control and data plane). In May 2018, the FEMDC SIG turned into a larger group under the control of the OpenStack foundation. This group gathers large companies such as Verizon, AT&T, etc. Although our involvement has been less important since 2020, our participation is still significant. For instance, we co-signed the second white paper delivered by the edge working group in 2020 55 and are still taking part in the working group, investigating new use-cases as well as the resulting challenges.
Grid'5000 is a large-scale and versatile testbed for experiment-driven research in all areas of computer science, with a focus on parallel and distributed computing including Cloud, HPC and Big Data. It provides access to a large amount of resources: 12000 cores and 800 compute-nodes grouped in homogeneous clusters, featuring various technologies (GPU, SSD, NVMe, 10G and 25G Ethernet, Infiniband, Omni-Path) and advanced monitoring and measurement features for trace collection of networking and power consumption, providing a deep understanding of experiments. It is highly reconfigurable and controllable. STACK members are strongly involved in the management and the supervision of the testbed, notably through the steering committee or the SeDuCe testbed described hereafter.
The SeDuCe Project aims to deliver a research testbed dedicated to holistic research studies on the energy aspects of datacenters. Part of the Grid'5000 Nantes site, this infrastructure is composed of probes that measure the power consumption of each server, each switch and each cooling system, and also measure the temperature at the front and the back of each server. These sensors enable research covering the full spectrum of the energy aspects of datacenters, such as cooling and power consumption depending on experimental conditions.
The testbed is connected to renewable energy sources
(solar panels). This “green” datacenter enables researchers
to perform real experiment-driven studies on fields such as
temperature-based scheduling or “green”-aware software (i.e., software that takes into account renewable energy and weather
conditions).
In 2023, we consolidated the development of PiSeDuCe, a deployment and reservation system for Edge Computing infrastructures composed of multiple Raspberry Pi clusters, started in 2020. Typically, a cluster of 8 Raspberry Pis costs less than 900 euros and only needs an electrical outlet and a Wi-Fi connection for its installation and configuration. Funded by the CNRS through the Kabuto project, and in connection with the SLICES-FR initiative, we have extended PiSeDuCe to propose a device-to-cloud deployment system (from devices on FIT IoT-Lab to servers in Grid’5000). PiSeDuCe and SeDuCe led us to submit the SAMURAI CPER proposal.
In 2022, the SAMURAI (Sustainable And autonoMoUs gReen computing for AI) project was accepted, as a part of the energy and digital transition theme. The project aims at reinforcing an innovative hardware infrastructure for the scientific study of the intersecting problems of computing infrastructure that supports artificial intelligence and its energy autonomy. SAMURAI will extend SeDuCe towards energy autonomy by adding a smart and clean energy storage system. Additionally, it will extend the capabilities of the platform by adding AI computing nodes (GPUs) for the scientific study of AI tools. Finally, it will also add new sensor nodes within the Nantes connected object platform (Nantes nodes of the national FIT IoT-Lab platform) to support future work on embedded AI. In 2023, we built the IoT platform and its network integration with the SeDuCe and G5K platforms.
STACK members are involved in the definition and bootstrapping of the SLICES-FR infrastructure. This infrastructure can be seen as a merge of the Grid'5000 and FIT testbeds with the goal of providing a common platform for experimental Computer Science (Next Generation Internet, Internet of Things, clouds, HPC, big data, etc.). In 2022, STACK contributions were mainly related to SLICES, the related European initiative. In particular, members have taken part in the SLICES-DS project (Design Study) as well as the SLICES-PP project (Preparatory Phase) of the SLICES-RI action.
The evolution of the cloud computing paradigm in the last decade has amplified the access to on-demand services (in an economically attractive, easy-to-use manner, etc.). However, the current model, built upon a few large datacenters (DCs), may not be suited to guarantee the needs of new use cases, notably the boom of the Internet of Things (IoT). To better respond to the new requirements (in terms of delay, traffic, etc.), compute and storage resources should be deployed closer to the end-user, forming, together with the national and regional data centers, a new computing continuum. The question is then how to manage such a continuum to provide end-users the same services that made the current cloud computing model so successful. In 2023, we continued our effort to answer this question and delivered multiple contributions. We also started new activities around Urgent Computing (see Section 4).
The team has continued activities covering the Cloud-to-IoT continuum. First, an architecture that covers the whole continuum has been proposed in 14. It deploys hierarchical federated learning (HFL) to cope with the limitations of traditional FL, which suffers from several issues such as (i) robustness, due to a single point of failure, (ii) latency, as it requires a significant amount of communication resources, and (iii) convergence, due to system and statistical heterogeneity. HFL adds the edge servers as an intermediate layer for sub-model aggregation: several iterations are performed before the global aggregation at the cloud server takes place, thus making the overall process more efficient, especially with non-independent and identically distributed (non-IID) data. Moreover, using traditional Artificial Neural Networks (ANNs) with HFL consumes a significant amount of energy, further hindering the application of decentralized FL on energy-constrained mobile devices. Therefore, this paper presents HFedSNN: an energy-efficient and fast-convergence model obtained by incorporating Spiking Neural Networks (SNNs) within HFL. SNNs are a new generation of neural networks, which promise tremendous energy and computation efficiency improvements. Taking advantage of HFL and SNNs, numerical results demonstrate that HFedSNN outperforms FL with SNN (FedSNN) in terms of accuracy and communication overhead by 4.48% and 26×, respectively. Furthermore, HFedSNN significantly reduces energy consumption by 4.3× compared to FL with ANN (FedANN).
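The efficiency argument rests on performing several cheap aggregations at the edge before each costly aggregation at the cloud. The sketch below illustrates this two-level weighted averaging (hierarchical FedAvg) on plain NumPy parameter vectors; it is a didactic illustration under simplified assumptions, not the HFedSNN implementation of 14.

```python
import numpy as np

def fedavg(models, weights):
    """Weighted average of model parameter vectors (FedAvg)."""
    weights = np.asarray(weights, dtype=float)
    weights /= weights.sum()
    return sum(w * m for w, m in zip(weights, models))

def hierarchical_round(edges, local_update, edge_iters=5):
    """One global round: several edge-level aggregations, then one cloud-level one.
    `edges` maps an edge server to its clients: {edge_id: [(model, n_samples), ...]}."""
    edge_models, edge_sizes = [], []
    for clients in edges.values():
        for _ in range(edge_iters):                        # sub-model aggregation at the edge
            clients = [(local_update(m), n) for m, n in clients]
            agg = fedavg([m for m, _ in clients], [n for _, n in clients])
            clients = [(agg.copy(), n) for _, n in clients]  # push the edge model back to clients
        edge_models.append(agg)
        edge_sizes.append(sum(n for _, n in clients))
    return fedavg(edge_models, edge_sizes)                 # global aggregation at the cloud

# Toy usage: 2 edge servers, 2 clients each, 4-parameter models (hypothetical numbers).
edges = {e: [(np.random.rand(4), 100 + 50 * c) for c in range(2)] for e in range(2)}
global_model = hierarchical_round(edges, local_update=lambda m: m + 0.01)
print(global_model)
```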
Moreover, as the complexity and scale of modern networks continue to grow, the need for efficient, secure management, and optimization becomes increasingly vital. Digital twin (DT) technology has emerged as a promising approach to address these challenges by providing a virtual representation of the physical network, enabling analysis, diagnosis, emulation, and control. The emergence of Software-defined network (SDN) has facilitated a holistic view of the network topology, enabling the use of Graph neural network (GNN) as a data-driven technique to solve diverse problems in future networks. The survey 10 explores the intersection of GNNs and Network digital twins (NDTs), providing an overview of their applications, enabling technologies, challenges, and opportunities. We discuss how GNNs and NDTs can be leveraged to improve network performance, optimize routing, enable network slicing, and enhance security in future networks. Additionally, we highlight certain advantages of incorporating GNNs into NDTs and present two case studies. Finally, we address the key challenges and promising directions in the field, aiming to inspire further advancements and foster innovation in GNN-based NDTs for future networks.
With an orientation specific to 5G networks and cloud computing, in 18 we introduce the Mobile Edge Slice Broker (MESB), a novel solution for orchestrating the deployment and operation of mobile edge slices in multi-cloud environments. We focus on reconciling the challenges faced by mobile edge slice tenants in selecting Cloud Infrastructure Providers (CIPs), ensuring seamless integration of network functions, services, and applications while maintaining quality of service (QoS). Our MESB architectural design enables dynamic monitoring, reconfiguration, and optimization of mobile edge slices to meet evolving requirements. This work represents a first step in enhancing the efficiency and effectiveness of resource management in the convergence of telecom and cloud services in 5G networks.
In line with the previous works in network resource management, in 11 we continue to focus on efficient and secure network management by applying federated learning specifically to 5G cellular traffic prediction. In this work we emphasize the viability of various neural network models, including RNNs and CNNs, in a federated setting for time-series forecasting, while also exploring the effectiveness of different federated aggregation functions in handling non-identically distributed data. Additionally, we demonstrate that preprocessing techniques at local base stations significantly improve prediction accuracy, and that federated learning offers substantial benefits in reducing carbon emissions, computational demands, and communication costs, making it a viable and efficient approach for large-scale and intelligent resource management in network systems.
19 introduces CloudFactory, an IaaS workload generator, as a solution to address the representativeness problem. This tool includes a library for sharing IaaS statistics and a generator for producing realistic VM workloads based on these statistics. CloudFactory is offered as open-source software for use by Cloud providers and researchers to facilitate the evaluation of new contributions. To demonstrate the relevance of CloudFactory, we discuss an analysis of the scheduling evolution for different IaaS workload intensities of two different cloud providers (Microsoft Azure and Chameleon). OVHcloud statistics, computed from CloudFactory, are also reported, and a comparison is made with other Cloud providers.
Continuum Computing encompasses the integration of diverse infrastructures, including cloud, edge, and fog, to facilitate the seamless migration of applications based on their specific needs, ensuring optimal satisfaction of their requirements. In particular, Urgent applications aim at processing distributed data sources while considering the uncertainty of the infrastructure and the timeliness constraints of data-driven workflows. The primary obstacles in this particular context mostly pertain to the inability to promptly respond to changes in the environment or in the quality of service (QoS) constraints of the application, as well as the inability to maintain an application in a stateless manner, hence impeding its relocation without the risk of data loss. This line of work is highly motivated by natural disaster case studies. In 17, we leverage three different platforms to combine smoke detection at the edge using camera imagery, wildfire simulations in the cloud, and finally heavy HPC models for generating pollution concentration maps. The resulting workflow is a prime example of a Cloud-to-IoT continuum use case that monitors and manages the air quality impacts of remote wildfires. In this context, we frame challenges and discuss how the R-Pulsar programming system can support urgent data-processing pipelines that trade off the content of data, the cost of computation, and the urgency of the results.
In 24, we tackle the aforementioned issues through the introduction of a framework based on a Function-as-a-Service (FaaS) and event-driven architecture. This framework enables the decomposition, localization, and relocation of applications inside a Continuum infrastructure, facilitated by a rule engine that is both system- and data-aware. Results highlight the use of the execution history as a basis for determining the optimal location for executing a function, hence guaranteeing the desired Quality of Service (QoS). The evaluation was carried out by simulating a smoke detection use case as a representative scenario for Urgent Computing. We extended this approach in the context of the TEMA Horizon Europe project that aims at addressing Natural Disaster Management 25. We introduced the Urgent Function Enabler (UFE) platform, a fully distributed architecture able to define, spread, and manage FaaS functions, using local IoT data and a computing infrastructure composed of mobile and stable nodes.
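As a simplified illustration of this history-driven placement idea (and not the actual rule engine of 24 or of the UFE platform), one can select where to run a function by comparing its past latencies on each candidate site against the QoS deadline; all site names and numbers below are hypothetical.

```python
from statistics import mean

# Hypothetical execution history: per site, observed end-to-end latencies (s)
# for a given function.
history = {
    "edge-camera-gw": [0.9, 1.1, 1.0],
    "regional-dc":    [0.5, 0.6, 0.7],
    "central-cloud":  [1.8, 2.0, 1.9],
}

def place(function_history, qos_deadline_s):
    """Pick the site whose average past latency best fits the QoS deadline."""
    averages = {site: mean(lat) for site, lat in function_history.items()}
    eligible = {s: l for s, l in averages.items() if l <= qos_deadline_s}
    if not eligible:                                  # no site meets the deadline:
        return min(averages, key=averages.get)        # degrade gracefully
    return min(eligible, key=eligible.get)

print(place(history, qos_deadline_s=1.0))             # -> "regional-dc"
```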
This approach presents significant potential in numerous fields. We are currently exploring a climate risk management framework based on real data from the Danish Meteorological and Hydrological Institutes (DMI and DHI), coupled with road condition (pavement bearing capacity) data from authorities 23. The aim of this project is to enhance a risk assessment model for evaluating pavement structure functionality during extreme weather events.
In 6, we propose a detailed overview of the current state-of-the-art in terms of Fog modeling languages. We relied on our long-term experience in Cloud Computing and Cloud Modeling to contribute a feature model describing what we believe to be the most important characteristics of Fog modeling languages. We also performed a systematic scientific literature search and selection process to obtain a list of already existing Fog modeling languages. Then, we evaluated and compared these Fog modeling languages according to the characteristics expressed in our feature model. As a result, we discuss in this paper the main capabilities of these Fog modeling languages and propose a corresponding set of open research challenges in this area.
The size, complexity, and heterogeneity of Fog systems throughout their life cycle pose challenges and potential costs. To address this, 15 proposes a generic model-based approach called VeriFog, utilizing a customizable Fog Modeling Language (FML) for verifying Fog systems during the design phase. The approach was tested with three use cases from different application domains, considering various non-functional properties for verification (energy, performance, security). Developed in collaboration with the industrial partner Smile, this approach represents a crucial step towards comprehensive model-based support for the entire life cycle of Fog systems.
In the context of the OTPaaS project, we have started to study the anatomy of configuration languages with the objective of unveiling the building blocks of such languages. In particular, we have focused on a very expressive and declarative new configuration language called CUE (CUE stands for Configure Unify Execute). CUE is open source, inspired by previous work at Google. It represents configurations in a way inspired by typed feature structures, a concept inherited from computational linguistics. This is not the first time that a language has been built on top of this concept: LIFE, built in the 80s, revolves around psi-terms, a variant of typed feature structures. Whereas CUE is a domain-specific language, LIFE is a generic language, which combines logic programming, functional programming and object-oriented programming. As such, it potentially provides a general framework to relate configuration as data and computations. We have presented our initial findings on the relationship between CUE and LIFE at CONFLANG '23, a SPLASH workshop 29.
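The core intuition behind typed feature structures, and behind CUE-style configuration merging, is unification: two partial descriptions are combined by filling in missing fields, keeping agreeing values, and failing on conflicts. The following Python sketch captures only this intuition over nested dictionaries; it reflects neither CUE's nor LIFE's actual semantics (no types, constraints or defaults).

```python
class Conflict(Exception):
    """Raised when two feature structures cannot be unified."""

def unify(a, b):
    """Unify two feature structures represented as nested dicts / atomic values.
    Missing fields are filled in, equal atoms are kept, conflicting atoms fail,
    which is the basic intuition behind CUE-style configuration merging."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, value in b.items():
            out[key] = unify(a[key], value) if key in a else value
        return out
    if a == b:
        return a
    raise Conflict(f"cannot unify {a!r} with {b!r}")

# Hypothetical configuration fragments being merged.
base    = {"service": {"replicas": 3, "image": "nginx"}}
overlay = {"service": {"replicas": 3, "port": 80}}
print(unify(base, overlay))
# {'service': {'replicas': 3, 'image': 'nginx', 'port': 80}}
```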
For a few years, the team has been working on deployment and dynamic reconfiguration of distributed software systems through the tool Concerto 42. By offering a programmable lifecycle of resources (services, devices etc.), the team has shown multiple times that the level of parallelism and asynchronism when executing reconfigurations can be increased, thus reducing the execution time of reconfigurations. The accumulated knowledge on component-based distributed software reconfiguration has made possible the publication in 2023 of a survey in ACM Computing Surveys 9, a great success. In particular, this survey analyzes the state of the art on reconfiguration through the spectrum of verification. Indeed, given the complexity of distributed software and the adverse effects of incorrect reconfigurations, a suitable methodology is needed to ensure the correctness of reconfigurations in component-based systems. This survey gives the reader a global perspective over existing formal techniques that pursue this goal. It distinguishes different ways in which formal methods can improve the reliability of reconfigurations, and lists techniques that contribute to solving each of these particular scientific challenges.
In addition to this survey, last year the team worked on the decentralized version of Concerto, namely Concerto-D. Such a decentralized reconfiguration tool is very useful when building global knowledge on the state of the system is difficult, impossible, or very costly because of faults, disconnections or scale. While a publication on Concerto-D itself is not yet available, the usage of Concerto-D has been explored and published through the three use cases described below.
First, Cyber-Physical Systems (CPS) deployed in scarce-resource environments like the Arctic Tundra (AT) face extreme conditions. Nodes deployed in such environments have to carefully manage a limited energy budget, forcing them to alternate long sleeping and brief uptime periods. During uptimes, nodes can collaborate for data exchanges or computations by providing services to other nodes. Deploying or updating such services on nodes requires coordination to prevent failures (e.g., sending new/updated APIs, waiting for service activation/deactivation, etc.). In 20, we study analytically and experimentally to what extent using relay nodes for communications reduces the deployment and update durations (reconfiguration), and which factors influence this reduction. Intuitively, when dealing with sleeping nodes with low chances of uptime overlaps (no uptime synchronization), a coordinated deployment or update takes very long to finish if nodes have to communicate directly. In 21 we evaluate and study nodes' energy consumption during the coordination of reconfiguration tasks according to different CPS configurations (i.e., number of nodes, uptime duration, radio technology, or relay node availability). Results show high energy consumption in scenarios where nodes wake up specifically to deploy/update. It is shown that it is beneficial to perform adaptation tasks while overlapping with existing uptimes (i.e., those reserved for observation activities). This paper also evaluates and studies how nodes' uptime duration and relay node availability influence energy consumption. Increasing the uptime duration can reduce energy consumption by up to 12%.
Second, Concerto-D has been studied and used in the context of the SeMaFoR ANR project. The context of this project is Fog Computing, a paradigm aiming to decentralize the Cloud by geographically distributing computation, storage and network resources as well as related services. This notably reduces bottlenecks and data movement. However, managing Fog resources is a major challenge because the targeted systems are large, geographically distributed, unreliable and very dynamic. Cloud systems are generally managed via centralized autonomic controllers automatically optimizing both application QoS and resource usage. To enable the self-management of Fog resources, we propose to orchestrate a fleet of autonomic controllers in a decentralized manner, each with a local view of its own resources. In 13, we present our SeMaFoR (Self-Management of Fog Resources) vision that aims at collaboratively operating Fog resources. SeMaFoR is a generic approach made of three cornerstones: an Architecture Description Language for the Fog, a collaborative and consensual decision-making process, and an automatic coordination mechanism for reconfiguration (based on Concerto-D).
Finally, the usage of Concerto-D has been studied in the particular case of Urgent Computing. This field considers sudden situations of danger such as climate disasters, fires and air pollution, which occur without warning, possibly within a short time frame, and put people's lives in danger. In such situations, applications that help resolve the situation have to be dynamically deployed, or existing applications have to be updated in a short period of time and according to new, unanticipated constraints. The study in 16 provides a justification for the necessity of dynamic adaptation in applications that are time-sensitive. Furthermore, we present our viewpoint on time-sensitive applications and undertake a thorough analysis of the underlying principles and challenges that need to be resolved in order to accomplish this goal.
Federated Learning (FL) collaboratively trains machine learning models on the data of local devices without having to move the data itself: a central server aggregates models, with privacy and performance benefits but also scalability and resilience challenges. We have presented FDFL (de Lacour et al.), a new fully decentralized FL model and architecture that improves standard FL scalability and resilience with no loss of convergence speed. FDFL provides an aggregator-based model that enables scalability benefits and features an election process to tolerate node failures. Simulation results show that FDFL scales well with network size in terms of computing, memory, and communication compared to related FL approaches.
The activities on this axis are mainly related to the design, development and deployment of the SAMURAI project (Section 7.2.5), a testbed that will allow researchers to investigate energy-related challenges over the computing continuum (from the Cloud to IoT devices/cyber-physical systems).
Additionally, in 22, we aim to address the sustainability of machine learning (ML) models in the context of cellular traffic forecasting, specifically in the emerging 5G networks. We introduce a novel sustainability indicator to assess the trade-off between energy consumption and predictive accuracy of deep learning (DL) models in a federated learning (FL) framework. We then use it to demonstrate that while larger, more complex models offer slightly better accuracy, their significantly higher energy consumption and environmental impact make simpler models more viable for real-world applications, highlighting the importance of energy-aware computing in the development of distributed AI technologies for network management.
This year the STACK team has provided new results on security and privacy issues in the networking and biomedical domains. We have developed new AI-based methods that support new means of property analysis in both domains.
The rapid development of network communication along with the drastic increase in the number of smart devices has triggered a surge in network traffic, which can contain private data and in turn affect user privacy. Recently, Federated Learning (FL) has been proposed in Intrusion Detection Systems (IDS) to ensure attack detection, privacy preservation, and cost reduction, which are crucial issues in traditional centralized machine-learning-based IDS. However, FL-based approaches still exhibit vulnerabilities that can be exploited by adversaries to compromise user data. At the same time, meta-models (including blending models) have been recognized as one of the solutions to improve generalization for attack detection and classification, since they enhance generalization and predictive performance by combining multiple base models. Therefore, we have proposed in 7 a Federated Blending model-driven IDS framework for the Internet of Things (IoT) and Industrial IoT (IIoT), called F-BIDS, in order to further protect the privacy of existing ML-based IDS. It consists of a Decision Tree (DT) and a Random Forest (RF) as base classifiers that first produce the meta-data. Then, the meta-classifier, which is a Neural Network (NN) model, uses the meta-data during the federated training step and finally makes the final classification on the test set. Specifically, in contrast to classical FL approaches, the federated meta-classifier is trained on the meta-data (composite data) instead of user-sensitive data to further enhance privacy. To evaluate the performance of F-BIDS, we used the most recent open cyber-security datasets, called Edge-IIoTset (published in 2022) and InSDN (published in 2020). We chose these datasets because they are recent and contain a large amount of network traffic, including both malicious and benign traffic.
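To give the flavor of the blending stage only (F-BIDS trains the meta-classifier in a federated manner, which the sketch below does not reproduce), the following centralized Python sketch builds DT and RF base classifiers, derives meta-data from their predicted class probabilities, and trains a small neural network on that meta-data. The dataset is synthetic and, for brevity, the base and meta models share the same training split, whereas a real blending setup would use held-out predictions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Toy stand-in for network traffic records (benign vs. malicious).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Base classifiers trained on the (locally held) raw traffic.
base_models = [DecisionTreeClassifier(random_state=0),
               RandomForestClassifier(n_estimators=50, random_state=0)]
for model in base_models:
    model.fit(X_train, y_train)

def meta_features(X):
    # The "meta-data": per-class probabilities of each base model, not raw data.
    return np.hstack([m.predict_proba(X) for m in base_models])

# Meta-classifier (a small neural network) trained on the meta-data only;
# in F-BIDS this training step is the one performed in a federated fashion.
meta_clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
meta_clf.fit(meta_features(X_train), y_train)
print("blended accuracy:", meta_clf.score(meta_features(X_test), y_test))
```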
In Wilmer Garzon's PhD thesis 28, he has shown that researchers require new tools and techniques to address the restrictions and needs of global scientific collaborations over geo-distributed biomedical data. He has identified three kinds of constraints related to global collaborations, namely technical, legal, and socioeconomic constraints.
He has proposed Fully Distributed Collaborations (FDC), which are research endeavors that provide means to analyze massive biomedical information collaboratively while respecting legal and socioeconomic restrictions.
A new random-forest-based AI algorithm, MuSiForest, improves over other existing AI approaches by reducing computation time and the amount of shared data, while having a training precision close to that of centralized random forest techniques.
Finally, he has proposed FeDeRa, a language to specify, deploy, and execute FDC-compliant multi-site scientific workflows.
The ArchOps 2 Chair (for Architecture, Deployment and Administration of Agile IT Infrastructures) is an industrial chair of IMT Atlantique, in partnership with Kelio, an SME specialized in solutions for time and attendance management. It is dedicated to all IMT Atlantique students in the field of IT. It is also a channel for the transfer of high-level skills between researchers, experts and industry.
In 2023, several activities were conducted, such as a hackathon with 100 participants, an event promoting the integration of international students at IMT Atlantique and preliminary discussions on the topic of a joint PhD thesis with the Stack team.
In 2020, during the preparation of the ANR SeMaFoR project, we started a cooperation with Alterway/Smile, an SME specialized in Cloud and DevOps technologies. This cooperation resulted in a joint PhD thesis (called Cifre) entitled "A model-based approach for dynamic, multi-scale distributed systems" started in Nov. 2021.
The first publication to emerge from this collaboration will be presented at next year's SAC conference 15.
In 2021, Inria and OVHcloud signed an agreement to jointly study the problem of a more frugal Cloud. They identified three axes: (i) Software eco-design of Cloud services and applications; (ii) Energy efficiency leverages; (iii) Impact reduction and support for Cloud users. The Stack team obtained a PhD grant and a 24-month post-doc grant.
Pierre Jacquet started his PhD in October 2021, under a co-supervision with the Spirals team with the subject "Fostering the Frugal Design of Cloud Native Applications". Pierre presented a poster, ”Sampling Resource Requirements to Optimize Virtual Machines Sizing”, at the EuroSys conference in April 2022.
Yasmina Bouziem started her postdoc in September 2022 for two years, under a co-supervision with the Avalon team in Lyon with the subject "pulling the energy cost of a reconfiguration execution within reconfiguration decisions".
Since 2022, Orange Labs and the Stack team have launched several PhD grants.
Dan Freeman Mahoro initiated a new collaboration on the theme of "digital twins". This cooperation led to a joint doctoral thesis entitled "Digital twins of complex cyber-physical systems", which began in April 2022, co-supervised with the Orange team in Rennes, and resulted in a first publication 26. However, Dan Freeman Mahoro stopped his PhD in October for several reasons, preferring to join industry as an engineer.
Paul Bori started his PhD in January 2023, with the subject "Container application security: a programmable OS-level approach to monitoring network flows and process executions".
Duc-Thinh Ngo started his PhD in December 2022, on the subject "Dynamic graph learning algorithms for the digital twin in edge-cloud continuum", under a co-supervision with the Orange team in Rennes.
Divi de Lacour started his PhD in January 2022, on the subject "Architecture et services pour la protection des données pour systèmes coopératifs autonomes" (architecture and services for data protection in autonomous cooperative systems), under a co-supervision with the Orange team in Chatillon (Paris south region).
Samia Boutalbi started her PhD in January 2022, on the subject "Secure deployment of micro-services in a shared Cloud RAN/MEC environment", under a co-supervision with the Ericsson team in Paris.
SLICES-RI (Research Infrastructure), which was recently included in the 2021 ESFRI roadmap, aims at building the large infrastructure needed for experimental research on various aspects of distributed computing, networking, IoT and 5G/6G networks. It will provide the resources needed to continuously design, experiment, operate and automate the full lifecycle management of digital infrastructures, data, applications, and services. Based on the two preceding projects within SLICES-RI, SLICES-DS (Design Study) and SLICES-SC (Starting Community), the SLICES-PP (Preparatory Phase) project will validate the requirements to engage into the implementation phase of the RI lifecycle. It will set the policies and decision processes for the governance of SLICES-RI: i.e., the legal and financial frameworks, the business model, the required human resource capacities and training programme. It will also settle the final technical architecture design for implementation. It will engage member states and stakeholders to secure the commitment and funding needed for the platform to operate. It will position SLICES as an impactful instrument to support European advanced research, industrial competitiveness and societal impact in the digital era.
The involvement of the group is rather low if we consider the allocated budget (4K€), but the group is strongly involved at the national level, taking part in different discussions related to SLICES-FR, the French node of SLICES.
Decentralised systems face challenges from sophisticated cyber-attacks that evolve and propagate to disrupt different parts. Additionally, communication overhead makes implementing authentication and access control complex. Existing approaches are unlikely to provide effective access control and multi-stage attack detection due to limited event capture and information correlation. This project offers a framework to improve the security and privacy of decentralised systems through cross-domain access control, collaborative intrusion detection, and dynamic risk management considering resource consumption. It facilitates subsystem collaboration to prevent widespread disruption from attacks and to share threat awareness. The project will develop methods and prototypes utilizing blockchain, federated learning, and multi-agent architectures to enhance access control, detection, risk management, and response capabilities.
The consortium is composed of four partners: Nantes University, LUT University (Finland), Universidad de Castilla - La Mancha (Spain), and Firat University (Turkey). LUT is the coordinator of the project.
DI4SPDS was accepted in July 2023 and will run for 36 months from March 2024, with an allocated budget of 874k€ (230k€ for Stack).
Fog Computing is a paradigm that aims to decentralize the Cloud at the edge of the network to geographically distribute computing/storage resources and their associated services. It reduces bottlenecks and data movement. But managing a Fog is a major challenge: the system is larger, unreliable, highly dynamic and does not offer a global view for decision making. The objective of the SeMaFoR project is to model, design and develop a generic and decentralized solution for the self-management of Fog resources.
The consortium is composed of three partners: LS2N-IMT Atlantique (Stack, NaoMod, TASC), LIP6-Sorbonne Université (Delys), and Alter way/Smile (SME). The Stack team supervises the project. SeMaFoR is running for 42 months (starting in March 2021) with an allocated budget of 506k€ (230k€ for Stack). See the SeMaFoR web site for more information.
The main results of the year 2023 were the survey "A feature-based survey of Fog modeling languages" 6 in the FGCS journal and a position paper 1 in the SEAMS conference ranked 'A'.
Large dataset transfers from one datacenter to another are still an open issue. Currently, the most efficient solution is the exchange of a hard drive with an express carrier, as proposed by Amazon with its SnowBall offer. Recent evolutions regarding datacenter interconnects announce bandwidths from 100 to 400 Gb/s. The contention point is not the network anymore, but the applications, which centralize data transfers and do not exploit the parallelism capacities of datacenters, which include many servers (and especially many network interface cards, NICs). The PicNic project addresses this issue by allowing applications to exploit the network cards available in a datacenter, remotely, in order to optimize transfers (hence the acronym PicNic). The objective is to design a set of system services for massive data transfer between datacenters, exploiting the distribution and parallelization of network flows.
The consortium is composed of several partners: Laboratoire d'Informatique du Parallélisme, Institut de Cancérologie de l’Ouest / Informatique, Institut de Recherche en Informatique de Toulouse, Laboratoire des Sciences du Numérique de Nantes, Laboratoire d'Informatique de Grenoble, and Nutanix France.
PiCNiC will be running for 42 months (starting in Sept 2021 with an allocated budget of 495k€, 170k€ for STACK).
The OTPaaS project targets the design and development of a complete software stack to administrate and use edge infrastructures for the industry sector. The consortium brings together national and user technology suppliers from major groups (Atos / Bull, Schneider Electric, Valeo) and SMEs / ETIs (Agileo Automation, Mydatamodels, Dupliprint, Solem, Tridimeo, Prosyst, Soben), with a strong support from major French research institutes (CEA, Inria, IMT, CAPTRONIC). The project started in October 2021 for a period of 36 months with an overall budget of 56M€ (1.2M€ for STACK).
The OTPaaS platform objectives are:
The SAMURAI (Sustainable And autonoMoUs gReen computing for AI) infrastructure aims to design an innovative hardware infrastructure for the scientific study of the cross-cutting issues of computing infrastructures supporting artificial intelligence and their energy autonomy.
This project paves the way toward a larger infrastructure at the national level in the context of the SLICES-FR initiative.
The project started in 2022 for a period of 5 years with an overall budget of 730K€ (500K€ for STACK).
SysMics is an integrated cluster of research that is part of the Nantes Excellence Initiative in Medicine and Engineering. Its main objective is the development of new methods for precision medicine, in particular based on genomic analyses. In this context, we have worked on new large-scale distributed biomedical analyses and provided several results on how to distribute popular statistical analyses, such as FAMD-based and EM-based analyses.
A joint collaboration between Inria and the OVHcloud company on the challenge of a frugal cloud was launched in October 2021 with a budget of 2 M€. It addresses several scientific challenges on the eco-design of cloud frameworks and services for large-scale energy and environmental impact reduction, across three axes: i) Software eco-design of services and applications; ii) Efficiency leverages; iii) Reducing the impact and supporting users of the Cloud.
The joint challenge between Inria and Qarnot computing is called PULSE, for "PUshing Low-carbon Services towards the Edge". It aims to develop and promote best practices in geo-distributed hardware and software infrastructures for more environmentally friendly intensive computing.
The challenge is structured around two complementary research axes to address this technological and environmental issue:
Axis 1: "Holistic analysis of the environmental impact of intensive computing".
Axis 2: "Implementing more virtuous edge services"
The STACK group is mainly involved in the second axis, addressing the challenges related to data management.
As a team mainly composed of Associate Professors and Professors, the amount of teaching activities is significant (around 150 hours per person). We present here only the management activities.
Thomas Ledoux was co-leader of the Ecolog initiative. This national project is the result of an association between the Institut du Numérique Responsable (INR) and the IT sector in Nantes, which includes the University of Nantes, IMT Atlantique and Centrale Nantes. Its ambition was to create a body of training courses available in open source and dedicated to sustainable computing. A webinar presenting the results was held on March 23, 2023 (ecolog.isit-europe.org).
Thomas Ledoux took part in a round table related to digital sustainability/green computing organized by ADN Ouest (Nov. 14, 2023).