MLOps builds on the ideas of DevOps. It is mainly about automation, requires a solid foundation of mature tools and methods, and depends on interoperability and communication on several levels (technical, semantic, organizational).

Figure: IML4E innovations and their relation to MLOps (Fraunhofer FOKUS)

While basic MLOps techniques have now passed the first peak in the hype cycle, experts assume that MLOps will reach the plateau phase in 2 to 5 years. According to Gartner, “small and wide data and ModelOps, along with AI trust, risk and security management (AI TRiSM), are emerging trends that are being used to build a resilient enterprise”. Considering the state of MLOps worldwide, IML4E innovates in the following areas:

High-quality and interoperable data preparation infrastructures for trustworthy ML

The IML4E project will explore solutions based on real-life use cases, which provide hands-on experience and pose challenges that lead to novel research directions. IML4E will develop data engineering techniques, especially for edge AI and hybrid approaches, where the environment and tooling differ from those of traditional cloud environments. Examples of innovations include robust ways to detect or handle problems in incoming IoT data; ways to detect concept drift in data; ways to assess the reliability of ML inferences by combining pre- and postprocessing of ML model data; extending the use of AutoML techniques for optimal preprocessing and cleaning of dirty data; and integrating appropriate tool support into the ML pipelines. Overall, we expect IML4E to increase the level of automation in data preparation by 30% and to accelerate the related processes by 20%.
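To make the idea of drift detection concrete, the following is a minimal sketch and not an IML4E tool. It assumes drift is monitored by comparing one numeric feature in a reference window against a recent window with a two-sample Kolmogorov-Smirnov test. Strictly speaking, this detects a shift in the input distribution, which is often used as a proxy signal for concept drift. The function names, the window contents, and the significance level of 0.05 are illustrative assumptions.

```python
import math
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # The largest gap is always reached at one of the observed values.
    for value in ref + cur:
        cdf_ref = bisect_right(ref, value) / len(ref)
        cdf_cur = bisect_right(cur, value) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds the large-sample critical value."""
    n, m = len(reference), len(current)
    # Asymptotic critical value: c(alpha) * sqrt((n + m) / (n * m)),
    # where c(alpha) = sqrt(-0.5 * ln(alpha / 2)).
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    threshold = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > threshold


if __name__ == "__main__":
    # Hypothetical sensor readings: a training-time window and a shifted live window.
    reference_window = [i / 100 for i in range(100)]
    live_window = [0.5 + i / 100 for i in range(100)]
    print("drift:", drift_detected(reference_window, live_window))
```

In a pipeline, a check of this kind would typically run per feature on a schedule, with a positive result triggering an alert or a retraining job. The asymptotic threshold is only a reasonable approximation for windows that are not too small.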

Scalable MLOps techniques and tools for critical application domains 

The IML4E project will foster reuse and automation during all steps of the MLOps life cycle. For classical software, development accounts for only about 18% of software lifetime costs, while maintenance (67%) and testing (15%) dominate. It is too early to say whether these numbers apply exactly to ML solutions, but their operation, monitoring, and maintenance are likely to need a lot of resources during the system lifetime. IML4E aims to cut those costs significantly. Measured by the number of service interruptions during operation that are caused by failures in ML-based smart software solutions, we expect to reduce the time of service interruption by 10 to 20% in comparable settings.

IML4E aims to cut the development costs of ML systems through novel ways to reuse ML models, based on judicious use of transfer learning and on AutoML approaches that (semi-)automatically find the best models for each case. Based on this, we expect that the degree of reuse can be increased by 35% for artefacts in the ML life cycle. In addition, we assume that reusing ML artefacts can reduce the number of errors in smart software by 20%. In the longer term, we expect this to reduce the cost of ML-based smart software by 15% and to shorten time to market by 10 to 25%. Merging the MLOps and DevOps aspects in the production of systems that combine multiple ML and classical software modules is not yet well researched or developed. IML4E is also extending MLOps thinking from cloud-based solutions to edge AI, with scalable solutions for managing a large number of deployments. Finally, optimizing the use of computational resources with advanced model engineering can lead to major savings in computing resources, which also benefits the environment by cutting the electricity consumption of ML development.

MLOps Methodology 

IML4E will contribute to the skills and knowledge needed, because the roles of data scientist and software developer are increasingly merging in ML system development. It also aims to improve expert efficiency in ML system development and operation, resulting in more cost-efficient solutions and wider adoption of ML-based smart software solutions. Considering the industrial project partners and measuring the total number of smart software products and services that are developed, we expect the case study partner to add 1 to 3 new smart software products or smart services, and the technology provider to add 1 to 2 new products and services.

An important aspect is the end-to-end operation of an ML solution and the requirements that platforms must meet to support continuous development and integration of ML-based solutions. The methodology will also consider the lifetime support needed for ML systems that must remain operational for multiple years (or even decades). Finally, product family thinking for ML-based solutions requires new ways of scaling the development processes via reuse of models and software, and via easy creation of variants of development pipelines.

Experimentation and training platform as well as pre-standardization work

IML4E will develop competencies for efficient, replicable, industrial-scale development and operation of ML systems, which are vital for European industry to succeed in international competition. The project will deliver a large part of its experimental tools as open source, which allows the wider community to benefit from the project results. While US companies dominate cloud-based ML platforms, distributed and edge AI in particular offer an opportunity to benefit from Europe’s strong telecom competence. Moreover, for privacy and regulatory reasons, solutions are needed that can run on the customer’s own hardware, for example in public administration, health care, and other sensitive domains.