Machine-Learning Architecture

urn:js:virtue:aspire:proposal:12.1

TL;DR

The aim of this proposal is to get agreement that the Rendezvous Architecture will be the standard architecture design for Machine Learning Models.

Rationale

Machine Learning (ML) is a type of Artificial Intelligence in which systems use algorithms or models to learn from data, identify patterns, and make decisions or predictions with minimal human intervention.

A single ML use case typically requires many different models, e.g. Champion and Challenger Models, and these must be retrained constantly on Production or Production-like data. All of these models also need continuous benchmarking, monitoring and Production hand-over.

Machine Learning Operations (MLOps) is concerned with deploying, managing, monitoring and governing ML Models in a Production environment in a scalable, efficient way.

For us to embrace MLOps, we need a Machine Learning architecture that supports it. The Rendezvous Architecture is one solution that enables this by stipulating standards for the flow of data for hosted algorithms.

The aim of this proposal is therefore to get agreement that the Rendezvous Architecture will be the standard architecture design for Machine Learning Models.

Rendezvous Architecture for Machine Learning Models

Rendezvous Architecture

The Rendezvous Architecture, proposed by Ted Dunning and Ellen Friedman, is an architecture design that allows multiple Machine Learning Models to operate in a Production (Real) environment while still supporting existing Service Level Agreements (SLAs) and Business as Usual processes.

Discrete Response System

Traditionally, ML Models are designed in the following way:

All the information needed to make a decision is passed to the model which then responds with a decision synchronously.
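As a minimal sketch of this request/response pattern: the function names and the scoring rule below are hypothetical and chosen only for illustration. The caller blocks on exactly one model, and the request is not retained anywhere for other models to score later.

```python
def champion_model(features):
    # Hypothetical scoring rule: approve when the mean feature value is >= 0.5.
    return {"approve": sum(features) / len(features) >= 0.5}


def handle_request(features):
    # The caller waits synchronously for this single model to answer.
    # The request is not stored, so no other model ever sees the same data.
    return champion_model(features)


decision = handle_request([0.25, 0.75])
```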

This design has several issues:

We cannot run and compare multiple models on exactly the same data
It is difficult to manage model lifecycles while still meeting Production SLAs
We cannot easily guarantee system reliability
It is difficult to address speed and stability requirements

The Rendezvous Architecture is an alternative to this system and attempts to address these issues.

Defining the Rendezvous Architecture

Input

Data is collected at scale as an Input Stream. "Stream" in this context refers to the data flow; its technical implementation need not be a stream and could be a message bus or similar. The point is to decouple data collection from model processing. Streams allow Consumers (Models) to pull data as required. Streams also provide a persistent store of all requests, which can be distributed to multiple models in parallel.
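The two properties described above (persistence and independent pull by each consumer) can be sketched with an in-memory append-only log. This is an illustrative assumption, not a prescribed implementation: the InputStream class, its method names and the request fields are all hypothetical, and a real deployment would use a message bus or similar.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class InputStream:
    """Append-only log of requests; each consumer keeps its own read offset."""

    _log: list[dict[str, Any]] = field(default_factory=list)
    _offsets: dict[str, int] = field(default_factory=dict)

    def publish(self, request: dict[str, Any]) -> None:
        # Requests are only ever appended, so the full history is preserved.
        self._log.append(request)

    def pull(self, consumer: str, max_items: int = 10) -> list[dict[str, Any]]:
        # Each consumer reads from its own offset, independently of the others.
        start = self._offsets.get(consumer, 0)
        batch = self._log[start:start + max_items]
        self._offsets[consumer] = start + len(batch)
        return batch


stream = InputStream()
stream.publish({"request_id": "r1", "features": [0.2, 0.7]})
stream.publish({"request_id": "r2", "features": [0.9, 0.1]})

# Both models receive exactly the same requests, each at its own pace.
champion_batch = stream.pull("champion")
challenger_batch = stream.pull("challenger")
```

Because reading does not remove anything from the log, a model added later can still pull the full request history from offset zero.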

Models

The Rendezvous Architecture supports running multiple models in parallel, e.g.:

Champion and Challenger Models (https://www.datarobot.com/blog/introducing-mlops-champion-challenger-models/)
Decoy (Archive Input without Results), Canary (Baseline) and Real Models (https://dzone.com/articles/deploying-machine-learning-and-ai-in-the-real-worl)
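To illustrate the Decoy and Canary roles, here is a small sketch. The class names, the handle method and the canary's scoring rule (mean of the features) are hypothetical placeholders; only the roles themselves (a decoy archives input without producing results, a canary provides a baseline) come from the description above.

```python
class DecoyModel:
    """Archives every request it receives and never emits a score."""

    def __init__(self):
        self.archive = []

    def handle(self, request):
        self.archive.append(request)
        return None  # a decoy produces no result


class CanaryModel:
    """Fixed baseline model. The scoring rule (mean of the features) is a
    placeholder assumption used only for this sketch."""

    def handle(self, request):
        features = request["features"]
        return {
            "request_id": request["request_id"],
            "model": "canary",
            "score": sum(features) / len(features),
        }


request = {"request_id": "r1", "features": [0.25, 0.75]}
decoy, canary = DecoyModel(), CanaryModel()
decoy_result = decoy.handle(request)
canary_result = canary.handle(request)
```

The decoy's archive records exactly what the models saw, and the canary's scores give a stable baseline against which other models can be compared.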

Scores

Model Scores are treated as an Input Stream for the Rendezvous Function. This decouples the response to the original request from the Model output.

Rendezvous Function

The Rendezvous Function is responsible for selecting the model result with the highest relevance in accordance with the SLA and timeout policies. It subscribes to the original scoring requests and combines them with the model scores from the model response stream.
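One possible selection policy is sketched below, under stated assumptions: "relevance" is modelled as a fixed preference order over model names, the SLA as a single latency budget in milliseconds, and a fallback value is returned when no model answers in time. The Score type, the rendezvous function and all parameter names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    request_id: str
    model: str
    value: float
    latency_ms: int  # time between the request and the arrival of this score


def rendezvous(request_id, scores, preference, sla_ms, fallback):
    """Return the score of the most preferred model that answered within the SLA."""
    on_time = {
        s.model: s
        for s in scores
        if s.request_id == request_id and s.latency_ms <= sla_ms
    }
    for model in preference:  # most preferred model first
        if model in on_time:
            return on_time[model]
    # No model met the SLA: answer with the fallback so the SLA is still honoured.
    return Score(request_id, "fallback", fallback, sla_ms)


scores = [
    Score("r1", "champion", 0.91, 180),
    Score("r1", "challenger", 0.88, 40),
    Score("r2", "champion", 0.10, 20),
]
preference = ["champion", "challenger"]

within_budget = rendezvous("r1", scores, preference, sla_ms=200, fallback=0.5)
tight_budget = rendezvous("r1", scores, preference, sla_ms=100, fallback=0.5)
no_answer = rendezvous("r1", scores, preference, sla_ms=10, fallback=0.5)
```

With a 200 ms budget the champion's score is used. With 100 ms the champion is too slow and the challenger's score is returned. With 10 ms neither model is on time and the fallback is returned, so the response time is governed by the policy rather than by any single model.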

Results

The most relevant result, as selected by the Rendezvous Function, is provided as a response stream back to consumers.

Advantages of the Rendezvous Architecture

Manage multiple models in parallel
Improved comparison between models and against a baseline model
Collect and preserve raw data at scale
Rapid deployment of new models
Decoupling of SLAs from the Model

Implications

None