Introducing MLOps: How to Scale Machine Learning in the Enterprise

Name: Introducing MLOps: How to Scale Machine Learning in the Enterprise
Rating: 3.47 (21 reviews)
ISBN: 9781492083290

Mark Treveil, CL Stenac, L Dreyfus-Schmidt

More than half of the analytics and machine learning (ML) models created by organizations today never make it into production. Some of the challenges and barriers to operationalization are technical, but others are organizational. Either way, the bottom line is that models not in production can't provide business impact.

This book introduces the key concepts of MLOps to help data scientists and application engineers not only operationalize ML models to drive real business change but also maintain and improve those models over time. Through lessons based on numerous MLOps applications around the world, nine experts in machine learning provide insights into the five steps of the model life cycle--Build, Preproduction, Deployment, Monitoring, and Governance--uncovering how robust MLOps processes can be infused throughout.

This book helps you:

Fulfill data science value by reducing friction throughout ML pipelines and workflows
Refine ML models through retraining, periodic tuning, and complete remodeling to ensure long-term accuracy
Design the MLOps life cycle to minimize organizational risks with models that are unbiased, fair, and explainable
Operationalize ML models for pipeline deployment and for external business systems that are more complex and less standardized

Genres: Technical, Technology, Nonfiction, Computer Science, Artificial Intelligence, Programming

183 pages, Paperback

Published January 5, 2021

53 people are currently reading

239 people want to read

About the author

Mark Treveil

3 books

Community Reviews

5 stars: 24 (18%)
4 stars: 40 (30%)
3 stars: 43 (32%)
2 stars: 21 (16%)
1 star: 3 (2%)

Displaying 1 - 21 of 21 reviews

danielle; ▵

428 reviews, 1 follower

September 29, 2021

MLOps is boring.

Instead of reading this book, think about your life and the choices you have made. How did you get here? Are you happy?

Miguel

106 reviews, 6 followers

May 22, 2022

I'm learning MLOps now and I thought this was a pretty good intro. There were some parts I already knew, which I skimmed over, but I'd recommend it if you're looking for an overview.

software-engineering

Giulio Ciacchini

381 reviews, 14 followers

December 7, 2022

A very insightful introduction to the realm of MLOps.
This is a very young component of data science projects, but it is very exciting and has a lot of potential.
The key is already in the name, which resembles DevOps, the practice that streamlines software changes and updates.
Long gone (or they should be) are the times when data scientists developed and tested machine learning models on their own machines; nowadays we should have a robust and organised way to deploy new models.
The core concepts borrowed from DevOps are robust automation, collaboration between teams, and an end-to-end service life cycle that prioritises continuous delivery and high quality.
However, the fundamental difference between MLOps and DevOps is that deploying software code into production is different from deploying machine learning models into production, not least because data is always changing and the models are constantly learning and adapting to new inputs.

Scrolling through other users' reviews, I was surprised to find some of them complaining about a lack of detail or deep dives, like this review: "Very general introduction to the subject with a nice overview of the field but not much technical depth."
While this is certainly true, one should understand the purpose of the text from the title and length of the book.
This is an introduction to the field of machine learning operations, and I have to say that it does a great job at that.
If anyone expects to find an advanced workbook on this topic, that is not the book's fault; it is the reader's expectations that are set wrongly.

NOTES
MLOps is the standardization and streamlining of machine learning life cycle management.
Actors: Data scientists; data engineers; software engineers; DevOps; ML architects.
Machine learning is the science of computer algorithms that automatically learn and improve from experience rather than being explicitly programmed.

The model can be deployed as model-as-a-service (a live-scoring model), deployed into a simple framework that provides a REST API endpoint responding to requests in real time, or as an embedded model, packaged into an application which is then published.
Data scientists’ concerns: how soon the model should be retrained when its performance is degrading; check ground truth or input drift.
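
To make the model-as-a-service option above concrete, here is a minimal sketch using only the Python standard library. The weights, the feature names (`age`, `income`) and the port are all made up for illustration; a real deployment would load a trained model and would normally sit behind a production-grade web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical "trained model": a fixed linear formula standing in for a real one.
WEIGHTS = {"age": 0.02, "income": 0.00001}
BIAS = -0.5


def score(record: dict) -> float:
    """Apply the formula to one input record (missing features count as 0)."""
    return BIAS + sum(w * float(record.get(name, 0.0)) for name, w in WEIGHTS.items())


class ScoringHandler(BaseHTTPRequestHandler):
    """Answers POST requests carrying one JSON record with a JSON score."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        record = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps({"score": score(record)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Blocks and serves scoring requests until interrupted."""
    HTTPServer(("127.0.0.1", port), ScoringHandler).serve_forever()
```

Calling `serve()` starts the endpoint; the embedded alternative would simply import `score` into the application and call it directly.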

A machine learning model is a projection of reality: a partial and approximate representation of some aspects of a real thing or process. Once trained, it boils down to a mathematical formula that yields a result when fed some inputs.

ML model risks: bugs in the runtime framework; low quality of training data; a large difference between production data and training data; misinterpretation of the outputs.

Reproducibility involves the ability to easily rerun the exact same experiment. The model should come with detailed documentation, the data used for training and testing, and an artefact that bundles the implementation of the model with the specification of the environment it was run in.

Continuous integration and continuous delivery (CI/CD) pipelines embody the modern philosophy of agile software development: build the model, deploy to a test environment, deploy to the production environment.
The goal is to avoid unnecessary effort in merging the work of several contributors, as well as to detect bugs or development conflicts as soon as possible.
The most common version control system is Git, but it was not designed to store other types of assets common in data science workflows, such as large binary files or the data itself.
An ML artifact is a testable and deployable bundle of the project with all these elements:
- code for the model and its preprocessing
- hyperparameters and configuration
- training and validation data
- trained model in its runnable form
- an environment including libraries with specific versions
- documentation
- code and data for testing scenarios
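
One way to picture the bundle listed above is as a manifest that records where each element lives. The sketch below is hypothetical: the class name, field names and the hashing scheme are illustrative choices, not something the book prescribes.

```python
from dataclasses import dataclass, field, asdict
import hashlib
import json


@dataclass
class ArtifactManifest:
    """Hypothetical manifest listing what an ML artifact bundles."""
    model_code: str        # path to model and preprocessing code
    hyperparameters: dict  # hyperparameters and configuration
    data_refs: dict        # references to training and validation data
    trained_model: str     # path to the trained, runnable model
    environment: dict      # library name -> pinned version
    documentation: str     # path to documentation
    tests: list = field(default_factory=list)  # test code and test data paths

    def fingerprint(self) -> str:
        """Stable hash of the manifest, so two bundles can be compared."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()
```

Two manifests with identical contents share a fingerprint, which gives a cheap check that a redeployed bundle is the one that was tested.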

Deployment strategies: integration; delivery; deployment; release.
Deployment categories: batch scoring, where whole data sets are processed using a model, such as in daily scheduled jobs; or real-time scoring, where one or a small number of records are scored.
Maintenance in production: resource monitoring; health checks; ML metrics monitoring.
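
The difference between the two deployment categories above can be sketched in a few lines. The `predict` formula and the chunk size are assumptions made up for the example; the point is only that batch scoring walks a whole data set while real-time scoring handles one record per call.

```python
from typing import Iterable, Iterator


def predict(record: dict) -> float:
    """Hypothetical model: a fixed formula standing in for a trained model."""
    return 2.0 * record["x"] + 1.0


def score_batch(records: Iterable[dict], chunk_size: int = 1000) -> Iterator[list]:
    """Batch scoring: walk the whole data set in chunks, e.g. from a daily job."""
    chunk = []
    for record in records:
        chunk.append(predict(record))
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # flush the last, possibly shorter, chunk
        yield chunk


def score_realtime(record: dict) -> float:
    """Real-time scoring: one record in, one score out, per request."""
    return predict(record)
```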

Containerization technology makes it possible to deploy to production automatically and to reliably rebuild the environment on the target machine. Unlike virtual machines, containers do not duplicate the complete operating system; multiple containers share a common operating system and are therefore far more resource efficient.
The best known, Docker, allows an application to be packaged, sent to a server (the Docker host), and run with all its dependencies in isolation from other applications.
Kubernetes is the standard for container orchestration; it provides a powerful declarative API to run applications on a group of Docker hosts, called a Kubernetes cluster.
It is declarative because, rather than trying to express in code the steps to set up, monitor, upgrade, stop, and connect the containers, users specify the desired state in a configuration file, and Kubernetes makes it happen and then maintains it.
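
The declarative idea can be illustrated with a toy reconciliation step: the user states how many copies of each service should run, and the system works out what to start or stop. This is a simplified sketch and not the Kubernetes API; the function names and the dictionary format are invented for the example.

```python
def reconcile(desired: dict, actual: dict) -> list:
    """Return the actions needed to move `actual` replica counts to `desired`."""
    actions = []
    for name, want in sorted(desired.items()):
        have = actual.get(name, 0)
        if have < want:
            actions.append(("start", name, want - have))
        elif have > want:
            actions.append(("stop", name, have - want))
    # Anything running that is no longer declared gets stopped entirely.
    for name in sorted(set(actual) - set(desired)):
        if actual[name] > 0:
            actions.append(("stop", name, actual[name]))
    return actions


def apply(actions: list, actual: dict) -> dict:
    """Apply actions to a copy of the actual state and return the new state."""
    state = dict(actual)
    for verb, name, count in actions:
        delta = count if verb == "start" else -count
        state[name] = state.get(name, 0) + delta
    return {name: n for name, n in state.items() if n > 0}
```

Running `reconcile` repeatedly against the observed state is what "makes it happen and then maintains it" amounts to: once the states match, the list of actions is empty.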

Scaling deployments:
- the ability to use the model in production with high-scale data
- the ability to train larger and larger numbers of models
Handling more data for real-time scoring is made much easier by frameworks such as Kubernetes. Since most of the time trained models are essentially formulas, they can be replicated in the cluster in as many copies as necessary. With the auto-scaling features in Kubernetes, both provisioning new machines and load balancing are fully handled by the framework.

A computational system is said to be horizontally scalable if it is possible to incrementally add more machines to expand its processing power.
An elastic system allows easy addition and removal of resources to match the compute requirements.
Spark handles distributed computation natively: it can split the data and the computation among its nodes. Kubernetes orchestrates containers, but it is not aware of what the containers are actually doing. To run a Spark job, the desired number of Spark containers are started by Kubernetes; once started, they can communicate to complete the computation, after which the containers are destroyed and their resources become available to other applications.
Another way to distribute batch processing is to partition the data.
In terms of computation, scaling the number of models is somewhat simpler.
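
Partitioning the data for batch processing can be sketched on a single machine with a worker pool. This is only an analogy for what a cluster does across nodes; the scoring formula and the number of partitions are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records: list, n_parts: int) -> list:
    """Split records into n_parts near-equal contiguous slices."""
    size, extra = divmod(len(records), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(records[start:end])
        start = end
    return parts


def score_partition(part: list) -> list:
    """Hypothetical model formula applied to one partition."""
    return [2 * x + 1 for x in part]


def score_partitioned(records: list, n_parts: int = 4) -> list:
    """Score each partition on its own worker, then stitch results in order."""
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        results = list(pool.map(score_partition, partition(records, n_parts)))
    return [score for part in results for score in part]
```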

Machine learning models need to be monitored at two levels:
- At the resource level, which includes ensuring the model is running correctly in the production environment
- At the performance level, meaning monitoring the pertinence of the model over time, attempting to track degradation and trigger retraining.

How often the model should be retrained depends on its domain.
Online learning, that is, algorithms that can train themselves iteratively, is attractive but more costly to set up.

Understanding model degradation
- Ground truth evaluation: retraining requires waiting for the label event. With the new ground truth collected, we can compute the performance of the model and compare it with the metrics registered in the training phase. The main advantage is that it is domain agnostic. However, it can be very costly, and the ground truth is not always immediately available, or is only partially available.
- Input drift: if the data distribution diverges between the training and testing phases, it is a strong signal that the model performance won’t be the same. The main causes are sample selection bias, where the training sample is not representative of the population, and a non-stationary environment, where training data collected from the source population does not represent the target population.
There are two main methods to detect drift: univariate statistical tests or a domain classifier. Both point to the importance of features to explain drift.
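
As an illustration of the univariate route, the sketch below computes a two-sample Kolmogorov-Smirnov statistic for one feature using only the standard library. Choosing the KS test is an assumption here (the notes do not name a specific test), and the threshold uses the common large-sample approximation, which is rough for small samples.

```python
from bisect import bisect_right
import math


def ks_statistic(reference: list, current: list) -> float:
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    for value in ref + cur:
        cdf_ref = bisect_right(ref, value) / len(ref)
        cdf_cur = bisect_right(cur, value) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def drift_detected(reference: list, current: list, coeff: float = 1.358) -> bool:
    """Flag drift when the statistic exceeds the large-sample critical value.

    1.358 is the usual coefficient for a 5% significance level.
    """
    n, m = len(reference), len(current)
    critical = coeff * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical
```

In practice the check would run per feature, comparing training data against a recent window of production inputs. The domain classifier alternative instead trains a model to tell the two samples apart.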

The feedback loop: information from the production environment flows back to the model prototyping environment for further improvements.
The infrastructure has three main components:
- Logging: an event log that records metadata, inputs, outputs, system actions, and model explanations
- Model evaluation, designed to improve the model by retraining it. The model itself is not a static object; it constantly changes with time. These changes fall under the label of a logical model, which is a collection of model templates and data versions that aims to solve a business problem. Model evaluation stores are structures that centralise the data related to model life cycles to allow comparison
- Online evaluation:
o Champion/challenger, a.k.a. shadow testing, deploys one or several additional models to the production environment; they receive and score the same incoming requests as the active model (the champion). In this way we can verify that the performance of the new models is better than, or at least as good as, the old model, and we can measure how the new models handle a realistic load.
o A/B testing should be used only when shadow testing is not available. This can happen when the ground truth cannot be evaluated for both models or when the objective to optimize is only indirectly related to the performance of the prediction. With A/B testing, the candidate model returns predictions for certain requests and the original model handles the other requests. Once the test period is over, statistical tests compare the performance of the two models, and teams can make a decision based on the statistical significance of those tests.
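
For the A/B case, one common choice of statistical test is a two-proportion z-test on a success rate such as conversions. The notes do not say which test the book has in mind, so treat this as one possible sketch; the function name and the example figures are made up.

```python
import math


def two_proportion_z_test(success_a: int, n_a: int, success_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference between two rates."""
    rate_a, rate_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference
    z = (rate_b - rate_a) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Original model: 100 successes in 1000 requests; candidate: 150 in 1000.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
```

A small p-value suggests the difference between the candidate and the original model is unlikely to be noise, which is the "statistical significance" the teams would base their decision on.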

coding non-fiction

André

4 reviews, 1 follower

September 16, 2021

Very general introduction to the topic with some nice overview of the field but without much technical depth

Charluff

100 reviews, 3 followers

November 1, 2020

As the context gets more complex, with specific cloud technologies for each part of the ML process, while business demands a lower time to market, decoupling, standardization and teamwork become crucial to building robust ML solutions.

As ML gets consumed via API endpoints (following microservices architectures), it is important to start considering your models as software. Therefore, DevOps for ML.

This book provides a good guide to implement a structure on your ML deliveries and processes. I’m currently using it as I implement the ML framework at one of my customers.

The bigger the team and projects, the more you’ll benefit from this approach.

Willow Turner

7 reviews, 2 followers

January 12, 2021

Describes high level problems but proposes few concrete solutions. No real guidance on how to scale ML. Possibly worthwhile as an overview for technical decision makers outside ML? But not useful for ML Engineers.

Henrique

16 reviews

January 10, 2021

A brief in-depth introduction to MLops and machine learning lifecycle

Alex Ott

Author, 3 books, 207 followers

December 31, 2021

good intro into MLOps - I'm not new to it, but some aspects were well formulated

ir-dm-nlp-ml-search

Charles

26 reviews

November 3, 2025

Well, this is exactly what the title says: an introductory discussion of MLOps and its importance. If you are looking for a discussion of particular technologies, examples suggesting a technology stack, or something more advanced, you will not find it here.

I think the best audience for this book is people who are not really in Machine Learning Engineering or even Software Engineering; it is more like an introductory splash for managers. That is not to say that people in ML Engineering or any other engineering field might not find something useful here. It is indeed a good overview of the field before you move on to more advanced books on the topic.

If you wanted to know more about platforms such as MLflow, ClearML, Kubeflow or even Airflow, Kubernetes and PyTorch, you will not find a single mention here.

ai-ml

Bibis Bücher

80 reviews

December 29, 2022

One of the few specialist books that deals with MLOps in general.
The book is well structured and the individual process phases are described well.
However, you should already bring some knowledge of AI with you.
The book is written from a business-domain perspective and is aimed more at the management level than at developers. But developers can benefit from this process knowledge too.

Alejandro

9 reviews

December 22, 2024

This book attempts to be a tremendously shallow introduction to the world of MLOps.

For people with some technical and data background, I would discourage reading it. I can just say that I was a bit disappointed with the depth of the chapters and the examples shown throughout the book.

If you are a busy executive or 100% new to this field, I guess it is not a bad idea to read it.

Abinav

77 reviews, 2 followers

November 9, 2021

Basic introduction to the world of MLOps, suitable for people wishing to get into this field or who know nothing about it. Not a lot of implementation details. A good starter book.

Jochem Grietens

1 review, 1 follower

February 6, 2022

Great fundamentals book on ML-OPS

Özgür

130 reviews, 3 followers

March 26, 2022

This is a giant white paper; it does not have much technical depth.

Mirina Gonzales

6 reviews

February 3, 2024

If you want to take a look at what MLOps is, I recommend this book; it walks you, at a theoretical level, through the concepts needed to understand the topic.

Ian Price

2 reviews

July 23, 2024

Too much of a skim, no actionable items. All 'correct', but nothing really useful.

Monika Venckauskaite

30 reviews, 1 follower

January 10, 2021

If you have experience deploying machine learning models into production, this book will help you to connect all the dots within your organization. It suggests the best practices and processes for ML deployment and presents you with approaches to common problems, such as credit risk management, recommendation engines and grid optimisation. It is written in high level language and is aimed to help you to build fast, efficient and reliable MLOps.

Vaidas

119 reviews, 4 followers

March 2, 2021

`Enterprise` is probably the keyword everyone should look at before reading this. This is not a book for data scientists/ML engineers or even their managers. This book is probably for someone quite far removed from actual modeling work. But I haven't got enough experience with corporate ML implementations to figure out the exact audience.
I think this might be a lot better if reduced to one-fourth of its size and published as a white paper by the Dataiku Team.

80 reviews

50 reviews

The only redeemable part of the book is the three real-world examples in Part III. The steps described in Part II are just a generalization of the whole process. It feels like the author was in a rush and skipped some of the details which could be discussed more.