Building Machine Learning Pipelines: Automating Model Life Cycles with TensorFlow

Companies are spending billions on machine learning projects, but it's money wasted if the models can't be deployed effectively. In this practical guide, Hannes Hapke and Catherine Nelson walk you through the steps of automating a machine learning pipeline using the TensorFlow ecosystem. You'll learn the techniques and tools that will cut deployment time from days to minutes, so that you can focus on developing new models rather than maintaining legacy systems.

Data scientists, machine learning engineers, and DevOps engineers will discover how to go beyond model development to successfully productize their data science projects, while managers will better understand the role they play in helping to accelerate these projects. 

364 pages, Paperback

Published August 18, 2020

45 people are currently reading
159 people want to read

About the author

Hannes Hapke

8 books, 2 followers

Ratings & Reviews

Community Reviews

5 stars: 7 (19%)
4 stars: 12 (33%)
3 stars: 10 (27%)
2 stars: 6 (16%)
1 star: 1 (2%)
Displaying 1 - 5 of 5 reviews
Andrey
169 reviews, 1 follower
April 17, 2022
This book is 'meh'. Just go to the TensorFlow Extended site and follow the docs & tutorials.
Xianshun Chen
88 reviews, 2 followers
February 8, 2021
Pretty disappointing; not much value for me personally. Most of the features in such a pipeline we already automate on our own, and we have even more specific and complex use cases. What's worse, TFX has changed so much since the book's publication. Furthermore, there is a lack of documentation and of working code. The only thing I took away is some inspiration on data shift and data validation, as well as the versioning concept.
Thang
101 reviews, 13 followers
February 11, 2022
Some concepts I learnt from this book:
DATA VERSIONING: DVC and Pachyderm

DATA PREPROCESSING: save the graph of preprocessing steps for future serving (this avoids having two different processes that must be kept in sync with each other)

DISTRIBUTED TRAINING: synchronous and asynchronous training. Typically, synchronous strategies are coordinated via all-reduce operations and asynchronous strategies through a parameter server architecture.
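
As a rough illustration of the distinction above (a toy sketch in plain Python, not code from the book or from TFX), the snippet below contrasts one synchronous update, where gradients from all workers are averaged as an all-reduce would do, with asynchronous updates pushed one at a time to a parameter server. The names `all_reduce_mean`, `sync_step`, and `ParameterServer` are hypothetical, and the gradients are fixed numbers rather than the result of real training.

```python
# Toy sketch only: no real networking, devices, or TensorFlow involved.

def all_reduce_mean(worker_grads):
    """Average gradients element-wise across workers (the value an all-reduce yields)."""
    n = len(worker_grads)
    return [sum(column) / n for column in zip(*worker_grads)]

def sync_step(weights, worker_grads, lr):
    """Synchronous: wait for every worker, average the gradients, apply one update."""
    avg = all_reduce_mean(worker_grads)
    return [w - lr * g for w, g in zip(weights, avg)]

class ParameterServer:
    """Asynchronous: each worker pushes its gradient as soon as it is ready."""

    def __init__(self, weights, lr):
        self.weights = list(weights)
        self.lr = lr

    def push(self, grad):
        # Apply the update immediately, without waiting for other workers.
        self.weights = [w - self.lr * g for w, g in zip(self.weights, grad)]

    def pull(self):
        return list(self.weights)

weights = [1.0, 2.0]
grads = [[0.5, 1.0], [1.5, 3.0]]  # one made-up gradient per worker

sync_weights = sync_step(weights, grads, lr=0.5)

server = ParameterServer(weights, lr=0.5)
for g in grads:  # in a real system these pushes arrive in arbitrary order
    server.push(g)
async_weights = server.pull()
```

With the same gradients, the synchronous path applies one averaged update while the asynchronous path applies two separate updates, so the resulting weights differ. In a real asynchronous setup the workers would also compute gradients against possibly stale weights, which this sketch does not model.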

MODEL SERVING
A traditional API server is not a good fit:
- Lack of code separation: the API code and the machine learning model should be kept separate
- Lack of model version control
- Inefficient model inference: batch inference gives better performance on GPU

MODEL OPTIMIZATION: Quantization, Pruning, and Model Distillation.
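
To make the first of these concrete, here is a minimal sketch of symmetric 8-bit quantization in plain Python. It is an assumption-laden illustration of the general idea, not the scheme used by the book or by any TensorFlow tool; the helper names `quantize_int8` and `dequantize` are hypothetical. Pruning and distillation are not shown.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs else 1.0  # avoid dividing by zero
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]

weights = [0.0, 0.4, -1.0, 0.26]  # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by about half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight is stored as a small integer plus one shared float `scale`, which is where the size reduction comes from; the price is the rounding error measured by `max_error`.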

Mehdi
23 reviews
August 15, 2020
A good book to learn about how to build a robust and scalable pipeline to train and maintain machine learning models at scale.
Phasathorn Suwansri
16 reviews
March 19, 2021
I think it is easy to get the overall concept. To learn more detail, you need to get hands-on. There are some tips for using TFX that are not specified in its documentation.