Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build on top of the Google Cloud Platform (GCP). With this practical guide, author and GCP Program Manager Valliappa Lakshmanan shows you how to gain insight into a sample business decision by applying different statistical and machine learning methods and tools.
Along the way, you'll get an extensive tour of the big data and machine learning parts of GCP. You'll start with statistical methods, move into straightforward classification, and then explore windowing and real-time prediction.
Move from basic to increasingly sophisticated methods
Understand interactive querying of very large datasets with BigQuery
Learn about probabilistic decision making with SparkSQL and Spark
Train a TensorFlow model in Python and call it from Java
Create a data processing pipeline with Dataflow
Compute time-windowed aggregates in real-time
A good overview of data science and machine learning techniques using 'big data' technologies on GCP; a good companion to the GCP Data Engineering courses on Coursera.
Two stars is a somewhat unfair rating, as I do not think I was among the intended audience for this book, or at least I didn't have the right expectations for it. I have a background in software engineering and have used GCP for software engineering purposes, but I do not have a data science background. To me, the book seemed like a mix of concepts, product descriptions, code snippets, and a single real-world example that, in mixing these, did not deliver an interesting, instructive message on any of the individual parts. It didn't spend enough time at the conceptual level for me to feel like I understand the data science concepts any better. The command-line and code snippets didn't seem like useful knowledge, as they are easily looked up in a reference and are not "reusable" knowledge. I was also bored to death by the airline delays example by the end of the book :) I struggled to generalize the information in the book. Given my expectations, I likely would have been better off picking up a book on the introductory concepts of data science.
As my employer prepares to move to GCP, I've been studying the platform's capabilities and getting excited about what it can do. The other GCP books I've read have covered the platform at a high level, discussing how the different services fit together. This book is much more applied, taking a concrete problem and working through a different aspect of it in each chapter.
My only critique of the book is that the example problem is straightforward enough that most of the firepower the author throws at it is overkill. An R script on a reasonably powerful laptop would probably have had only a slightly higher error rate.
While this is a great intro to some of the basics and offerings of GCP that can be leveraged for data science, the book is aimed much more at explaining the pieces of the platform and getting up and running than at anything in depth. While cloud-native solutions such as Cloud Dataflow are touched on, each could have its own book going through architecture and integrations in more depth. Nonetheless, a solid intro book.
Really bad book. Very disordered thoughts, very long paragraphs talking about being a Data Engineer or simple visualizations and little about basic fundamentals of GCP.
Lack of clarity, and the examples did not work properly. I was not even able to finish the book. I am still not quite sure what the purpose of this book was, but the title does not match the reality.
A level-headed end-to-end process for data science and engineering in the cloud (not just Google Cloud). The author was a teammate of mine when joining the company and he should be very proud of this work.
Useful step-by-step guide to do a simple Data Science project on Google Cloud Platform, including where to get some initial public data to work with, how to create the components on Google Cloud Platform, how to analyze the results, and related things.