Gain the key language concepts and programming techniques of Scala in the context of big data analytics and Apache Spark. The book begins by introducing you to Scala and establishes a firm contextual understanding of why you should learn this language, how it stands in comparison to Java, and how Scala is related to Apache Spark for big data analytics. Next, you’ll set up the Scala environment ready for examining your first Scala programs. This is followed by sections on Scala fundamentals including mutable/immutable variables, the type hierarchy system, control flow expressions and code blocks.
The author discusses functions at length and highlights a number of associated concepts such as functional programming and anonymous functions. The book then delves deeper into Scala’s powerful collections system because many of Apache Spark’s APIs bear a strong resemblance to Scala collections.
Along the way you’ll see the development life cycle of a Scala program. This involves compiling and building programs using the industry-standard Scala Build Tool (SBT). You’ll cover guidelines related to dependency management using SBT as this is critical for building large Apache Spark applications.
Scala Programming for Big Data Analytics concludes by demonstrating how you can make use of the concepts to write programs that run on the Apache Spark framework. These programs will provide distributed and parallel computing, which is critical for big data analytics.
Who This Book Is For
Data scientists, data analysts and data engineers who intend to use Apache Spark for large-scale analytics.
Scala Programming for Big Data Analytics by Irfan Elahi is a great intro to the Scala programming language, especially if you are coming from a language commonly used in data science, such as Python.
The book introduces all major language aspects such as loops, control flow, function definitions, and classes. The big data engine Apache Spark is also touched on at the end.
If you have no knowledge of Scala and want to learn the basics quickly, this is the book to pick up. It gives you all the essential information to get started. The author introduces each topic with very simple, easy-to-follow examples and avoids abstract, unrealistic toy examples like Fibonacci sequences. The author also explains the entire development cycle of a Scala program, including writing code, compiling your package, and building jar files. The book is written in a professional style, and I didn't catch any typos or bad formatting.
At approximately 300 pages (in a large font), the book has to compress a lot of information into relatively few pages, so you will not learn everything the language has to offer. I picked this book because I found Programming Scala too complex once it moved past the basics, and I was hoping to find a clearer presentation in Scala Programming for Big Data Analytics. However, the author doesn't cover more advanced functional programming techniques like tail recursion. I was also hoping to find more information on the Spark Scala API.
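For readers who haven't met the term, here is a minimal sketch of what tail recursion looks like in Scala. It is my own illustration, not an example from the book, and the names `TailRecursionSketch`, `sumTo`, and `loop` are made up for this purpose. The idea is that the recursive call is the last thing the function does, so the compiler can turn it into a loop; the standard `@tailrec` annotation asks the compiler to reject the code if that is not the case.

```scala
import scala.annotation.tailrec

object TailRecursionSketch {
  // Sums the integers 1..n. The running total is carried in `acc`,
  // so the recursive call to `loop` is the final operation (a tail call).
  def sumTo(n: Long): Long = {
    @tailrec // compile error if `loop` is not actually tail-recursive
    def loop(remaining: Long, acc: Long): Long =
      if (remaining <= 0) acc
      else loop(remaining - 1, acc + remaining)

    loop(n, 0L)
  }

  def main(args: Array[String]): Unit = {
    println(sumTo(10))       // prints 55
    println(sumTo(1000000L)) // prints 500000500000, without overflowing the stack
  }
}
```

A naive version written as `n + sumTo(n - 1)` would not be tail-recursive, because the addition still has to happen after the recursive call returns, and it risks a stack overflow for large inputs.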
Overall, it is a very good book, and it should be the first one you read if you want to learn Scala and have a data science background.