Apache Spark 2 for Beginners

By: Rajanarayanan Thottuvaikkatumana

Overview of this book

Spark is one of the most widely used large-scale data processing engines and runs extremely fast. It is a framework whose tools are equally useful to application developers and data scientists.

This book starts with the fundamentals of Spark 2 and covers the core data processing framework and API, installation, and application development setup. The Spark programming model is then introduced through real-world examples, followed by Spark SQL programming with DataFrames. An introduction to SparkR comes next. Later, we cover the charting and plotting features of Python in conjunction with Spark data processing. After that, we take a look at Spark's stream processing, machine learning, and graph processing libraries. The last chapter combines all the skills you learned in the preceding chapters to develop a real-world Spark application.

By the end of this book, you will have all the knowledge you need to develop efficient large-scale applications using Apache Spark.

Functional programming with Spark


The mutation of objects at run time, and the inability to get consistent results from a program or function because of the side effects its logic creates, make many applications very complex. If the functions in a programming language behave exactly like mathematical functions, so that the output depends only on the inputs, applications become far more predictable. The programming paradigm that emphasizes building such functions and other elements based on them, and that uses those functions in the same way as values of any other data type, is popularly known as the functional programming paradigm. Among the JVM-based programming languages, Scala is one of the most important: it has very strong functional programming capabilities without losing any object orientation. Spark is written predominantly in Scala. For that reason, Spark has taken lots of...
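
The contrast between side-effecting functions and functions that behave like mathematical ones can be sketched in a few lines. The example below is only an illustration in plain Python (no Spark required); every name in it (`add_to_total`, `add`, `apply_twice`, `double`, `amounts`) and the sample data are hypothetical and do not come from the book.

```python
from functools import reduce

# Impure: the result depends on, and changes, state outside the function,
# so two identical calls return different values.
running_total = 0

def add_to_total(amount):
    global running_total
    running_total += amount
    return running_total

# Pure: the output depends only on the inputs and nothing outside is modified.
def add(total, amount):
    return total + amount

def double(x):
    return x * 2

# A function used like any other value: it is received as an argument.
def apply_twice(fn, value):
    return fn(fn(value))

amounts = [10, 20, 30]  # hypothetical sample data

first_call = add_to_total(10)    # depends on hidden state
second_call = add_to_total(10)   # same argument, different result

pure_sum = reduce(add, amounts, 0)    # functions passed to reduce and map
doubled = list(map(double, amounts))
quadrupled = apply_twice(double, 5)

print(first_call, second_call)
print(pure_sum, doubled, quadrupled)
```

Calling `add` with the same arguments gives the same result every time, while `add_to_total` does not. The sketch is meant to show only that predictability, along with the passing of functions as ordinary values.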