Apache Spark in 7 Days [Video]

By: Karen Yang

Overview of this course

This course teaches the fundamentals of Apache Spark in a short period of time. Spark has become a popular big data processing engine, largely because it can process data in memory, which makes it fast, and because it is easy to use and offers simple syntax. The course is designed to give you a fundamental understanding of Spark and hands-on experience in writing basic code and running applications on a Spark cluster. Over 7 days, you will work through examples and assignments that demonstrate basic operations, querying, machine learning, and streaming. By the end of this course, you will be able to put what you have learned into practice and build your own projects. The code bundle for this video course is available at https://github.com/PacktPublishing/Apache-Spark-in-7-Days
Table of Contents (7 chapters)
Chapter 2
Working with RDDs
Section 4
Joins, Set, and Numeric Operations
The aim of this video is to review three groups of RDD operations: joins, set operations, and numeric operations.
- Review inner, left outer, right outer, and full outer joins
- Review set operations such as intersection, subtraction, union, and distinct
- Review numeric operations such as minimum, maximum, mean, sum, standard deviation, variance, and statistics
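
The semantics of these three groups of operations can be sketched without a cluster. The block below is a plain-Python model, not Spark code and not taken from the course: the helper `full_outer_join` and the sample data are hypothetical, and results are sorted for readability, whereas Spark does not guarantee ordering. In the PySpark RDD API the corresponding methods are `join`, `leftOuterJoin`, `rightOuterJoin`, and `fullOuterJoin`; `union`, `distinct`, `intersection`, and `subtract`; and `min`, `max`, `mean`, `sum`, `stdev`, `variance`, and `stats`.

```python
from statistics import mean, pstdev, pvariance


def full_outer_join(left, right):
    """Pair up values by key; None stands in for a side with no match."""
    keys = sorted({k for k, _ in left} | {k for k, _ in right})
    joined = []
    for key in keys:
        left_vals = [v for k, v in left if k == key] or [None]
        right_vals = [v for k, v in right if k == key] or [None]
        joined.extend((key, (a, b)) for a in left_vals for b in right_vals)
    return joined


# Hypothetical sample data: (key, value) pairs, as in a pair RDD.
# Assumes no real value is None, because None marks "no match" in this model.
left = [("a", 1), ("b", 2), ("c", 3)]
right = [("a", "x"), ("b", "y"), ("d", "z")]

# The other three joins are the full outer join with unmatched rows filtered out.
full = full_outer_join(left, right)
inner = [(k, (a, b)) for k, (a, b) in full if a is not None and b is not None]
left_outer = [(k, (a, b)) for k, (a, b) in full if a is not None]
right_outer = [(k, (a, b)) for k, (a, b) in full if b is not None]

# Set operations on plain lists.
xs, ys = [1, 2, 2, 3, 4], [3, 4, 5]
union = xs + ys                                    # duplicates are kept
distinct = sorted(set(xs))                         # duplicates are dropped
intersection = sorted(set(xs) & set(ys))           # elements present in both
subtraction = [x for x in xs if x not in set(ys)]  # elements of xs not in ys

# Numeric summaries, using population (not sample) variance and deviation.
nums = [1.0, 2.0, 3.0, 4.0]
summary = {
    "count": len(nums),
    "min": min(nums),
    "max": max(nums),
    "sum": sum(nums),
    "mean": mean(nums),
    "variance": pvariance(nums),
    "stdev": pstdev(nums),
}

for name, rows in [("inner", inner), ("left outer", left_outer),
                   ("right outer", right_outer), ("full outer", full)]:
    print(name, rows)
print("union", union, "distinct", distinct)
print("intersection", intersection, "subtraction", subtraction)
print(summary)
```

Two details are worth noting when comparing this model with Spark. A union keeps duplicates, so it is often followed by a distinct step, and the RDD `stdev` and `variance` methods compute population values, with `sampleStdev` and `sampleVariance` available for the sample-corrected versions.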