Practical Data Analysis - Second Edition
In this chapter, we will present the main features of a data processing architecture and the Cloudera platform distribution. Then, we will explore how to use a distributed filesystem and how to manage files from the terminal and through a web interface. Finally, we will describe the use of Apache Spark, an open source big data processing framework built with the goal of being fast and easy to use. Apache Spark provides a unified framework for big data processing requirements such as data streaming, machine learning, and analytics.
In this chapter, we will cover these topics:
Since the first edition of this book in 2013, there have been big changes in the data-driven scene. With the emergence of buzzwords such as big data, data science, and deep...