Data Stream Development with Apache Spark, Kafka, and Spring Boot [Video]

By: Anghel Leonard

Overview of this book

Today, organizations struggle to work with very large volumes of data. On top of that, the data often has to be processed and analyzed in real time to yield insights. This is where data streaming comes in. Because big data is no longer a niche topic, the ability to architect and develop robust data streaming pipelines has become an essential skill for developers. Developers also need to reason about the entire pipeline, including the trade-offs at every tier. This course starts by explaining the blueprint architecture for a fully functional data streaming pipeline and by installing the technologies used. Through live coding sessions, you will get hands-on experience architecting every tier of the pipeline, and you will handle specific issues that arise when working with streaming data. You will ingest a live data stream of Meetup RSVPs, which will be analyzed and displayed via Google Maps. By the end of the course, you will have built an efficient data streaming pipeline and will be able to analyze its various tiers, ensuring a continuous flow of data. All the code and supporting files for this course are available at https://github.com/PacktPublishing/-Data-Stream-Development-with-Apache-Spark-Kafka-and-Spring-Boot
Table of Contents (5 chapters)
Chapter 4
Implementing the Analysis Tier
Section 10
MLlib (Spark ML)
Spark MLlib (also called Spark ML) is Spark's machine learning library. This video introduces the main concepts of a Spark ML Pipeline and implements an ML Pipeline for the House Price Forecast System discussed in the previous video.
- Introduce the main concepts of Spark MLlib (Spark ML) and the Spark ML Pipeline, and see how data flows through an ML Pipeline
- Cover the Spark MLlib (Spark ML) building blocks: transformers, estimators, evaluators, etc.
- Dissect the Spark Pipeline and PipelineModel APIs and use them to implement an ML Pipeline for the House Price Forecast System
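
The roles named above (transformer, estimator, evaluator, Pipeline, PipelineModel) follow a simple fit/transform contract, which can be sketched in plain Python. This is a conceptual toy only: none of these classes are Spark's API, it is not the course's implementation, and the house data (area and price) is made up for illustration. It assumes a dataset is a list of dicts, that an estimator's `fit` returns a transformer, and that fitting a pipeline returns a pipeline model made only of transformers.

```python
from math import sqrt

# Toy stand-ins for the Spark ML roles; these classes are NOT Spark's API.
# A "row" is a plain dict and a dataset is a list of rows.


class ScalerModel:
    """Transformer: adds a standardized copy of one column."""

    def __init__(self, col, out, mean, std):
        self.col, self.out, self.mean, self.std = col, out, mean, std

    def transform(self, rows):
        return [{**r, self.out: (r[self.col] - self.mean) / self.std} for r in rows]


class Scaler:
    """Estimator: learns mean and standard deviation, returns a ScalerModel."""

    def __init__(self, col, out):
        self.col, self.out = col, out

    def fit(self, rows):
        vals = [r[self.col] for r in rows]
        mean = sum(vals) / len(vals)
        std = sqrt(sum((v - mean) ** 2 for v in vals) / len(vals)) or 1.0
        return ScalerModel(self.col, self.out, mean, std)


class LinearModel:
    """Transformer: adds a 'prediction' column from a fitted line."""

    def __init__(self, feature, slope, intercept):
        self.feature, self.slope, self.intercept = feature, slope, intercept

    def transform(self, rows):
        return [{**r, "prediction": self.slope * r[self.feature] + self.intercept}
                for r in rows]


class LinearRegression:
    """Estimator: one-feature ordinary least squares."""

    def __init__(self, feature, label):
        self.feature, self.label = feature, label

    def fit(self, rows):
        xs = [r[self.feature] for r in rows]
        ys = [r[self.label] for r in rows]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        slope = sxy / sxx
        return LinearModel(self.feature, slope, my - slope * mx)


class PipelineModel:
    """Fitted pipeline: applies each fitted stage's transform in order."""

    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


class Pipeline:
    """Unfitted pipeline: fits estimators in order, passing data forward."""

    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            # Estimators are fitted first; plain transformers are used as-is.
            model = stage.fit(rows) if hasattr(stage, "fit") else stage
            rows = model.transform(rows)
            fitted.append(model)
        return PipelineModel(fitted)


def rmse(rows, label, prediction="prediction"):
    """Evaluator: root mean squared error between label and prediction."""
    return sqrt(sum((r[label] - r[prediction]) ** 2 for r in rows) / len(rows))


# Made-up training data: price is an exact linear function of area.
train = [{"area": a, "price": 3.0 * a + 10.0} for a in (50.0, 80.0, 100.0, 120.0)]

pipeline = Pipeline([Scaler("area", "area_scaled"),
                     LinearRegression("area_scaled", "price")])
model = pipeline.fit(train)           # Pipeline -> PipelineModel
scored = model.transform(train)       # adds area_scaled and prediction
error = rmse(scored, "price")
new_house = model.transform([{"area": 90.0}])[0]
print(round(error, 6), round(new_house["prediction"], 6))
```

The point of the split is that everything learned from the training data (the scaling statistics and the regression coefficients) lives in the fitted model, so the same sequence of transformations is applied to new rows without refitting.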