Apache Kafka Quick Start Guide

By: Raúl Estrada

Overview of this book

Apache Kafka is a great open source platform for handling your real-time data pipeline to ensure high-speed filtering and pattern matching on the fly. In this book, you will learn how to use Apache Kafka for efficient processing of distributed applications and will get familiar with solving everyday problems in fast data and processing pipelines. This book focuses on programming rather than the configuration management of Kafka clusters or DevOps. It starts off with installing Kafka and setting up the development environment, before quickly moving on to performing fundamental messaging operations such as validation and enrichment. Here you will learn about message composition with the pure Kafka API and Kafka Streams. You will look into the transformation of messages in different formats, such as text, binary, XML, JSON, and Avro. Next, you will learn how to expose the schemas contained in Kafka with the Schema Registry. You will then learn how to work with all relevant connectors with Kafka Connect. While working with Kafka Streams, you will perform various interesting operations on streams, such as windowing, joins, and aggregations. Finally, through KSQL, you will learn how to retrieve, insert, modify, and delete data streams, and how to manipulate watermarks and windows.
Table of Contents (10 chapters)

Preface

Since 2011, Kafka has grown explosively. More than one-third of Fortune 500 companies use Apache Kafka. These companies include travel companies, banks, insurance companies, and telecom companies.

Uber, Twitter, Netflix, Spotify, Blizzard, LinkedIn, and PayPal process their messages with Apache Kafka every day.

Today, Apache Kafka is used to collect data, do real-time data analysis, and perform real-time data streaming. Kafka is also used to feed events to Complex Event Processing (CEP) architectures, is deployed in microservice architectures, and is implemented in Internet of Things (IoT) systems.

In the realm of streaming, there are several competitors to Kafka Streams, including Apache Spark, Apache Flink, Akka Streams, Apache Pulsar, and Apache Beam. They are all in competition to perform better than Kafka. However, Apache Kafka has one key advantage over them all: its ease of use. Kafka is easy to implement and maintain, and its learning curve is not very steep.

This book is a practical quick start guide. It focuses on practical examples and does not go into theoretical explanations or discussions of Kafka's architecture. It is a compendium of hands-on recipes: solutions to everyday problems faced by those implementing Apache Kafka.