Apache Kafka Quick Start Guide

By: Raúl Estrada

Overview of this book

Apache Kafka is an open source platform for handling real-time data pipelines, supporting high-speed filtering and pattern matching on the fly. In this book, you will learn how to use Apache Kafka for efficient processing in distributed applications and will become familiar with solving everyday problems in fast data and processing pipelines. This book focuses on programming rather than on the configuration management of Kafka clusters or DevOps. It starts with installing Kafka and setting up the development environment, before quickly moving on to fundamental messaging operations such as validation and enrichment. Here you will learn about message composition with the pure Kafka API and with Kafka Streams. You will look into the transformation of messages in different formats, such as text, binary, XML, JSON, and Avro. Next, you will learn how to expose the schemas contained in Kafka with the Schema Registry. You will then learn how to work with the relevant connectors in Kafka Connect. While working with Kafka Streams, you will perform operations on streams such as windowing, joins, and aggregations. Finally, through KSQL, you will learn how to retrieve, insert, modify, and delete data streams, and how to manipulate watermarks and windows.
Table of Contents (10 chapters)

Event modeling

The first step in event modeling is to express the event in English in the following form:

Subject-verb-direct object

For this example, we are modeling the event "customer consults the ETH price":

  • The subject in this sentence is customer, a noun in nominative case. The subject is the entity performing the action.
  • The verb in this sentence is consults; it describes the action performed by the subject.
  • The direct object in this sentence is ETH price. The direct object is the entity on which the action is performed.
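
The subject-verb-direct object breakdown above can be sketched as a small data structure. This is only an illustration, not the model the book goes on to build: the class name `Event` and its field names are assumptions chosen to mirror the three parts of the sentence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event model: one field per part of the sentence."""

    subject: str        # the entity performing the action
    verb: str           # the action performed by the subject
    direct_object: str  # the entity on which the action is performed

    def sentence(self) -> str:
        # Rebuild the English form: subject-verb-direct object.
        return f"{self.subject} {self.verb} {self.direct_object}"


event = Event(subject="customer", verb="consults", direct_object="ETH price")
print(event.sentence())
```

Keeping the three parts as separate fields, rather than storing the sentence as one string, makes it possible to filter or route events by subject, verb, or object later on.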

We can represent our message in several message formats (covered in other sections of this book):

  • JavaScript Object Notation (JSON)
  • Apache Avro
  • Apache Thrift
  • Protocol Buffers
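
As a rough illustration of the first format in this list, the example event could be written as JSON along the following lines. The field names (`subject`, `verb`, `direct_object`) are assumptions made for this sketch and are not necessarily the message schema used elsewhere in the book.

```python
import json

# Hypothetical field names, mirroring the subject-verb-direct object form.
event = {
    "subject": "customer",
    "verb": "consults",
    "direct_object": "ETH price",
}

# Serialize to a JSON string, as it might be sent as a message payload.
encoded = json.dumps(event, indent=2)
print(encoded)

# Parsing the string back yields the same structure.
decoded = json.loads(encoded)
```

The printed text is plain and self-describing, so a person can read it directly, while `json.loads` shows that a program can parse it just as easily.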

JSON is easily read and written by both humans and machines. For example, we could choose binary as the representation, but it has a rigid format and it was not designed for humans to...