This section discusses community contributions that integrate Apache Kafka with other systems for needs such as logging, packaging, cloud deployment, and Hadoop integration.
One notable contribution is Camus (https://github.com/linkedin/camus), which provides a pipeline from Kafka to HDFS. In this project, a single MapReduce job performs the following steps to load data into HDFS in a distributed manner:
As a first step, it discovers the latest topics and partition offsets from ZooKeeper.
Each task in the MapReduce job then fetches events from the Kafka broker and commits the pulled data, along with an audit count, to the output folders.
After the job completes, the final offsets are written to HDFS, where subsequent MapReduce jobs can consume them.
Information about the consumed messages is also updated in the Kafka cluster.
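The steps above can be illustrated with a minimal sketch. This is not the actual Camus code or API; every name below (`discover_offsets`, `fetch_events`, `run_job`, the in-memory `zk_state`, `broker`, and `hdfs` dictionaries) is a hypothetical stand-in that models the flow of one job run.

```python
# Hypothetical sketch of a Camus-style Kafka-to-HDFS load; dictionaries
# stand in for ZooKeeper, the Kafka broker, and HDFS.

def discover_offsets(zk_state):
    """Step 1: read the latest topic/partition offsets (ZooKeeper stand-in)."""
    return dict(zk_state)

def fetch_events(broker, topic, partition, start_offset):
    """Step 2: one map task pulls events for a single partition,
    starting from the last committed offset."""
    events = broker.get((topic, partition), [])[start_offset:]
    return events, start_offset + len(events)

def run_job(zk_state, broker, hdfs):
    """Steps 3-4: commit pulled data plus audit counts to output folders,
    then write the final offsets for subsequent jobs to consume."""
    final_offsets = {}
    for (topic, partition), start in discover_offsets(zk_state).items():
        events, end = fetch_events(broker, topic, partition, start)
        hdfs[f"/data/{topic}/{partition}"] = events        # pulled data
        hdfs[f"/audit/{topic}/{partition}"] = len(events)  # audit count
        final_offsets[(topic, partition)] = end
    hdfs["/offsets"] = final_offsets                       # for the next run
    return final_offsets
```

In this sketch, committing the final offsets alongside the data is what lets the next run resume where the previous one stopped, mirroring the incremental loading the job performs.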
Some other useful contributions are:
Automated deployment and configuration of Kafka and ZooKeeper on Amazon (https://github.com/nathanmarz/kafka-deploy...