Book Image

Apache Kafka 1.0 Cookbook

By : Alexey Zinoviev, Raúl Estrada
Book Image

Apache Kafka 1.0 Cookbook

By: Alexey Zinoviev, Raúl Estrada

Overview of this book

Apache Kafka provides a unified, high-throughput, low-latency platform to handle real-time data feeds. This book will show you how to use Kafka efficiently, and contains practical solutions to the common problems that developers and administrators usually face while working with it. This practical guide contains easy-to-follow recipes to help you set up, configure, and use Apache Kafka in the best possible manner. You will use Apache Kafka Consumers and Producers to build effective real-time streaming applications. The book covers the recently released Kafka version 1.0, the Confluent Platform and Kafka Streams. The programming aspect covered in the book will teach you how to perform important tasks such as message validation, enrichment and composition.Recipes focusing on optimizing the performance of your Kafka cluster, and integrate Kafka with a variety of third-party tools such as Apache Hadoop, Apache Spark, and Elasticsearch will help ease your day to day collaboration with Kafka greatly. Finally, we cover tasks related to monitoring and securing your Apache Kafka cluster using tools such as Ganglia and Graphite. If you're looking to become the go-to person in your organization when it comes to working with Apache Kafka, this book is the only resource you need to have.
Table of Contents (18 chapters)
Title Page
Credits
About the Author
About the Reviewers
www.PacktPub.com
Customer Feedback
Dedication
Preface

Configuring the ZooKeeper settings


Apache Zookeeper is a centralized service for maintaining configuration information providing distributed synchronization. ZooKeeper is used in Kafka for cluster management and to maintain the topics information synchronized.

Getting ready

With your favorite text editor, open your server.properties file copy.

How to do it…

Adjust the following parameters:

zookeeper.connect=127.0.0.1:2181,192.168.0.32:2181
zookeeper.session.timeout.ms=6000
zookeeper.connection.timeout.ms=6000
zookeeper.sync.time.ms=2000

How it works…

Here is an explanation of these settings:

  • zookeeper.connect: Default value: null. This is a comma-separated value in the form of the hostname:port string, indicating the Zookeeper connection. Specifying several connections ensures the Kafka cluster reliability and continuity. When one node fails, Zookeeper uses the chroot path (/chroot/path) to make the data available under that particular path. This enables having the Zookeeper cluster available for multiple Kafka clusters. This path must be created before starting the Kafka cluster, and consumers must use the same string.
  • zookeeper.session.timeout.ms: Default value: 6000. Session timeout means that if in this time period a heartbeat from the server is not received, it is considered dead. This parameter is fundamental, since if it is long and if the server is dead the whole system will experience problems. If it is small, a living server could be considered dead.
  • zookeeper.connection.timeout.ms: Default value: 6000. This is the maximum time that the client will wait while establishing a connection to Zookeeper.
  • zookeeper.sync.time.ms: Default value: 2000. This is the time a Zookeeper follower can be behind its Zookeeper leader.

See also