Book Image

Apache Kafka 1.0 Cookbook

By : Alexey Zinoviev, Raúl Estrada
Book Image

Apache Kafka 1.0 Cookbook

By: Alexey Zinoviev, Raúl Estrada

Overview of this book

Apache Kafka provides a unified, high-throughput, low-latency platform to handle real-time data feeds. This book will show you how to use Kafka efficiently, and contains practical solutions to the common problems that developers and administrators usually face while working with it. This practical guide contains easy-to-follow recipes to help you set up, configure, and use Apache Kafka in the best possible manner. You will use Apache Kafka Consumers and Producers to build effective real-time streaming applications. The book covers the recently released Kafka version 1.0, the Confluent Platform and Kafka Streams. The programming aspect covered in the book will teach you how to perform important tasks such as message validation, enrichment and composition.Recipes focusing on optimizing the performance of your Kafka cluster, and integrate Kafka with a variety of third-party tools such as Apache Hadoop, Apache Spark, and Elasticsearch will help ease your day to day collaboration with Kafka greatly. Finally, we cover tasks related to monitoring and securing your Apache Kafka cluster using tools such as Ganglia and Graphite. If you're looking to become the go-to person in your organization when it comes to working with Apache Kafka, this book is the only resource you need to have.
Table of Contents (18 chapters)
Title Page
Credits
About the Author
About the Reviewers
www.PacktPub.com
Customer Feedback
Dedication
Preface

Configuring threads and performance


No parameter should be left by default when the optimal performance is desired. These parameters should be taken into consideration to achieve the best behavior.

Getting ready

With your favorite text editor, open your server.properties file copy.

How to do it...

Adjust the following parameters:

message.max.bytes=1000000
num.network.threads=3
num.io.threads=8
background.threads=10
queued.max.requests=500
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
num.partitions=1

How it works…

With these changes, the network and performance configurations have been set to achieve optimum levels for the application. Again, every system is different, and you might need to experiment a little to come up with the optimal one for a specific configuration.

Here is an explanation of every parameter:

  • message.max.bytes: Default value: 10 00 000. This is the maximum size, in bytes, for each message. This is designed to prevent any producer from sending extra large messages and saturating the consumers.
  • num.network.threads: Default value: 3. This is the number of simultaneous threads running to handle a network's request. If the system has too many simultaneous requests, consider increasing this value.
  • num.io.threads: Default value: 8. This is the number of threads for Input Output operations. This value should be at least the number of present processors.
  • background.threads: Default value: 10. This is the number of threads for background jobs. For example, old log files deletion.
  • queued.max.requests: Default value: 500. This is the number of messages queued while the other messages are processed by the I/O threads. Remember, when the queue is full, the network threads will not accept more requests. If your application has erratic loads, set this to a value at which it will not throttle.
  • socket.send.buffer.bytes: Default value: 102 400. This is SO_SNDBUFF buffer size, used for socket connections.
  • socket.receive.buffer.bytes: Default value: 102 400. This is SO_RCVBUFF buffer size, also used for socket connections.
  • socket.request.max.bytes: Default value: 104 857 600. This is the maximum request size, in bytes, that the server can accept. It should always be smaller than the Java heap size.
  • num.partitions: Default value: 1. This is the number of default partitions of a topic, without giving any partition size.

There's more...

As with everything that runs on the JVM, the Java installation should be tuned to achieve optimal performance. This includes the settings for heap, socket size, memory parameters, and garbage collection.