Learning Apache Kafka - Second Edition

By: Nishant Garg

Overview of this book

Kafka is one of those systems that is very simple to describe at a high level but has an incredible depth of technical detail when you dig deeper.

Learning Apache Kafka Second Edition provides you with step-by-step, practical examples that help you take advantage of the real power of Kafka and handle hundreds of megabytes of messages per second from multiple clients. This book teaches you everything you need to know, from setting up Kafka clusters to understanding the basic building blocks: producers, brokers, and consumers. Once you are set up, you will explore additional settings and configuration changes to achieve increasingly complex goals. You will also learn how Kafka is designed internally and which configurations make it more effective. Finally, you will learn how Kafka works with other tools such as Hadoop and Storm.

Preface

This book helps you get familiar with Apache Kafka and tackle the challenges of consuming millions of messages in publisher-subscriber architectures. It aims to get you started programming with Kafka so that you have a solid foundation for diving deep into different types of implementations and integrations for Kafka producers and consumers.

In addition to explaining Apache Kafka, we spend a chapter exploring Kafka integration with other technologies such as Apache Hadoop and Apache Storm. Our goal is to give you an understanding not just of what Apache Kafka is, but also of how to use it as part of your broader technical infrastructure. Finally, we walk you through operationalizing Kafka, including its administration.

What this book covers

Chapter 1, Introducing Kafka, discusses how organizations are realizing the real value of data and evolving the mechanisms for collecting and processing it. It also describes how to install and build Kafka 0.8.x using different versions of Scala.

Chapter 2, Setting Up a Kafka Cluster, describes the steps required to set up a single- or multi-broker Kafka cluster and shares the Kafka broker properties list.

Chapter 3, Kafka Design, discusses the design concepts used to build the solid foundation for Kafka. It also talks about how Kafka handles message compression and replication in detail.

Chapter 4, Writing Producers, provides detailed information about how to write basic producers and some advanced-level Java producers that use message partitioning.

Chapter 5, Writing Consumers, provides detailed information about how to write basic consumers and some advanced-level Java consumers that consume messages from the partitions.

Chapter 6, Kafka Integrations, provides a short introduction to both Storm and Hadoop and discusses how Kafka integration works for both Storm and Hadoop to address real-time and batch processing needs.

Chapter 7, Operationalizing Kafka, describes the Kafka tools required for cluster administration and cluster mirroring, and explains how to integrate Kafka with Camus, Apache Camel, Amazon Cloud, and so on.

What you need for this book

In the simplest case, a single Linux-based (CentOS 6.x) machine with JDK 1.6 installed provides a platform for exploring almost all the exercises in this book. We assume you are familiar with the Linux command line, so any modern distribution will suffice.

Some of the examples need multiple machines to see things working, so you will require access to at least three such hosts; virtual machines are fine for learning and exploration.

Because we also discuss big data technologies such as Hadoop and Storm, you will generally need a place to run your Hadoop and Storm clusters.

Who this book is for

This book is for those who want to know about Apache Kafka at a hands-on level; the key audience is those with software development experience but no prior exposure to Apache Kafka or similar technologies.

This book is also for enterprise application developers and big data enthusiasts who have worked with other publisher-subscriber-based systems and now want to explore Apache Kafka as a futuristic scalable solution.

Conventions

In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.

Code words in text are shown as follows: "Download the jdk-7u67-linux-x64.rpm release from Oracle's website."

A block of code is set as follows:

String messageStr = new String("Hello from Java Producer");
KeyedMessage<Integer, String> data = new KeyedMessage<Integer, String>(topic, messageStr);
producer.send(data);

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

Properties props = new Properties();
props.put("metadata.broker.list","localhost:9092");
props.put("serializer.class","kafka.serializer.StringEncoder");
props.put("request.required.acks", "1");
ProducerConfig config = new ProducerConfig(props);
Producer<Integer, String> producer = new Producer<Integer, String>(config);

Any command line input or output is written as follows:

[root@localhost kafka-0.8]# java SimpleProducer kafkatopic Hello_There

New terms and important words are shown in bold.

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

To send us general feedback, simply send an e-mail to <[email protected]>, and mention the book title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.

Piracy

Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors, and our ability to bring you valuable content.

Questions

You can contact us at <[email protected]> if you are having a problem with any aspect of the book, and we will do our best to address it.
