Kafka is a distributed, partitioned, and replicated commit log service. Put simply, it is a distributed messaging server. Kafka organizes its message feed into categories called topics. An example of a topic is the ticker symbol of a company you would like to get news about, such as CSCO for Cisco.
Processes that produce messages are called producers, and those that consume messages are called consumers. In traditional messaging, the messaging service has one central messaging server, also called a broker. Since Kafka is a distributed messaging service, it has a cluster of brokers that functionally acts as a single Kafka broker, as shown here:
For each topic, Kafka maintains a partitioned log. This partitioned log consists of one or more partitions spread across the cluster, as shown in the following figure:
Kafka borrows a lot of concepts from Hadoop and other big data frameworks. The concept of a partition is very similar to the concept of InputSplit in Hadoop...
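To make the idea of a topic's partitioned log more concrete, here is a minimal in-memory sketch in Python. It is not Kafka code and does not use any Kafka client library. The class ToyTopic, its methods, the key "wire-a", and the CRC32-based choice of partition are all assumptions made for illustration; real Kafka clients choose partitions with their own partitioner, and real partitions live on brokers across the cluster rather than in a single process.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class ToyTopic:
    """Hypothetical in-memory stand-in for one topic's partitioned log."""
    name: str
    num_partitions: int
    partitions: list = field(init=False)

    def __post_init__(self):
        # One append-only list per partition.
        self.partitions = [[] for _ in range(self.num_partitions)]

    def partition_for(self, key: str) -> int:
        # Assumption: a stable CRC32 hash of the key, modulo the partition count.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions

    def append(self, key: str, value: str) -> tuple:
        """Producer side: append a message and return (partition, offset)."""
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1

    def read(self, partition: int, offset: int) -> list:
        """Consumer side: return every message from `offset` onward."""
        return self.partitions[partition][offset:]


# Usage: the topic is the ticker symbol; the key is a made-up news source.
topic = ToyTopic(name="CSCO", num_partitions=3)
partition, offset = topic.append("wire-a", "headline 1")
topic.append("wire-a", "headline 2")
print(topic.read(partition, offset))
```

Because the same key always hashes to the same partition in this sketch, both messages land in one partition in the order they were appended, and a consumer reading from the returned offset sees them in that order.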