Apache Beam is an open source, unified programming model for defining and executing data processing pipelines, including ETL, batch, and stream processing. This recipe shows how to read from Kafka with Apache Beam.
To install Apache Beam, follow the quickstart instructions at https://beam.apache.org/get-started/quickstart-py/ (this page covers the Python SDK; the example in this recipe uses the Java SDK).
The following code shows how to write a Beam pipeline to read from Kafka. The example illustrates various options for configuring the Beam source:
pipeline
    .apply(KafkaIO.read()
        .withBootstrapServers("broker_1:9092,broker_2:9092")
        .withTopics(ImmutableList.of("topic_1", "topic_2"))
        .withKeyCoder(BigEndianLongCoder.of())
        .withValueCoder(StringUtf8Coder.of())
        .updateConsumerProperties(
            ImmutableMap.of("receive.buffer.bytes", 1024 * 1024))
        .withTimestampFn(new CustomTypestampFunction())
        .withWatermarkFn...
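The key and value coders in the snippet above determine how the raw bytes of each Kafka record are interpreted. The following is a minimal sketch, using only the Java standard library rather than Beam itself, of the byte layouts those two coders correspond to: an 8-byte big-endian long for the key and UTF-8 bytes for the value. It assumes the value is written as plain UTF-8 with no length prefix. The class KafkaRecordBytes and its methods are hypothetical names introduced for illustration; they are not part of Beam or Kafka.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper (not part of Beam or Kafka): shows the byte layouts
// that a big-endian long key coder and a UTF-8 string value coder expect.
final class KafkaRecordBytes {

    // Encode a key as 8 bytes, most significant byte first.
    static byte[] encodeKey(long key) {
        return ByteBuffer.allocate(Long.BYTES)
                .order(ByteOrder.BIG_ENDIAN)
                .putLong(key)
                .array();
    }

    // Decode an 8-byte big-endian key; reject any other length.
    static long decodeKey(byte[] bytes) {
        if (bytes.length != Long.BYTES) {
            throw new IllegalArgumentException(
                    "expected " + Long.BYTES + " bytes, got " + bytes.length);
        }
        return ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN).getLong();
    }

    // Encode a value as plain UTF-8 bytes (assumption: no length prefix).
    static byte[] encodeValue(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    // Decode a value from plain UTF-8 bytes.
    static String decodeValue(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] key = encodeKey(42L);
        System.out.println(Arrays.toString(key)); // prints [0, 0, 0, 0, 0, 0, 0, 42]
        System.out.println(decodeKey(key));       // prints 42
        System.out.println(decodeValue(encodeValue("hello"))); // prints hello
    }
}
```

If the producer writes keys or values in a different format (for example, keys as decimal strings), the coders configured on the source would need to change to match; otherwise decoding fails or yields wrong values.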