Scalable Data Streaming with Amazon Kinesis

By: Tarik Makota, Brian Maguire, Danny Gagne, Rajeev Chakrabarti

Overview of this book

Amazon Kinesis is a collection of secure, serverless, durable, and highly available purpose-built data streaming services. These services provide APIs and client SDKs that enable you to produce and consume data at scale. Scalable Data Streaming with Amazon Kinesis begins with a quick overview of the core concepts of data streams, along with the essentials of the Amazon Kinesis landscape. You'll then explore the requirements of the use case that runs through the book to help you get started, and cover the key pain points encountered in the data stream life cycle. As you advance, you'll get to grips with the architectural components of Kinesis, understand how they are configured to build data pipelines, and delve into the applications that connect to them for consumption and processing. You'll also build a Kinesis data pipeline from scratch and learn how to implement and apply practical solutions. Moving on, you'll learn how to configure Kinesis on a cloud platform. Finally, you'll learn how other AWS services can be integrated with Kinesis. These include Amazon Redshift, Amazon DynamoDB, Amazon S3, and Amazon Elasticsearch Service, as well as third-party applications such as Splunk. By the end of this AWS book, you'll be able to build and deploy your own Kinesis data pipelines with Kinesis Data Streams (KDS), Kinesis Data Firehose (KFH), Kinesis Video Streams (KVS), and Kinesis Data Analytics (KDA).
Table of Contents (13 chapters)

Section 1: Introduction to Data Streaming and Amazon Kinesis
Section 2: Deep Dive into Kinesis
Section 3: Integrations

Creating operational insights using Apache Flink

Amazon Kinesis Data Analytics (KDA) for Apache Flink lets us go beyond SQL: we write our analytics applications in Java or Scala against Flink's DataStream API. In this section, we focus on KDA for Flink.

Note

If you are not familiar with Apache Flink, we recommend you first go through the Flink overview: https://ci.apache.org/projects/flink/flink-docs-release-1.11/learn-flink/.

Apache Flink deserves a book of its own, so here we cover only how to run Flink applications on KDA.
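
To give a feel for what writing analytics logic in Java rather than SQL looks like, here is a minimal plain-Java sketch. It does not use Flink or the DataStream API, and it is not taken from the book's examples. The `SlowRequestCounter` class, the `Request` event shape, and the latency threshold are all hypothetical names chosen for illustration. It counts slow requests per service in fixed-size (tumbling) time windows, which is the kind of per-event logic that would sit inside an operator in a real Flink application.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration only; none of these names come from Flink or KDA. */
final class SlowRequestCounter {

    /** Hypothetical event: which service answered, when, and how slowly. */
    static final class Request {
        final String service;
        final long timestampMillis;
        final long latencyMillis;

        Request(String service, long timestampMillis, long latencyMillis) {
            this.service = service;
            this.timestampMillis = timestampMillis;
            this.latencyMillis = latencyMillis;
        }
    }

    /** Start of the fixed-size (tumbling) window that contains the timestamp. */
    static long windowStart(long timestampMillis, long windowSizeMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, windowSizeMillis);
    }

    /**
     * Counts requests slower than thresholdMillis, keyed by "service@windowStart".
     * A TreeMap keeps the result in a deterministic, sorted order.
     */
    static Map<String, Integer> countSlow(List<Request> requests,
                                          long windowSizeMillis,
                                          long thresholdMillis) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Request r : requests) {
            if (r.latencyMillis <= thresholdMillis) {
                continue; // fast enough, so it is not counted
            }
            String key = r.service + "@" + windowStart(r.timestampMillis, windowSizeMillis);
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }
}
```

For example, with one-minute windows and a 500 ms threshold, three slow `checkout` requests at timestamps 1,000, 59,000, and 61,000 ms produce a count of 2 under `checkout@0` and 1 under `checkout@60000`. The sketch processes a finite list; a streaming engine such as Flink applies comparable logic continuously to an unbounded stream and manages the window state for us.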

When we create applications with KDA for Flink, we follow the same pattern as we did with KDA SQL, with a number of differences outlined in the following table:

Figure 6.9 – KDA SQL and KDA Flink comparison

KDA Flink applications come with more options and flexibility, which can be a determining factor in selecting which engine to use. For example, it is common for companies...