
Implementing Cloud Design Patterns for AWS - Second Edition

By: Sean Keery, Clive Harber, Marcus Young

Overview of this book

Whether you're just getting your feet wet in cloud infrastructure or already creating complex systems, this book will guide you through using the patterns to fit your system needs. Starting with patterns that cover basic processes such as source control and infrastructure-as-code, the book goes on to introduce cloud security practices. You'll then cover patterns of availability and scalability and get acquainted with the ephemeral nature of cloud environments. You'll also explore advanced DevOps patterns in operations and maintenance, before focusing on virtualization patterns such as containerization and serverless computing. In the final leg of your journey, this book will delve into data persistence and visualization patterns. You'll get to grips with architectures for processing static and dynamic data, as well as practices for managing streaming data. By the end of this book, you will be able to design applications that are tolerant of underlying hardware failures, resilient against an unexpected influx of data, and easy to manage and replicate.
Table of Contents (20 chapters)
Title Page
Dedication
About Packt
Contributors
Preface
1. Introduction to Amazon Web Services
Index

Event stream processing


Event stream processing can happen in a number of ways on AWS. One of them, Kinesis, we will cover separately; here, we will discuss Elastic MapReduce (EMR).

Elastic MapReduce supports several different processing engines, including Hadoop, Apache Spark, HBase, Presto, and Apache Flink. All of these engines are designed for processing large amounts of data, but they target different workloads: Spark and Flink are suited to real-time (streaming) processing, while Hadoop MapReduce is geared toward batch processing, among many other uses.
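To make the engine choice concrete, the following is a minimal sketch of the parameters you could pass to boto3's `run_job_flow` call to launch a Flink-capable EMR cluster. The cluster name, instance types, and count are illustrative assumptions; the roles shown are the default EMR roles, which are assumed to already exist in the account.

```python
def build_flink_cluster_params(name="flink-demo"):
    """Build an illustrative parameter dict for boto3's emr.run_job_flow().

    All names, instance types, and counts here are placeholder assumptions.
    """
    return {
        "Name": name,
        # Flink is available on EMR release 5.1.0 and later
        "ReleaseLabel": "emr-5.20.0",
        "Applications": [{"Name": "Flink"}],
        "Instances": {
            "MasterInstanceType": "m4.large",
            "SlaveInstanceType": "m4.large",
            "InstanceCount": 3,
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Assumes the default EMR service roles exist in the account
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }
```

To actually launch the cluster (which incurs AWS charges), you would pass this dict to `boto3.client("emr").run_job_flow(**params)`. Swapping `Flink` for `Spark` or `Hadoop` in the `Applications` list selects a different engine.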

Standard EMR has already been covered in a previous chapter.

Using Flink (or another engine) via Terraform is as easy as changing the contents of the applications array, as shown in the following block:

resource "aws_emr_cluster" "cluster" {
  name          = "emr-test-arn"
  # Flink requires an EMR 5.1.0 or later release label
  release_label = "emr-5.20.0"
  applications  = ["Flink"]

  additional_info = <<EOF
{
  "instanceAwsClientConfiguration": {
    "proxyPort": 8099,
    "proxyHost": "myproxy.example.com"
  }
}
EOF

  # Other required arguments (service_role, instance configuration, and so on)
  # are omitted here for brevity.
}
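Once the cluster is running, a long-running Flink session can be started as an EMR step. The sketch below builds such a step definition using `command-runner.jar`, which EMR provides on the master node; the cluster ID in the usage comment is a placeholder.

```python
def flink_session_step():
    """Build an EMR step definition that starts a detached Flink YARN session.

    command-runner.jar is EMR's built-in utility for running commands on the
    master node; flink-yarn-session is the Flink launcher EMR installs.
    """
    return {
        "Name": "flink-session",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            # -d runs the Flink YARN session detached (in the background)
            "Args": ["flink-yarn-session", "-d"],
        },
    }
```

With boto3, this would be submitted as `emr.add_job_flow_steps(JobFlowId="j-XXXXXXXXXXXX", Steps=[flink_session_step()])`, using your real cluster ID.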

Next, set the correct configuration details...