Scalable Data Architecture with Java

By: Sinchan Banerjee

Overview of this book

Java architectural patterns and tools help architects build reliable, scalable, and secure data engineering solutions that collect, manipulate, and publish data. This book will help you make the most of the available approaches to architecting data solutions, with clear and actionable advice from an expert. You’ll start with an overview of data architecture, exploring the responsibilities of a Java data architect and learning about various data formats, data storage options, databases, and data application platforms, as well as how to choose among them. Next, you’ll learn how to architect batch and real-time data processing pipelines. You’ll also get to grips with various Java data processing patterns before progressing to data security and governance. The later chapters will show you how to publish Data as a Service and how to architect it. Finally, you’ll focus on how to evaluate and recommend an architecture by developing performance benchmarks, estimations, and various decision metrics. By the end of this book, you’ll be able to orchestrate data architecture solutions using Java and related technologies, and to evaluate and present the most suitable solution to your clients.
Table of Contents (19 chapters)
1
Section 1 – Foundation of Data Systems
5
Section 2 – Building Data Processing Pipelines
11
Section 3 – Enabling Data as a Service
14
Section 4 – Choosing Suitable Data Architecture

Index

As this ebook edition doesn't have fixed pagination, the page numbers below are hyperlinked for reference only, based on the printed edition of this book.

A

Access Control List (ACL) 232

agents

    Docker agent 65

    Kubelet agent 65

Aggregate Report 313

Amazon Elastic Compute Cloud (EC2) 75

Amazon Elastic Container Service (ECS) 75

Amazon Elastic Kubernetes Service (EKS) 75

Amazon Resource Name (ARN) 252

Apache Hadoop 69

Apache Kafka

    setting up, on local machine 154, 155

Apache Mesos 71

Apache Spark 71

Apache Superset

    about 12

    URL 13

Apache YARN 71

API management

    about 263

    benefits 263, 264

    enabling, with AWS API Gateway 264-273

architectural decision matrix

    creating 337

architecture

    consideration, for selecting storage 122, 123

    cost factor, in processing layer 125-127

    solution, building 122

    storage based on cost, determining 123-125

artificial intelligence (AI) 75...