Scalable Data Architecture with Java

By: Sinchan Banerjee

Overview of this book

Java architectural patterns and tools help architects build reliable, scalable, and secure data engineering solutions that collect, manipulate, and publish data. This book will help you make the most of the available approaches to architecting data solutions, with clear and actionable advice from an expert. You’ll start with an overview of data architecture, exploring the responsibilities of a Java data architect and learning about various data formats, data storage options, databases, and data application platforms, as well as how to choose among them. Next, you’ll understand how to architect batch and real-time data processing pipelines. You’ll also get to grips with the various Java data processing patterns before progressing to data security and governance. The later chapters will show you how to publish Data as a Service and how to architect it. Finally, you’ll focus on how to evaluate and recommend an architecture by developing performance benchmarks, estimations, and various decision metrics. By the end of this book, you’ll be able to successfully orchestrate data architecture solutions using Java and related technologies, and to evaluate and present the most suitable solution to your clients.
Table of Contents (19 chapters)
Section 1 – Foundation of Data Systems
Section 2 – Building Data Processing Pipelines
Section 3 – Enabling Data as a Service
Section 4 – Choosing Suitable Data Architecture

Data-driven architectural decisions to mitigate risk

A decision matrix helps us evaluate the desirability of an architecture. However, the architectural option with the highest desirability score is not always the right choice. Sometimes, an architecture can be selected only if it meets a minimum threshold score on every criterion. Such scenarios can be handled by a spider chart.
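The minimum-threshold idea can be illustrated with a minimal Java sketch. Everything in it is an assumption made for illustration: the class name `ThresholdCheck`, the criteria names, the scores, and the use of a plain unweighted sum as a stand-in for the desirability score.

```java
import java.util.Map;

// Hypothetical helper: all names, criteria, and scores are illustrative assumptions.
class ThresholdCheck {

    /** Returns true only if every criterion that has a minimum meets or exceeds it. */
    static boolean meetsAllThresholds(Map<String, Double> scores,
                                      Map<String, Double> minimums) {
        for (Map.Entry<String, Double> min : minimums.entrySet()) {
            Double score = scores.get(min.getKey());
            // A missing score is treated as a failed criterion.
            if (score == null || score < min.getValue()) {
                return false;
            }
        }
        return true;
    }

    /** Unweighted sum of criterion scores, used here as a stand-in desirability score. */
    static double totalScore(Map<String, Double> scores) {
        double total = 0.0;
        for (double s : scores.values()) {
            total += s;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Double> minimums =
                Map.of("scalability", 5.0, "security", 5.0, "cost", 5.0);
        Map<String, Double> optionA =
                Map.of("scalability", 9.0, "security", 3.0, "cost", 9.0);
        Map<String, Double> optionB =
                Map.of("scalability", 7.0, "security", 6.0, "cost", 6.0);

        System.out.println("A total=" + totalScore(optionA)
                + " passes=" + meetsAllThresholds(optionA, minimums));
        System.out.println("B total=" + totalScore(optionB)
                + " passes=" + meetsAllThresholds(optionB, minimums));
    }
}
```

In this made-up data, option A has the higher total but falls below the security minimum, so only option B remains eligible.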

A spider chart, also known as a radar chart, is often used to display data across multiple dimensions. Each dimension is represented by an axis. Usually, the dimensions are quantitative and normalized to match a particular range. Then, each option is plotted against all the dimensions to create a closed polygon structure, as shown in the following diagram:

Figure 12.6 – Spider or radar chart
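As a rough sketch of the geometry behind such a chart, the following Java code normalizes raw values to a common range and computes the polygon vertices for one option. The min-max normalization to [0, 1], the class name `RadarGeometry`, and the choice to place the first axis along the positive x-axis are assumptions for illustration, not details taken from the book.

```java
// Hypothetical helper: names and conventions are illustrative assumptions.
class RadarGeometry {

    /** Min-max normalization of a raw value to the range [0, 1]. */
    static double normalize(double value, double min, double max) {
        if (max <= min) {
            throw new IllegalArgumentException("max must be greater than min");
        }
        return (value - min) / (max - min);
    }

    /**
     * Computes one (x, y) vertex per dimension. Axis i of n is placed at angle
     * 2*pi*i/n, and the normalized value is the distance from the center.
     * Joining the vertices in order, and the last back to the first, gives the
     * closed polygon for one option.
     */
    static double[][] vertices(double[] normalized) {
        int n = normalized.length;
        double[][] points = new double[n][2];
        for (int i = 0; i < n; i++) {
            double angle = 2 * Math.PI * i / n;
            points[i][0] = normalized[i] * Math.cos(angle);
            points[i][1] = normalized[i] * Math.sin(angle);
        }
        return points;
    }

    public static void main(String[] args) {
        // Made-up raw scores on a 0-10 scale for four dimensions.
        double[] raw = {10, 5, 10, 5};
        double[] normalized = new double[raw.length];
        for (int i = 0; i < raw.length; i++) {
            normalized[i] = normalize(raw[i], 0, 10);
        }
        for (double[] p : vertices(normalized)) {
            System.out.printf("(%.2f, %.2f)%n", p[0], p[1]);
        }
    }
}
```

A charting library would normally handle the drawing itself. The sketch only shows why the dimensions need a shared scale before the options can be compared on the same chart.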

In our case, each criterion for making an architectural decision can be considered a dimension. Also, each architectural alternative is plotted as a graph on the radar chart. Let...