Principles of Data Fabric

By: Sonia Mezzetta

Overview of this book

Data can be found everywhere, from cloud environments and relational and non-relational databases to data lakes, data warehouses, and data lakehouses. Data management practices can be standardized across the cloud, on-premises, and edge devices with Data Fabric, a powerful architecture that creates a unified view of data. This book will enable you to design a Data Fabric solution by addressing all the key aspects that need to be considered. The book begins by introducing you to Data Fabric architecture, why you need it, and how it relates to other strategic data management frameworks. You’ll then quickly progress to grasping the principles of DataOps, an operational model for Data Fabric architecture. The next set of chapters will show you how to combine Data Fabric with DataOps and Data Mesh and how to get the most out of them working together. After that, you’ll discover how to design Data Integration, Data Governance, and Self-Service analytics architecture. The book ends with technical architecture to implement distributed data management and regulatory compliance, followed by industry best practices and principles. By the end of this book, you will have a clear understanding of what Data Fabric is and what the architecture looks like, along with the level of effort that goes into designing a Data Fabric solution.
Table of Contents (16 chapters)

Part 1: The Building Blocks
Part 2: Complementary Data Management Approaches and Strategies
Part 3: Designing and Realizing Data Fabric Architecture

What is Data Fabric?

Data Fabric is a distributed and composable architecture that is metadata- and event-driven. It’s use-case agnostic and excels in managing and governing distributed data. It integrates dispersed data with automation, strong Data Governance, protection, and security. Data Fabric focuses on the Self-Service delivery of governed data.

Data Fabric does not require data to be migrated into a centralized data storage layer, nor converted to a specific data format or database type. It can support a diverse set of data management styles and use cases across industries, such as a 360-degree view of a customer, regulatory compliance, cloud migration, data democratization, and data analytics.
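To make the idea of a unified view over data left in place more concrete, here is a minimal sketch in Python. It is an illustration only, not the book's implementation: the CSV text, the in-memory SQLite table, and the function names (`read_customers`, `read_orders`, `customer_360`) are all hypothetical stand-ins for real source systems.

```python
import csv
import io
import sqlite3
from typing import Dict, Iterator, List

# Hypothetical source 1: customer records kept as CSV text.
CSV_TEXT = "customer_id,name\n1,Ada\n2,Grace\n"


def make_orders_db() -> sqlite3.Connection:
    """Hypothetical source 2: orders kept in a relational database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, total REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 30.0), (1, 12.5), (2, 99.0)],
    )
    return conn


def read_customers() -> Iterator[dict]:
    # Read the CSV where it lives; nothing is copied into another store.
    for row in csv.DictReader(io.StringIO(CSV_TEXT)):
        yield {"customer_id": int(row["customer_id"]), "name": row["name"]}


def read_orders(conn: sqlite3.Connection) -> Iterator[dict]:
    # Query the database where it lives, in its own format.
    for customer_id, total in conn.execute(
        "SELECT customer_id, total FROM orders"
    ):
        yield {"customer_id": customer_id, "total": total}


def customer_360(conn: sqlite3.Connection) -> List[dict]:
    """Combine both sources at read time into one logical view."""
    totals: Dict[int, float] = {}
    for order in read_orders(conn):
        cid = order["customer_id"]
        totals[cid] = totals.get(cid, 0.0) + order["total"]
    return [
        {**c, "lifetime_total": totals.get(c["customer_id"], 0.0)}
        for c in read_customers()
    ]


VIEW = customer_360(make_orders_db())
```

The two sources keep their own formats and locations, and the combined view exists only at read time. A real Data Fabric would add metadata, governance, and automation around this step.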

In the next section, we’ll touch on the characteristics of Data Fabric.

What Data Fabric is

Data Fabric is a composable architecture made up of different tools, technologies, and systems. It has an active metadata and event-driven design that automates Data Integration while achieving interoperability. Data Governance, Data Privacy, Data Protection, and Data Security are paramount both to its design and to enabling Self-Service data sharing. The following figure summarizes the different characteristics that constitute a Data Fabric design.

Figure 1.1 – Data Fabric characteristics

Data Fabric takes a proactive and intelligent approach to data management. It monitors and evaluates data operations to learn and suggest future improvements, leading to greater productivity and better-informed decision-making. It approaches data management with flexibility, scalability, automation, and governance in mind and supports multiple data management styles. What distinguishes Data Fabric architecture from others is that it embeds Data Governance into the data life cycle by design, using metadata as the foundation. Data Fabric focuses on business controls with an emphasis on robust and efficient data interoperability.
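One way to picture governance embedded through metadata is a read path that consults column tags before returning data. The sketch below is an assumption-laden illustration: the `DatasetMetadata` class, the `pii` tag, the `data_steward` role, and the masking rule are all hypothetical and not taken from the book or from any specific product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record: dataset name plus tags per column."""
    name: str
    column_tags: Dict[str, Set[str]] = field(default_factory=dict)


# Hypothetical policy: columns carrying any of these tags are masked.
MASKED_TAGS = {"pii"}


def governed_read(
    rows: List[dict], metadata: DatasetMetadata, user_roles: Set[str]
) -> List[dict]:
    """Return rows with tagged columns masked unless the user is a steward."""
    if "data_steward" in user_roles:
        return [dict(row) for row in rows]
    masked_cols = {
        col for col, tags in metadata.column_tags.items() if tags & MASKED_TAGS
    }
    return [
        {key: ("***" if key in masked_cols else value) for key, value in row.items()}
        for row in rows
    ]


rows = [{"name": "Ada", "email": "ada@example.com"}]
meta = DatasetMetadata("crm.customers", {"email": {"pii"}})

analyst_view = governed_read(rows, meta, {"analyst"})
steward_view = governed_read(rows, meta, {"data_steward"})
```

Because the policy is driven by tags in the metadata rather than by code written for each dataset, tagging a new column is enough to bring it under the same control.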

In the next section, we will clarify what is not representative of a Data Fabric design.

What Data Fabric is not

Let’s understand what Data Fabric is not:

  • It is not a single technology, such as data virtualization. While data virtualization is a key Data Integration technology in Data Fabric, the architecture supports several more technologies, such as data replication, ETL/ELT, and streaming.
  • It is not a single tool like a data catalog, and it doesn’t have to be a single data storage system like a data warehouse. It represents a diverse set of tools, technologies, and storage systems that work together in a connected ecosystem via a distributed data architecture, with active metadata as the glue.
  • It doesn’t just support centralized data management but also federated and decentralized data management. It excels in connecting distributed data.
  • Data Fabric is not the same as Data Mesh. They are different data architectures that tackle the complexities of distributed data management using different but complementary approaches. We will cover this topic in more depth in Chapter 3, Choosing between Data Fabric and Data Mesh.
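The phrase "active metadata as the glue" in the list above can be sketched as a small event-driven catalog: when a source reports a change, the catalog updates itself and notifies whichever tools have subscribed. This is a simplified illustration under stated assumptions. The `MetadataCatalog` class, the `schema_changed` event type, and the dataset name are hypothetical, and a real deployment would rely on a message broker and a full catalog product rather than an in-process dictionary.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List


class MetadataCatalog:
    """Hypothetical in-process catalog that reacts to metadata events."""

    def __init__(self) -> None:
        self.schemas: Dict[str, List[str]] = {}
        self._subscribers: DefaultDict[str, List[Callable[[dict], None]]] = (
            defaultdict(list)
        )

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        # Any tool in the ecosystem can register interest in an event type.
        self._subscribers[event_type].append(handler)

    def publish(self, event: dict) -> None:
        # Keep the catalog current first, then notify the subscribers.
        if event["type"] == "schema_changed":
            self.schemas[event["dataset"]] = list(event["columns"])
        for handler in self._subscribers[event["type"]]:
            handler(event)


catalog = MetadataCatalog()

# Stand-in for a downstream integration job that must react to schema changes.
refreshed: List[str] = []
catalog.subscribe("schema_changed", lambda event: refreshed.append(event["dataset"]))

catalog.publish(
    {
        "type": "schema_changed",
        "dataset": "crm.customers",
        "columns": ["customer_id", "name", "email"],
    }
)
```

The catalog here is not a passive inventory: a change in one system propagates to the others without anyone coordinating it by hand, which is the sense in which metadata connects otherwise separate tools.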

The following diagram summarizes what Data Fabric architecture is not:

Figure 1.2 – What Data Fabric is not

We have discussed in detail what defines Data Fabric and what does not. In the next section, we will discuss why Data Fabric is important.