Driving Data Quality with Data Contracts

By: Andrew Jones

Overview of this book

Despite the passage of time and the evolution of technology and architecture, the challenges we face in building data platforms persist. Our data often remains unreliable and untrusted, and it fails to deliver the promised value. With Driving Data Quality with Data Contracts, you’ll discover the potential of data contracts to transform how you build your data platforms and finally overcome these enduring problems. You’ll learn how establishing contracts as the interface allows you to explicitly assign responsibility and accountability for the data to those who know it best, the data generators, and give them the autonomy to generate and manage data as required. The book will show you how data contracts ensure that consumers get quality data with clearly defined expectations, enabling them to build on that data with confidence to deliver valuable analytics, performant ML models, and trusted data-driven products. By the end of this book, you’ll have gained a comprehensive understanding of how data contracts can revolutionize your organization’s data culture and provide a competitive advantage by unlocking the real value within your data.

Table of Contents (16 chapters)

Part 1: Why Data Contracts?
Part 2: Driving Data Culture Change with Data Contracts
Part 3: Designing and Implementing a Data Architecture Based on Data Contracts

Designing a data contract

We’ll start by looking at how to design a data contract. This can be broken down into four steps:

  1. Identifying the purpose.
  2. Considering the trade-offs.
  3. Defining the data contract.
  4. Deploying the data contract.

However, designing a data contract is an iterative process, and you may need to revisit and refine these steps as more information is gathered through discussions between the data generators and data consumers.

With that in mind, let’s look at each step in turn.

Identifying the purpose

The first step is to identify the purpose of the data product for which you are defining a data contract. Who is this data for, and how will they use it? What problems will it solve? What business value will it drive?
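
One way to keep the answers to these questions visible is to record them alongside the draft contract. The following is only a minimal sketch in Python: the ContractPurpose class, its field names, and the example values are all hypothetical and are not taken from any particular data contract format. It simply shows how unanswered questions could be surfaced before moving on to the next step.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContractPurpose:
    """Hypothetical record of the answers to the purpose questions."""

    data_product: str
    consumers: List[str]        # Who is this data for?
    intended_use: str           # How will they use it?
    problems_solved: List[str]  # What problems will it solve?
    business_value: str         # What business value will it drive?

    def open_questions(self) -> List[str]:
        """Return the names of the fields that have not been filled in yet."""
        return [name for name, value in vars(self).items() if not value]


# Illustrative values only; a real draft would come out of the discussion
# between the data generators and the data consumers.
draft = ContractPurpose(
    data_product="orders",
    consumers=["finance-analytics"],
    intended_use="monthly revenue reporting",
    problems_solved=[],
    business_value="",
)

print(draft.open_questions())
```

Here the draft still has two empty fields, so the list of open questions can serve as an agenda for the next conversation between the two groups.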

Answering these questions will naturally start a discussion between the data generators and the data consumers, and as we’ve been discussing throughout this book, bringing these groups of people together...