Driving Data Quality with Data Contracts

By: Andrew Jones

Overview of this book

Despite the passage of time and the evolution of technology and architecture, the challenges we face in building data platforms persist. Our data often remains unreliable, is not trusted, and fails to deliver the promised value. With Driving Data Quality with Data Contracts, you’ll discover the potential of data contracts to transform how you build your data platforms, finally overcoming these enduring problems. You’ll learn how establishing contracts as the interface allows you to explicitly assign responsibility and accountability for the data to those who know it best, the data generators, and give them the autonomy to generate and manage data as required. The book will show you how data contracts ensure that consumers get quality data with clearly defined expectations, enabling them to build on that data with confidence to deliver valuable analytics, performant ML models, and trusted data-driven products. By the end of this book, you’ll have gained a comprehensive understanding of how data contracts can revolutionize your organization’s data culture and provide a competitive advantage by unlocking the real value within your data.

Table of Contents (16 chapters)

  • Part 1: Why Data Contracts?
  • Part 2: Driving Data Culture Change with Data Contracts
  • Part 3: Designing and Implementing a Data Architecture Based on Data Contracts

Further reading

For more information on the topics covered in this chapter, please see the following resources:

  • From Data Warehouse to Data Lakehouse: The Evolution of Data Analytics Platforms by Henry Golas: https://cloudian.com/blog/from-data-warehouse-to-data-lakehouse-the-evolution-of-data-analytics-platforms/
  • The Rise of ELT for DW Data Integration by Chris Tabb: https://www.leit-data.com/the-rise-of-elt-for-dw-data-integration/
  • The Modern Data Stack: Past, Present, and Future by Tristan Handy: https://www.getdbt.com/blog/future-of-the-modern-data-stack/
  • DBLog: A Generic Change-Data-Capture Framework on the Netflix Technology Blog: https://netflixtechblog.com/dblog-a-generic-change-data-capture-framework-69351fb9099b
  • Capturing Data Evolution in a Service-Oriented Architecture by Jad Abi-Samra on the Airbnb Tech Blog: https://medium.com/airbnb-engineering/capturing-data-evolution-in-a-service-oriented-architecture-72f7c643ee6f
  • Data Systems Tend Towards Production by Ian Macomber: https://ian-macomber.medium.com/data-systems-tend-towards-production-be5a86f65561
  • How Netflix used big data and analytics to generate billions by Michael Dixon: https://seleritysas.com/blog/2019/04/05/how-netflix-used-big-data-and-analytics-to-generate-billions/
  • How Uber uses data science to reinvent transportation? by ProjectPro: https://www.projectpro.io/article/how-uber-uses-data-science-to-reinvent-transportation/290
  • How DoorDash built the greatest go-to-market playbook ever by Lars Kamp: https://findingdistribution.substack.com/p/how-doordash-built-the-greatest-go
  • Why Retailers Fail to Adopt Advanced Data Analytics by Nicole DeHoratius, Andrés Musalem, and Robert Rooderkerk: https://hbr.org/2023/02/why-retailers-fail-to-adopt-advanced-data-analytics
  • Companies are losing revenue opportunities and customers because of bad data practices by Bob Violino: https://www.zdnet.com/article/companies-are-losing-revenue-opportunities-and-customers-because-of-bad-data-practices/