Practical Data Quality

By: Robert Hawker

Overview of this book

Poor data quality can lead to increased costs, hinder revenue growth, compromise decision-making, and introduce risk into organizations. It also leaves employees, customers, and suppliers frustrated by every interaction with the organization. Practical Data Quality provides a comprehensive view of managing data quality within your organization, covering everything from business cases through to permanently embedding the improvements you make. Each chapter explains a key element of data quality management, from linking strategy and data together to profiling and designing business rules that reveal bad data. The book outlines a suite of tried-and-tested reports that highlight bad data and allow you to develop a plan to make corrections. Throughout the book, you’ll work with real-world examples and use reusable templates to accelerate your initiatives. By the end of this book, you’ll have gained a clear understanding of every stage of a data quality initiative and be able to drive tangible results for your organization at pace.
Table of Contents (16 chapters)
Part 1 – Getting Started
Part 2 – Understanding and Monitoring the Data That Matters
Part 3 – Improving Data Quality for the Long Term

Managing inactive and duplicate data

One key aspect of data quality not mentioned in this chapter so far is the management of inactive and duplicate records. The best organizations from a data governance perspective have a clear policy to identify and remove records that are no longer actively being used for transactions in the organization or are potentially duplicated.

However, in reality, these organizations represent just the top few percent. Most organizations either do this poorly or do it well only where they see the greatest risk. For example, a business in a heavily regulated industry might archive production records as soon as regulations allow, so that future inspections cannot identify flaws that originated before the regulatory period.

Managing duplicate and inactive data is a critical part of data quality management. I will explain how managing it properly can reduce the remediation workload and keep the effort from being spent on old, unused data.
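
As a minimal sketch of what such a policy might check, the Python below flags records with no transactions inside an assumed two-year window and groups records whose normalized name and postcode match. The record fields, the 730-day threshold, and the matching rule are all illustrative assumptions rather than rules taken from this book. Real duplicate detection usually needs fuzzier matching and human review before anything is removed.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical supplier records; the field names are illustrative assumptions.
records = [
    {"id": 1, "name": "Acme Ltd.", "postcode": "AB1 2CD", "last_transaction": date(2024, 5, 1)},
    {"id": 2, "name": "ACME LTD", "postcode": "ab1 2cd", "last_transaction": date(2019, 3, 15)},
    {"id": 3, "name": "Globex GmbH", "postcode": "10115", "last_transaction": date(2018, 1, 10)},
]


def find_inactive(records, as_of, max_idle_days):
    """Return ids of records whose last transaction is older than the cutoff."""
    cutoff = as_of - timedelta(days=max_idle_days)
    return [r["id"] for r in records if r["last_transaction"] < cutoff]


def match_key(record):
    """Build a crude matching key: lowercase alphanumeric name plus postcode without spaces."""
    name = "".join(ch for ch in record["name"].lower() if ch.isalnum())
    postcode = record["postcode"].replace(" ", "").lower()
    return (name, postcode)


def find_potential_duplicates(records):
    """Group record ids that share a matching key; keep groups with more than one id."""
    groups = defaultdict(list)
    for r in records:
        groups[match_key(r)].append(r["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Assumed policy: a record is inactive after 730 days without a transaction.
inactive_ids = find_inactive(records, as_of=date(2024, 6, 30), max_idle_days=730)
duplicate_groups = find_potential_duplicates(records)

print("Inactive:", inactive_ids)
print("Potential duplicates:", duplicate_groups)
```

In this sample data, record 2 is both inactive and a potential duplicate of record 1. That overlap is why the two checks are worth running together: retiring the inactive record also resolves the duplicate, so there is less to remediate.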

Managing inactive...