Building Modern Data Applications Using Databricks Lakehouse
In this chapter, we covered a range of topics surrounding data quality in our lakehouse. We learned how the integrity of a table can be enforced using NOT NULL and CHECK constraints in Delta Lake. We also defined relationships between the tables in our lakehouse using PRIMARY KEY and FOREIGN KEY constraints. Next, we saw how we could enforce primary key uniqueness across our Delta tables by using views to validate the data in those tables. We also saw how easy it is to change the behavior of our data pipeline when incoming rows violate data quality constraints, allowing data engineering teams to react when poor-quality data threatens to break downstream processes. Finally, we worked through a practical example of using expectations to create a conditional data flow in our pipeline, allowing our data stewards to quarantine and correct data that doesn’t meet the expected data quality.
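As a quick refresher on the quarantine idea, here is a minimal sketch in plain Python. It is not the pipeline API used in the chapter; it only illustrates the routing logic, assuming each expectation is a named rule that a row either passes or fails. The rule names (`valid_trip_id`, `positive_fare`), the row fields, and the `split_by_expectations` helper are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Expectation = Callable[[Row], bool]

# Hypothetical expectations: each name maps to a rule a row must satisfy.
EXPECTATIONS: Dict[str, Expectation] = {
    "valid_trip_id": lambda r: r.get("trip_id") is not None,
    "positive_fare": lambda r: isinstance(r.get("fare"), (int, float)) and r["fare"] > 0,
}


def split_by_expectations(
    rows: List[Row], expectations: Dict[str, Expectation]
) -> Tuple[List[Row], List[Row]]:
    """Route each row to the clean set or the quarantine set."""
    clean: List[Row] = []
    quarantined: List[Row] = []
    for row in rows:
        # Collect the names of every expectation this row violates.
        failed = [name for name, check in expectations.items() if not check(row)]
        if failed:
            # Keep the failure reasons with the row so it can be corrected later.
            quarantined.append({**row, "failed_expectations": failed})
        else:
            clean.append(row)
    return clean, quarantined


# Illustrative input only; these rows are made up for the sketch.
rows: List[Row] = [
    {"trip_id": 1, "fare": 12.5},
    {"trip_id": None, "fare": 8.0},
    {"trip_id": 3, "fare": -4.0},
]

clean_rows, quarantined_rows = split_by_expectations(rows, EXPECTATIONS)
```

Recording which expectations failed alongside each quarantined row is a design choice made for this sketch: it gives whoever corrects the data the reason for the rejection without having to re-run the checks.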
In the next chapter, we’re going to get into...