Understanding the bronze layer
Inside a lakehouse, the bronze layer stores raw data in exactly the same shape, form, and format in which it was collected from the data sources. Data in the bronze layer has the following characteristics:
- Unclean and non-standardized: Raw data has not been cleaned or standardized, so it is deemed unsuitable for direct consumption by analytical workloads.
- Support for multiple formats and types: Data in the bronze layer might be structured, semi-structured, or unstructured. It can also be a combination of text and binary types.
- Immutable: By definition, data in the bronze layer should not be editable. If source data changes over time, each new version is stored as an additional copy rather than overwriting the existing one.
- Stored forever: Data in the bronze layer is never deleted. The resulting growth in volume is less of a concern because storage is inexpensive. However, to save costs, some portions of the data might be moved to archival storage.
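The immutability and retention characteristics listed above can be illustrated with a minimal sketch of an append-only landing routine. It uses only the Python standard library and a local directory as a stand-in for lakehouse storage. The function names `land_raw` and `demo`, the `ingest_date=` folder layout, and the `.raw` suffix are assumptions made for illustration; they are not part of any particular lakehouse product.

```python
import hashlib
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def land_raw(bronze_root: Path, source: str, payload: bytes,
             ingested_at: datetime) -> Path:
    """Append one raw payload to the bronze layer without touching older copies."""
    digest = hashlib.sha256(payload).hexdigest()[:12]
    stamp = ingested_at.strftime("%Y%m%dT%H%M%S%fZ")
    # Hypothetical layout: <root>/<source>/ingest_date=YYYY-MM-DD/<stamp>_<digest>.raw
    target_dir = bronze_root / source / f"ingest_date={ingested_at:%Y-%m-%d}"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{stamp}_{digest}.raw"
    # Mode "xb" raises FileExistsError instead of overwriting an existing file.
    with open(target, "xb") as fh:
        fh.write(payload)  # bytes are stored exactly as received, text or binary
    target.chmod(0o444)  # best-effort read-only flag on the landed file
    return target


def demo() -> list:
    """Land two versions of the same record and list what the bronze layer holds."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        land_raw(root, "orders", b'{"id": 1, "status": "new"}',
                 datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc))
        land_raw(root, "orders", b'{"id": 1, "status": "shipped"}',
                 datetime(2024, 1, 2, 9, 0, tzinfo=timezone.utc))
        # Both versions survive side by side; nothing was updated in place.
        return sorted(p.read_bytes().decode() for p in root.rglob("*.raw"))
```

Calling `demo()` returns both versions of the order record, because the changed record was landed as a second file instead of replacing the first. In this sketch, archiving would amount to moving older `ingest_date=` folders to cheaper storage, never deleting them.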
Having data in the native format offers several advantages, as follows:
- Replayed: Often, analysts and data scientists...