Big data systems are built in accordance with the data life cycle model, which can be broadly divided into the following stages:
- Data discovery
- Data quality
- Ingesting data into the system
- Persisting the data in storage
- Analytics on the data
- Data governance
- Visualizing the results
We will study them in detail next.
Data discovery, as in the traditional process, ingests raw data from multiple source systems; in a big data setting, however, that data varies widely in volume, variety, and velocity by the time it has to be transformed into business insights. Leveraging the power of big data, the data discovery process enables data wrangling and data enrichment, which make it possible to combine datasets to create new perspectives and interactive visual analytics. An interactive data catalog provides guided search capabilities and helps us thoroughly analyze and understand the data quality. A mature and robust data discovery process ensures possible data correlations...
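To make the wrangling and enrichment ideas more concrete, here is a minimal sketch in plain Python. It assumes two small in-memory datasets standing in for separate source systems; the record layouts and the helper names `profile_missing` and `enrich` are hypothetical and chosen only for illustration. A real big data platform would run the same kind of logic on a distributed engine rather than on Python lists.

```python
from collections import Counter

# Hypothetical sample records standing in for two source systems.
customers = [
    {"customer_id": 1, "region": "EMEA"},
    {"customer_id": 2, "region": None},   # missing attribute
    {"customer_id": 3, "region": "APAC"},
]
orders = [
    {"order_id": 100, "customer_id": 1, "amount": 25.0},
    {"order_id": 101, "customer_id": 2, "amount": 40.0},
    {"order_id": 102, "customer_id": 4, "amount": 15.5},  # no matching customer
]


def profile_missing(records):
    """Count missing (None) values per field, a basic data quality check."""
    missing = Counter()
    for record in records:
        for field, value in record.items():
            if value is None:
                missing[field] += 1
    return dict(missing)


def enrich(order_records, customer_records, key="customer_id"):
    """Left-join orders with the customer region; unmatched orders get None."""
    by_key = {c[key]: c for c in customer_records}
    enriched = []
    for order in order_records:
        customer = by_key.get(order[key], {})
        enriched.append({**order, "region": customer.get("region")})
    return enriched


# Profile one source, then combine both sources into an enriched view.
missing_report = profile_missing(customers)
enriched_orders = enrich(orders, customers)

# Orders whose customer key has no match hint at a gap between the sources.
known_ids = {c["customer_id"] for c in customers}
unmatched = [o["order_id"] for o in enriched_orders if o["customer_id"] not in known_ids]

print(missing_report)
print(unmatched)
```

The missing-value profile and the list of unmatched keys are the sort of findings a data catalog might surface during discovery, before the data moves on to the later stages of the life cycle.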