Many technology scenarios follow similar patterns, so it is a good idea to map scenarios to architectural patterns. Once these patterns are understood, they become the fundamental building blocks of solutions. We will discuss five types of patterns in the following section.
Note
These solutions are not always optimal; the best design may depend on the domain, the type of data, or other factors. The examples are meant to help visualize a problem and point toward a solution.
Big data systems can be used as a storage pattern or as a data warehouse, where data from multiple sources, even of different types, can be stored and used later. The usage scenario and use case are as follows:
Usage scenario:
Data is generated continuously in large volumes
Data needs preprocessing before it is loaded into the target system
Use case:
Machine data captured for subsequent cleansing can be merged into one or more large files and loaded into Hadoop for computation
Unstructured data across multiple sources should be captured for subsequent analysis on emerging patterns
Data loaded into Hadoop should be processed and filtered; depending on the data, the storage can be a data warehouse, Hadoop, or any NoSQL system.
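As a rough illustration of the first use case, the following sketch merges many small machine-data files into one large file and applies a trivial cleansing step before the result would be loaded into Hadoop. The function name `merge_small_files`, the `.log` suffix, and the blank-line filter are assumptions made for this example; the actual load into Hadoop is not shown.

```python
from pathlib import Path
import tempfile


def merge_small_files(source_dir: Path, target_file: Path, suffix: str = ".log") -> int:
    """Concatenate every matching small file in source_dir into one large file.

    Returns the number of records (non-empty lines) written.
    The target file should not match the suffix, or it would be read as input.
    """
    records = 0
    with target_file.open("w", encoding="utf-8") as out:
        # Sort the inputs so that the merge order is deterministic.
        for path in sorted(source_dir.glob(f"*{suffix}")):
            with path.open(encoding="utf-8") as src:
                for line in src:
                    line = line.strip()
                    if line:  # basic cleansing: drop blank lines
                        out.write(line + "\n")
                        records += 1
    return records


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / "machine1.log").write_text("t=1 temp=20\n\n", encoding="utf-8")
        (folder / "machine2.log").write_text("t=1 temp=22\n", encoding="utf-8")
        count = merge_small_files(folder, folder / "merged.txt")
        print(count, "records merged")
```

Merging before loading matters because Hadoop handles a few large files more efficiently than a very large number of small ones.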
The storage pattern is shown in the following figure:
Big data systems can be designed to perform transformation as part of the data loading and cleansing activity, and many transformations can run faster than on traditional systems because of parallelism. Transformation is one phase of the Extract–Transform–Load (ETL) process of data ingestion and cleansing. The usage scenario and use case are as follows:
Usage scenario
A large volume of raw data to be preprocessed
Data includes structured as well as non-structured types
Use case
ETL (Extract–Transform–Load) tools, for example, Pentaho and Talend, have evolved to leverage big data. In Hadoop, ELT (Extract–Load–Transform) is also trending, because loading is faster in Hadoop and cleansing can run as a parallel process to clean and transform the input
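The ELT idea described above can be sketched in plain Python: the raw data is loaded first, then split into partitions that are cleaned and transformed independently. This is only a minimal sketch under stated assumptions. The `sensor_id,reading` record format and the function names are hypothetical, and a local thread pool stands in for cluster workers, so it shows the partition-and-transform structure rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional


def transform_record(raw: str) -> Optional[Dict[str, object]]:
    """Parse one 'sensor_id,reading' line; return None for malformed input."""
    parts = [p.strip() for p in raw.split(",")]
    if len(parts) != 2:
        return None
    sensor_id, reading = parts
    try:
        return {"sensor_id": sensor_id.lower(), "reading": float(reading)}
    except ValueError:
        return None


def transform_partition(partition: List[str]) -> List[Dict[str, object]]:
    """Clean one partition, dropping records that failed to parse."""
    cleaned = (transform_record(line) for line in partition)
    return [rec for rec in cleaned if rec is not None]


def elt_transform(loaded: List[str], partitions: int = 4) -> List[Dict[str, object]]:
    """Transform already-loaded raw lines, one partition per worker."""
    size = max(1, -(-len(loaded) // partitions))  # ceiling division
    chunks = [loaded[i:i + size] for i in range(0, len(loaded), size)]
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        # map() returns results in the same order as the input chunks.
        results = list(pool.map(transform_partition, chunks))
    return [rec for chunk in results for rec in chunk]


if __name__ == "__main__":
    raw_lines = ["S1, 20.5", "bad line", "s2,abc", " S3 ,7"]
    for record in elt_transform(raw_lines, partitions=2):
        print(record)
```

Because each partition is transformed without reference to the others, the same structure maps naturally onto parallel tasks in a cluster.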
The data transformation pattern is shown in the following figure:
Data analytics is of wide interest in big data systems, where a huge amount of data can be analyzed to generate statistical reports and insights that are useful to the business and for understanding patterns. The usage scenario and use case are as follows:
Usage scenario
Improved response time for detection of patterns
Data analysis for non-structured data
Use case
Fast turnaround for machine data analysis (for example, analysis of seismic data)
Pattern detection across structured and non-structured data (for example, fraud analysis)
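To make the fraud analysis use case more concrete, here is a deliberately simple sketch that groups transactions by account and flags amounts far above the account's typical value. The median-based rule, the `factor` parameter, and the `(account_id, amount)` record shape are assumptions chosen for illustration, not a recommended fraud model.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple

Transaction = Tuple[str, float]  # (account_id, amount)


def flag_suspicious(transactions: Iterable[Transaction], factor: float = 5.0) -> List[Transaction]:
    """Return transactions whose amount exceeds factor times the account median."""
    # Grouping step: collect amounts per account (similar to a map/shuffle phase).
    by_account: Dict[str, List[float]] = defaultdict(list)
    for account, amount in transactions:
        by_account[account].append(amount)

    # Aggregation step: compare each amount with the account's median
    # (similar to a reduce phase that runs once per key).
    flagged: List[Transaction] = []
    for account, amounts in by_account.items():
        typical = median(amounts)
        for amount in amounts:
            if typical > 0 and amount > factor * typical:
                flagged.append((account, amount))
    return flagged


if __name__ == "__main__":
    sample = [("a", 10), ("a", 12), ("a", 11), ("a", 500), ("b", 100), ("b", 110)]
    print(flag_suspicious(sample))
```

The group-then-aggregate shape is what lets this kind of analysis be spread across many nodes, since each account can be processed independently.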
Big data systems integrated with streaming libraries and systems are capable of high-scale real-time data processing. Real-time processing for large and complex requirements poses many challenges, such as performance, scalability, availability, resource management, and low latency. Streaming technologies such as Storm and Spark Streaming can be integrated with YARN. The usage scenario and use case are as follows:
Usage scenario
Managing the action to be taken based on continuously changing data in real time
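The scenario above, acting on continuously changing data, can be sketched without any streaming framework as a sliding time window over an event stream. This sketch does not use the Storm or Spark Streaming APIs; the `alerts` function, the `(timestamp, value)` event shape, and the average-over-threshold rule are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

Event = Tuple[float, float]  # (timestamp_seconds, value)


def alerts(events: Iterable[Event], window_seconds: float, threshold: float) -> Iterator[Tuple[float, float]]:
    """Yield (timestamp, window_average) whenever the windowed average exceeds threshold.

    Events are assumed to arrive in timestamp order.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    window: Deque[Event] = deque()
    total = 0.0
    for ts, value in events:
        window.append((ts, value))
        total += value
        # Evict events that have fallen out of the time window.
        while window[0][0] <= ts - window_seconds:
            _, old_value = window.popleft()
            total -= old_value
        average = total / len(window)
        if average > threshold:
            yield ts, average  # the "action" would be triggered here


if __name__ == "__main__":
    stream = [(0, 10), (1, 20), (2, 90), (3, 100), (12, 5)]
    for ts, avg in alerts(stream, window_seconds=10, threshold=50):
        print(f"t={ts}: average {avg} is above the threshold")
```

Keeping a running total means each event is processed in constant amortized time, which is the property that matters when the stream never stops.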
Use case
The real-time data pattern is shown in the following figure:
Big data systems can be tuned for the special case of a low latency system, where reads far outnumber updates. Data can then be fetched faster and held in memory, which further improves performance and avoids overheads. The usage scenario and use case are as follows:
Usage scenario
Reads far outnumber writes
Reads require very low latency and a guaranteed response
Distributed location-based data caching
Use case
Order promising solutions
Cloud-based identity and SSO
Low latency real-time personalized offers on mobile
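A read-heavy, low latency workload of this kind is often served by an in-memory cache placed in front of a slower store. The following is a minimal single-process sketch, assuming a hypothetical `loader` callable that reads from the backing store; the class name, the LRU eviction policy, and the time-to-live setting are illustrative choices, and a distributed cache would add replication and partitioning on top of this idea.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable


class ReadThroughCache:
    """In-memory read-through cache with LRU eviction and per-entry expiry."""

    def __init__(self, loader: Callable[[Hashable], Any], capacity: int = 1024,
                 ttl_seconds: float = 60.0, clock: Callable[[], float] = time.monotonic):
        self._loader = loader      # hypothetical function that reads the slow store
        self._capacity = capacity
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: "OrderedDict[Hashable, tuple]" = OrderedDict()  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return entry[1]
        # Miss or expired entry: read from the backing store, then cache the value.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        self._entries.move_to_end(key)
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return value


if __name__ == "__main__":
    cache = ReadThroughCache(loader=lambda user_id: {"id": user_id}, capacity=2)
    cache.get("u1")
    cache.get("u1")
    print("hits:", cache.hits, "misses:", cache.misses)
```

With reads far outnumbering writes, most calls are answered from memory, and the expiry bounds how stale a cached value can become.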
The low latency caching pattern is shown in the following figure:
Some widely used technology stacks, organized by layer and framework, are shown in the following image: