The 3 + 1 Vs and how they affect choices in data processing design
The biggest challenge in data processing is understanding the three Vs of a data-intensive system and how they shape the design of the data processing pipeline.
The three Vs stand for the velocity, volume, and variety of the data. The outcome of the data processing system is generally the fourth V of the equation: the value of the data.
Some experts add more Vs to this equation, for example data veracity (the presence of abnormalities in the data), data validity (whether the data represents what it is intended to represent), and data volatility (loosely, how the importance of the data changes over time). While these are all important, the author believes that they are more or less covered by the basic three Vs.
Since we are talking about data-intensive systems, it's safe to assume that the volume of the data will be huge. Thus the data processing system...