Any operation in an analytics system involves the following steps:
1. Capture the data.
2. Process the information.
3. Generate actionable reports or output.
The first step in the process is capturing raw transactional data points. Traditionally, this data is available as plain-text log files, which are usually analyzed and processed in a single batch. Reading a log file directly is easy, but running analysis against it is not advisable: the information first has to be extracted from the log file into another medium, which adds a step before any analysis can begin. Furthermore, because log files are batch processed, real-time analysis of the data is difficult.
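The three steps, applied to a log file, can be sketched roughly as follows. This is a minimal illustration, not a prescribed implementation: the log format (timestamp, event, deal ID separated by spaces), the sample entries, and the function names `capture`, `process`, and `report` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical log format (an assumption): "<timestamp> <event> <deal_id>"
SAMPLE_LOG = """\
2013-01-05T10:00:01 view deal-17
2013-01-05T10:00:04 click deal-17
2013-01-05T10:00:09 view deal-42
2013-01-05T10:01:30 view deal-17
"""

def capture(raw_log):
    """Step 1: split the raw log text into non-empty lines."""
    return [line for line in raw_log.splitlines() if line.strip()]

def process(lines):
    """Step 2: parse each line and count events per (deal, event) pair."""
    counts = Counter()
    for line in lines:
        _timestamp, event, deal_id = line.split()
        counts[(deal_id, event)] += 1
    return counts

def report(counts):
    """Step 3: turn the counts into sorted, readable report lines."""
    return [
        f"{deal_id} {event} {n}"
        for (deal_id, event), n in sorted(counts.items())
    ]

report_lines = report(process(capture(SAMPLE_LOG)))
for line in report_lines:
    print(line)
```

Note that no report line exists until the whole log has been read and parsed. That is the batch-processing limitation described above: the results are only as fresh as the last complete pass over the file.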
Let's assume that we are developing an online deals site such as http://slickdeals.net, which provides the best deals available online. We have hundreds of live deals on the site. The objective...