Many enterprises now keep a central repository of application log events. The log events are also streamed live to data processing applications, which monitor the performance of the running applications in real time so that remediation measures can be taken promptly. This use case demonstrates real-time processing of log events with a Spark Streaming data processing application. The live application log events are written to a TCP socket, and the Spark Streaming data processing application listens continuously on a given port of a given host to collect the stream of log events.
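The data flow in this use case can be sketched with the Python standard library alone, before Spark is involved. In this minimal sketch, a small data server writes newline-terminated log events to a TCP socket and a client collects them. The server stands in for the data server, and the client stands in for the receiving side, which a Spark Streaming application would typically handle through `StreamingContext.socketTextStream(hostname, port)`. The function names and the sample log lines are hypothetical and chosen only for illustration.

```python
import socket
import threading


def serve_log_events(server_sock, events):
    """Accept one client and write each log event as a newline-terminated line."""
    conn, _ = server_sock.accept()
    with conn:
        for event in events:
            conn.sendall((event + "\n").encode("utf-8"))
    # Leaving the with-block closes the connection, which signals
    # end-of-stream to the client.


def collect_log_events(host, port):
    """Connect to the data server and read lines until it closes the connection."""
    with socket.create_connection((host, port)) as client:
        with client.makefile("r", encoding="utf-8") as stream:
            return [line.rstrip("\n") for line in stream]


def run_demo(events):
    """Start a local data server on a free port and collect what it sends."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server_sock.listen(1)
    host, port = server_sock.getsockname()

    server = threading.Thread(target=serve_log_events, args=(server_sock, events))
    server.start()
    try:
        return collect_log_events(host, port)
    finally:
        server.join()
        server_sock.close()


if __name__ == "__main__":
    # Hypothetical log events, used only to exercise the sketch.
    sample_events = [
        "2015-12-20 01:46:23 ERROR Directory index forbidden",
        "2015-12-20 01:46:24 WARN Slow response from service",
    ]
    for received in run_demo(sample_events):
        print(received)
```

The sketch reads until the server closes the connection, whereas a streaming application keeps the connection open and processes the events in small batches as they arrive.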
The Netcat utility, which comes with most UNIX installations, is used here as the data server. To check that Netcat is installed on the system, open its manual page with the command given in the following scripts. After exiting the manual page, run Netcat itself and make sure that it starts without an error...