In this chapter, we deal with a technology that constitutes one of the core layers of the Data Lake, namely the Data Ingestion Layer. This layer is very important for an enterprise because it handles the processing of both streaming and batch data coming from different applications.
The technology that we have shortlisted for this very important job of processing data is Apache Flink. We have to say that this selection was quite difficult, as we had another technology in mind, namely Apache Spark, which is really strong in this area and more mature. In the end, however, we decided to go with Flink considering its pros. Because of Spark's significance in this space, we have also covered it in some detail, unlike in other chapters, where we have just named the other options and left it at that.
This chapter will first take you through the Data Ingestion Layer and how it works, and then dive deep into the technology, Flink.