Apache Spark is one of the most popular projects in the Hadoop ecosystem and possibly the most actively developed open source project in big data. Its simplicity, performance, and flexibility have made it popular not only among data scientists but also among engineers, developers, and everybody else interested in big data.
Given Spark's rising popularity, Duvvuri and Bikram have produced a book that is the need of the hour, Spark for Data Science, but with a difference. They have covered not only the Spark computing platform but also aspects of data science and machine learning. To put it in one word: comprehensive.
The book contains numerous code snippets that readers can use to learn and to get a jump start on implementing projects. Working through these examples, readers also gain insight into the key steps of a data science project: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.
Venkatraman Laxmikanth
Managing Director
Broadridge Financial Solutions India (Pvt) Ltd