Book Image

Spark for Data Science

By : Srinivas Duvvuri, Bikramaditya Singhal
Book Image

Spark for Data Science

By: Srinivas Duvvuri, Bikramaditya Singhal

Overview of this book

This is the era of Big Data. The words ‘Big Data’ implies big innovation and enables a competitive advantage for businesses. Apache Spark was designed to perform Big Data analytics at scale, and so Spark is equipped with the necessary algorithms and supports multiple programming languages. Whether you are a technologist, a data scientist, or a beginner to Big Data analytics, this book will provide you with all the skills necessary to perform statistical data analysis, data visualization, predictive modeling, and build scalable data products or solutions using Python, Scala, and R. With ample case studies and real-world examples, Spark for Data Science will help you ensure the successful execution of your data science projects.
Table of Contents (18 chapters)
Spark for Data Science
Credits
Foreword
About the Authors
About the Reviewers
www.PacktPub.com
Preface

About the Reviewers

Daniel Frimer has been involved in a vast exposure of industries across Healthcare, Web Analytics, Transportation.  Across these industries has developed ways to optimize the speed of data workflow, storage, and processing in the hopes of making a highly efficient department. Daniel is currently a Master’s candidate at the University of Washington in Information Sciences pursuing a specialization in Data Science and Business Intelligence. She worked on Python Data Science Essentials

I’d like to thank my grandmother Mary.  Who has always believed in mine and everyone’s potential and respects those whose passions make the world a better place.

Priyansu Panda is a research engineer at Underwriters Laboratories, Bangalore, India. He worked as a senior system engineer in Infosys Limited, and served as a software engineer in Tech Mahindra.

His areas of expertise include machine-learning, natural language processing, computer vision, pattern recognition, and heterogeneous distributed data integration. His current research is on applied machine learning for product safety analysis. His major research interests are machine-learning and data-mining applications, artificial intelligence on internet of things, cognitive systems, and clustering research.

Yogesh Tayal is a Technology Consultant at Mu Sigma Business Solutions Pvt. Ltd. and has been with Mu Sigma for more than 3 years. He has worked with the Mu Sigma Business Analytics team and is currently an integral part of the product development team. Mu Sigma is one of the leading Decision Sciences companies in India with a huge client base comprising of leading corporations across an array of industry verticals i.e. technology, retail, pharmaceuticals, BFSI, e-commerce, healthcare etc.