Hands-on DevOps

By: Sricharan Vadapalli

Overview of this book

DevOps strategies have become an important factor in big data environments.

This book begins with an introduction to big data, DevOps, and cloud computing, along with the need for DevOps strategies in big data environments. We move on to explore the adoption of DevOps frameworks and business scenarios. We then build a big data cluster, deploy it on the cloud, and explore DevOps activities such as CI/CD and containerization. Next, we cover big data concepts such as ETL for data sources, Hadoop clusters, and their applications. Towards the end of the book, we explore ERP applications useful for migrating to DevOps frameworks and examine a few case studies for migrating big data and prediction models.

By the end of this book, you will have mastered implementing DevOps tools and strategies for your big data clusters.
Table of Contents (22 chapters)
Title Page
Credits
About the Author
About the Reviewers
www.PacktPub.com
Customer Feedback
Preface
11. DevOps Adoption by ERP Systems
12. DevOps Periodic Table
13. Business Intelligence Trends
14. Testing Types and Levels
15. Java Platform SE 8

DevOps for data science


Data science projects typically involve several roles performing different functions: big data engineers, data scientists, and operations teams. For data engineers, the primary activities include ETL, preparing data sets for analysis, and coding the models developed by data scientists into scripts. Data scientists develop the models, evaluate different algorithms and models against sample test data, and validate them with real data.

When these teams work in silos, their output may be confined to proofs of concept (PoCs) and never extend to larger projects. Yet even minor tasks require overlapping skills, as well as multiple interactions and design sessions among big data engineers, data scientists, and operations teams.

DevOps can bridge the gap between these streams and enable collaborative working through the adoption of best practices:

  • Tools and platforms evolution: Engineers and data scientists should continuously evaluate and contribute to the creation of new...