Managing Data Science

By: Kirill Dubovikov

Overview of this book

Data science and machine learning can transform any organization and unlock new opportunities. However, employing the right management strategies is crucial to guide the solution from prototype to production. Traditional approaches often fail because they don't fully meet the conditions and requirements of current data science projects. In this book, you'll explore the right approach to data science project management, along with useful tips and best practices to guide you along the way. After understanding the practical applications of data science and artificial intelligence, you'll see how to incorporate them into your solutions. Next, you will go through the data science project life cycle, explore the common pitfalls encountered at each step, and learn how to avoid them. Any data science project requires a skilled team, and this book offers advice on hiring and growing a data science team for your organization. Later, you'll be shown how to efficiently manage and improve your data science projects through the use of DevOps and ModelOps. By the end of this book, you will be well versed in various data science solutions and have gained practical insights into tackling the different challenges that you'll encounter on a daily basis.
Table of Contents (18 chapters)

  • Section 1: What is Data Science?
  • Section 2: Building and Sustaining a Team
  • Section 3: Managing Various Data Science Projects
  • Section 4: Creating a Development Infrastructure

Packaging code

When deploying Python code for data science projects, you have several options:

  • Regular Python scripts: You copy a set of Python scripts to the server and run them. This is the simplest form of deployment, but it requires a lot of manual preparation: you need to install all required packages, fill in configuration files, and so on. Although tools such as Ansible (https://www.ansible.com/) can automate these steps, this form of deployment is not recommended for anything but the simplest projects with no long-term maintainability goals.
  • Python packages: Creating a Python package using a setup.py file is a much more convenient way to package Python code. Tools such as PyScaffold provide ready-to-use templates for Python packages, so you won't need to spend much time structuring your project. In the case of Python packages, Ansible...
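
To make the Python package option more concrete, here is a minimal sketch of what a setup.py-based layout might look like. It is not PyScaffold's template; the package name churn_model, the dependencies, the churn-train command, and the scaffold_package helper are all hypothetical and chosen only for illustration. The script uses just the standard library to write the files, so you can inspect the result before adapting it.

```python
from pathlib import Path
from textwrap import dedent

# Hypothetical package name, used only for this illustration.
PACKAGE_NAME = "churn_model"

# Contents of a minimal setup.py using the "src" layout.
# The dependencies and the console script are illustrative assumptions.
SETUP_PY = dedent('''\
    from setuptools import find_packages, setup

    setup(
        name="churn_model",
        version="0.1.0",
        packages=find_packages(where="src"),
        package_dir={"": "src"},
        install_requires=["pandas", "scikit-learn"],
        entry_points={
            "console_scripts": ["churn-train=churn_model.train:main"],
        },
    )
    ''')


def scaffold_package(root):
    """Write a minimal package skeleton under root and return the created paths."""
    files = {
        "setup.py": SETUP_PY,
        f"src/{PACKAGE_NAME}/__init__.py": '__version__ = "0.1.0"\n',
        f"src/{PACKAGE_NAME}/train.py": 'def main():\n    print("training...")\n',
    }
    for relative_path, content in files.items():
        path = Path(root) / relative_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content)
    return sorted(files)
```

Calling scaffold_package("my_project") creates the skeleton in a new my_project directory. With setuptools available, running pip install . from that directory would then install the package together with its declared dependencies, which replaces much of the manual preparation that plain scripts require.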