Practical Machine Learning on Databricks

By: Debu Sinha

Overview of this book

Unleash the potential of Databricks for end-to-end machine learning with this comprehensive guide, tailored for experienced data scientists and developers transitioning from DIY or other cloud platforms. Building on a strong foundation in Python, Practical Machine Learning on Databricks serves as your roadmap from development to production, covering all intermediary steps using the Databricks platform. You’ll start with an overview of machine learning applications, Databricks platform features, and MLflow. Next, you’ll dive into data preparation, model selection, and training essentials and discover the power of Databricks Feature Store for precomputing feature tables. You’ll also learn to kickstart your projects using Databricks AutoML and automate retraining and deployment through Databricks Workflows. By the end of this book, you’ll have mastered MLflow for experiment tracking, collaboration, and advanced use cases like model interpretability and governance. The book is enriched with hands-on example code at every step. While primarily focused on generally available features, the book equips you to easily adapt to future innovations in machine learning, Databricks, and MLflow.
Table of Contents (16 chapters)
Part 1: Introduction
Part 2: ML Pipeline Components and Implementation
Part 3: ML Governance and Deployment

Exploring clusters

Clusters are the primary computing units that do the heavy lifting when you’re training your ML models. The VMs associated with a cluster are provisioned in the Databricks user’s own cloud subscription; however, the Databricks UI provides an interface to control the cluster type and its settings.
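The same settings that the UI exposes can also be written down as a plain configuration. The following is a minimal sketch, not taken from the book, of a request body shaped like the one the Databricks Clusters REST API accepts. The helper `build_cluster_spec` is hypothetical, and the cluster name, runtime version string, and node type are placeholder values that depend on your workspace and cloud provider. The sketch only builds and prints the payload; it does not call any Databricks endpoint.

```python
import json


def build_cluster_spec(name, spark_version, node_type_id,
                       min_workers, max_workers,
                       autotermination_minutes=60):
    """Return a dict shaped like a cluster-creation request body.

    Hypothetical helper for illustration only.
    """
    if min_workers < 0 or min_workers > max_workers:
        raise ValueError("need 0 <= min_workers <= max_workers")
    return {
        "cluster_name": name,
        "spark_version": spark_version,    # Databricks runtime version
        "node_type_id": node_type_id,      # VM type in your cloud subscription
        "autoscale": {                     # let the cluster grow and shrink
            "min_workers": min_workers,
            "max_workers": max_workers,
        },
        # Shut the cluster down after this many idle minutes.
        "autotermination_minutes": autotermination_minutes,
    }


# Placeholder values: adjust all of them to your own workspace.
spec = build_cluster_spec(
    name="ml-training",
    spark_version="13.3.x-cpu-ml-scala2.12",
    node_type_id="i3.xlarge",
    min_workers=2,
    max_workers=8,
)
print(json.dumps(spec, indent=2))
```

Keeping the specification as data makes it easy to review and version alongside your ML code, whichever way you eventually submit it.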

Clusters are ephemeral compute resources. No data is stored on clusters:

Figure 2.6 – The Clusters tab

The Pools feature allows end users to create Databricks VM pools. One benefit of working in the cloud is that you can request new compute resources on demand: the end user pays by the second and returns the compute once the load on the cluster is low. However, requesting a VM from the cloud provider, starting it up, and adding it to a cluster still takes some time. With pools, you can pre-provision VMs and keep them in a standby state. If a cluster requests new nodes and has access...
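The pre-provisioning idea described above can be sketched as two small configurations: one for the pool of standby VMs and one for a cluster that draws its nodes from that pool. This is an illustrative sketch rather than the book's own code. The helpers `build_pool_spec` and `build_pooled_cluster_spec` are hypothetical, the field names follow the Databricks Instance Pools and Clusters REST APIs as far as I know them, and every value (names, node type, runtime version, and the pool ID placeholder) is an assumption to replace with your own.

```python
def build_pool_spec(name, node_type_id, min_idle_instances, max_capacity,
                    idle_autotermination_minutes=30):
    """Return a dict shaped like an instance-pool creation request body.

    Hypothetical helper for illustration only.
    """
    if min_idle_instances > max_capacity:
        raise ValueError("min_idle_instances cannot exceed max_capacity")
    return {
        "instance_pool_name": name,
        "node_type_id": node_type_id,
        # VMs kept warm in standby so clusters can acquire them quickly.
        "min_idle_instances": min_idle_instances,
        # Upper bound on idle plus in-use VMs in the pool.
        "max_capacity": max_capacity,
        # Release extra idle VMs after this many minutes.
        "idle_instance_autotermination_minutes": idle_autotermination_minutes,
    }


def build_pooled_cluster_spec(name, spark_version, instance_pool_id,
                              num_workers):
    """Return a cluster spec that takes its worker nodes from a pool.

    The node type is omitted on the assumption that it is inherited
    from the pool.
    """
    return {
        "cluster_name": name,
        "spark_version": spark_version,
        "instance_pool_id": instance_pool_id,
        "num_workers": num_workers,
    }


# Placeholder values: adjust all of them to your own workspace.
pool_spec = build_pool_spec(
    name="ml-warm-pool",
    node_type_id="i3.xlarge",
    min_idle_instances=2,
    max_capacity=10,
)
cluster_spec = build_pooled_cluster_spec(
    name="ml-training",
    spark_version="13.3.x-cpu-ml-scala2.12",
    instance_pool_id="<pool-id-returned-when-the-pool-is-created>",
    num_workers=4,
)
print(pool_spec)
print(cluster_spec)
```

The trade-off is cost against start-up latency: a higher `min_idle_instances` keeps more VMs warm for faster scaling, but idle VMs are still billed by the cloud provider.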
