Feature Store for Machine Learning

By: Jayanth Kumar M J

Overview of this book

A feature store is one of the storage layers in machine learning (ML) operations, where data scientists and ML engineers store transformed and curated features for ML models. This makes the features available for model training, for batch and online inference, and for reuse in other ML pipelines. Knowing how to use feature stores well can save you considerable time and effort, and this book teaches you what you need to know to get started.

Feature Store for Machine Learning is for data scientists who want to learn how to use feature stores to share and reuse each other's work and expertise. You'll be able to implement practices that eliminate reprocessing of data, support model reproducibility, and reduce duplication of work, thus shortening the time to production for ML models. While this book offers some theoretical groundwork for developers who are just getting to grips with feature stores, there's plenty of practical know-how for those ready to put their knowledge to work. With a hands-on approach to implementation and the associated methodologies, you'll get up and running quickly.

By the end of this book, you'll understand why feature stores are essential and how to use them in your ML projects, both on your local system and in the cloud.

Table of Contents (13 chapters)
Section 1 – Why Do We Need a Feature Store?
Section 2 – A Feature Store in Action
Section 3 – Alternatives, Best Practices, and a Use Case

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "The preceding code block scales the numerical columns: tenure, MonthlyCharges, and TotalCharges."

A block of code is set as follows:

le = LabelEncoder()
for i in bin_cols:
    churn_data[i] = le.fit_transform(churn_data[i])

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

project: customer_segmentation
registry: data/registry.db
provider: aws
online_store:
  type: dynamodb
  region: us-east-1

Any command-line input or output is written as follows:

$ docker build -t customer-segmentation .

Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: "On the cluster home page, select the Properties tab and scroll down to Associated IAM roles."

Tips or important notes appear like this.