Learn TensorFlow Enterprise

By: KC Tung

Overview of this book

TensorFlow as a machine learning (ML) library has matured into a production-ready ecosystem. This beginner’s book uses practical examples to enable you to build and deploy TensorFlow models using optimal settings that ensure long-term support without having to worry about library deprecation or being left behind when it comes to bug fixes or workarounds. The book begins by showing you how to refine your TensorFlow project and set it up for enterprise-level deployment. You’ll then learn how to choose a future-proof version of TensorFlow. As you advance, you’ll find out how to build and deploy models in a robust and stable environment by following recommended practices made available in TensorFlow Enterprise. This book also teaches you how to manage your services better and enhance the performance and reliability of your artificial intelligence (AI) applications. You’ll discover how to use various enterprise-ready services to accelerate your ML and AI workflows on Google Cloud Platform (GCP). Finally, you’ll scale your ML models and handle heavy workloads across CPUs, GPUs, and Cloud TPUs. By the end of this TensorFlow book, you’ll have learned the patterns needed for TensorFlow Enterprise model development, data pipelines, training, and deployment.
Table of Contents (15 chapters)
Section 1 – TensorFlow Enterprise Services and Features
Section 2 – Data Preprocessing and Modeling
Section 3 – Scaling and Tuning ML Works
Section 4 – Model Optimization and Deployment

Converting distributed CSV files to a TensorFlow dataset

If you are not sure about the size of the data, or whether it can all fit in the Python runtime's memory, then reading the data into a pandas DataFrame is not a viable option. In this case, we may use a TensorFlow (TF) dataset to access the data directly, without first loading all of it into memory.
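
As a minimal sketch of this idea, the following snippet reads a CSV file in small batches through `tf.data.experimental.make_csv_dataset` rather than loading it into a DataFrame. The file name, column names, and batch size are hypothetical and chosen only for illustration; the tiny file is written locally so that the example is self-contained.

```python
import csv
import os
import tempfile

import tensorflow as tf

# Write a tiny CSV file so the sketch is self-contained; in practice the
# file would already exist on disk or in a storage bucket.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "example.csv")  # hypothetical file name
with open(csv_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["feature_a", "feature_b", "label"])  # hypothetical columns
    for i in range(10):
        writer.writerow([i, i * 2, i % 2])

# make_csv_dataset builds a lazy input pipeline: rows are parsed batch by
# batch as the dataset is iterated, not read into memory up front.
dataset = tf.data.experimental.make_csv_dataset(
    file_pattern=csv_path,
    batch_size=4,
    label_name="label",
    num_epochs=1,   # stop after one pass (the default repeats indefinitely)
    shuffle=False,  # keep file order so the batches are easy to inspect
)

batch_sizes = []
for features, labels in dataset:
    # features is a dict of column name -> tensor; labels is a tensor.
    batch_sizes.append(int(labels.shape[0]))
    print({name: values.numpy() for name, values in features.items()},
          labels.numpy())
```

Only one batch of rows needs to be held in memory at a time, which is why this approach suits data of unknown size.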

Typically, when data is stored in a storage bucket as parts, the naming convention follows a general pattern. This pattern is similar to that of the Hadoop Distributed File System (HDFS), where the data is stored in parts and the complete dataset can be referenced via a wildcard symbol, *.

When storing distributed files in a Google Cloud Storage bucket, a common pattern for filenames is as follows:

<FILE_NAME>-<pattern>-001.csv
…
<FILE_NAME>-<pattern>-00n.csv

Alternatively, there is the following pattern: 

<FILE_NAME>-<pattern>-aa.csv
…
<FILE_NAME>-<pattern>-zz.csv
...
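
To illustrate how a single wildcard can stand in for every part file, here is a small sketch that uses shell-style matching from Python's standard library. The file names and the `sales-part` prefix are hypothetical, and the exact wildcard rules of a particular storage system may differ from those of `fnmatch`.

```python
from fnmatch import fnmatch

# Hypothetical file names following the two naming schemes shown above.
part_files = [
    "sales-part-001.csv",
    "sales-part-002.csv",
    "sales-part-003.csv",
    "sales-part-aa.csv",
    "sales-part-ab.csv",
    "sales-summary.txt",  # unrelated file that should not match
]

# One wildcard pattern refers to every part of the dataset.
pattern = "sales-part-*.csv"

matched = [name for name in part_files if fnmatch(name, pattern)]
print(matched)
```

The same pattern covers both the numeric and the alphabetic suffixes, so the reader of the data does not need to know how many parts exist.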