The Definitive Guide to Google Vertex AI

By: Jasmeet Bhatia, Kartik Chaudhary
Overview of this book

While AI has become an integral part of every organization today, the development of large-scale ML solutions and the management of complex ML workflows in production continue to pose challenges for many. Google's unified data and AI platform, Vertex AI, directly addresses these challenges with its array of MLOps tools designed for end-to-end workflow management. This book is a comprehensive guide that lets you explore Google Vertex AI's easy-to-advanced-level features for end-to-end ML solution development. Throughout this book, you'll discover how Vertex AI empowers you with essential tools for critical tasks, including data management, model building, large-scale experimentation, metadata logging, model deployment, and monitoring. You'll learn how to harness the full potential of Vertex AI to develop and deploy no-code, low-code, or fully customized ML solutions. This book takes a hands-on approach to developing and deploying real-world ML solutions on Google Cloud, leveraging key technologies such as vision, NLP, generative AI, and recommendation systems. It also covers pre-built and turnkey solution offerings, along with guidance on seamlessly integrating them into your ML workflows. By the end of this book, you'll have the confidence to develop and deploy large-scale, production-grade ML solutions using MLOps tooling and best practices from Google.
Table of Contents (24 chapters)

Part 1: The Importance of MLOps in a Real-World ML Deployment
Part 2: Machine Learning Tools for Custom Models on Google Cloud
Part 3: Prebuilt/Turnkey ML Solutions Available in GCP
Part 4: Building Real-World ML Solutions with Google Cloud

Limitations of ML

ML is very powerful, but it is not the answer to every problem. Some problems are simply not suitable for ML, and in other cases ML cannot be applied due to technical or business constraints. As an ML practitioner, it is important to develop the ability to identify business problems where ML can provide significant value, instead of applying it blindly everywhere. Additionally, there are algorithm-specific limitations that can make an ML solution unsuitable for some business applications. In this section, we will learn about some common limitations of ML that should be kept in mind while identifying relevant use cases.

Keep in mind that the limitations we discuss in this section are very general. In real-world applications, additional limitations may arise from the nature of the problem being solved. The common limitations that we will discuss in detail are as follows:

  • Data-related concerns
  • Deterministic nature of problems
  • Lack of interpretability and reproducibility
  • Concerns related to cost and customizations
  • Ethical concerns and bias

Let’s now take a deeper look at each of these common limitations.

Data-related concerns

The quality of an ML model depends heavily on the quality of the training data it is given. Data in the real world is often noisy, incomplete, unlabeled, and sometimes unusable. Moreover, most supervised learning algorithms require large amounts of properly labeled training data to produce good results. The training data requirements of some algorithms (e.g., deep learning) are so high that manually labeling the data is not a practical option. And even when we do manage to label data manually, the labels are often error-prone due to human bias.
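Issues like these are usually caught with a data audit before any training starts. The sketch below is a minimal illustration of such an audit on a toy labeled dataset; the records and field names (`text`, `label`) are hypothetical, not from the book.

```python
# A minimal pre-training data audit, assuming records are simple dicts.
# The field names "text" and "label" are illustrative assumptions.

RECORDS = [
    {"text": "great product", "label": "positive"},
    {"text": "", "label": "negative"},             # empty input
    {"text": "works as expected", "label": None},  # missing label
    {"text": "terrible", "label": "negative"},
]

def audit(records):
    """Count records that are unusable for supervised training."""
    missing_label = sum(1 for r in records if r.get("label") is None)
    empty_input = sum(1 for r in records if not r.get("text"))
    usable = [r for r in records
              if r.get("label") is not None and r.get("text")]
    return {
        "total": len(records),
        "missing_label": missing_label,
        "empty_input": empty_input,
        "usable": len(usable),
    }

report = audit(RECORDS)
```

In a real project, a report like this decides whether you can train at all, or whether you first need a labeling or data-cleaning effort.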

Another major issue is incomplete or missing data. For example, consider the problem of automatic speech recognition. In this case, model results are highly biased toward the accents present in the training dataset. A model trained on American-accented speech does not produce good results on other accents. Since accents change significantly across different parts of the world, it is hard to gather and label sufficient training data for every possible accent. For this reason, developing a single speech recognition model that works for everyone is not yet feasible, and the tech giants providing speech recognition solutions often develop accent-specific models. Developing a new model for each new accent is not very scalable.

Deterministic nature of problems

ML has achieved great success in solving some highly complex problems, such as numerical weather prediction. One problem with most current ML algorithms is that they are statistical in nature and thus cannot be trusted blindly when the underlying problem is governed by deterministic laws. Consider numerical weather prediction: today we have ML models that can predict rain, wind speed, air pressure, and so on with acceptable accuracy, but they do not actually understand the physics behind real weather systems. For example, an ML model might estimate a negative value for a parameter such as density, which is physically impossible.
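Because a statistical model has no built-in notion of physical validity, a common guard is a post-hoc check on its raw outputs. The sketch below illustrates the idea; the variable names and bounds are illustrative assumptions, not a real weather model's schema.

```python
# Statistical models do not know that rainfall or density cannot be
# negative, so raw predictions are often validated against physical
# bounds before use. Names and bounds here are hypothetical.

BOUNDS = {
    "rain_mm": (0.0, None),       # rainfall cannot be negative
    "air_density": (0.0, None),   # density cannot be negative
    "humidity_pct": (0.0, 100.0), # percentage must stay in [0, 100]
}

def violations(prediction):
    """Return the physically impossible values a model emitted."""
    bad = {}
    for name, value in prediction.items():
        lo, hi = BOUNDS.get(name, (None, None))
        if (lo is not None and value < lo) or (hi is not None and value > hi):
            bad[name] = value
    return bad

raw = {"rain_mm": -3.2, "air_density": 1.1, "humidity_pct": 104.0}
bad = violations(raw)
```

A check like this can only flag or clip impossible values; it does not make the model understand the physics, which is exactly the limitation being described.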

However, it is very likely that these kinds of limitations can be overcome in the near future. Future research in the field of ML might discover new algorithms that are smart enough to understand the physics of our world. Such models will open infinite possibilities in the future.

Lack of interpretability and reproducibility

One major issue with many ML algorithms (and especially with neural networks) is the lack of interpretability of results. Many business applications, such as fraud detection and disease prediction, require a justification for model results. If an ML model classifies a financial transaction as fraudulent, it should also provide solid evidence for that decision; otherwise, the output may not be useful for the business. Deep learning and neural network models often lack interpretability, and the explainability of such models is an active area of research. Multiple methods have been developed for model interpretability and explainability. Though these methods can provide some insight into the results, they still fall short of what many business applications require.
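One widely used family of such methods is permutation importance: shuffle one feature's values and measure how much the model's error grows. The sketch below demonstrates the idea on a toy linear "model" whose true weights we control, so the result is verifiable; it is an illustration of the technique, not a production explainability tool.

```python
import random

# Toy black-box "model": a linear scorer with known weights.
# Feature 1 has weight 0.0, so it is irrelevant by construction.
WEIGHTS = [2.0, 0.0, -1.0]

def model(x):
    return sum(w * xi for w, xi in zip(WEIGHTS, x))

def mse(X, y):
    return sum((model(x) - t) ** 2 for x, t in zip(X, y)) / len(X)

def permutation_importance(X, y, feature, seed=0):
    """Increase in error when one feature's column is shuffled."""
    rng = random.Random(seed)
    column = [x[feature] for x in X]
    rng.shuffle(column)
    X_perm = [x[:feature] + [v] + x[feature + 1:]
              for x, v in zip(X, column)]
    return mse(X_perm, y) - mse(X, y)

rng = random.Random(42)
X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
y = [model(x) for x in X]  # labels generated by the model itself

scores = [permutation_importance(X, y, f) for f in range(3)]
# The irrelevant feature (index 1) scores 0; the informative ones do not.
```

Even here, the output is only a ranking of feature influence; it does not explain *why* a particular transaction was flagged, which is the gap the section describes.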

Reproducibility is another complex and growing issue with ML solutions. The latest research papers might show great improvements in results using some technological advancement on a fixed set of datasets, but the same method may not work in real-world scenarios. In addition, ML models are often unstable, which means they produce different results when trained on different partitions of the dataset. This is challenging because a model developed for one business segment may be completely useless for another, even when the underlying problem statement is similar. This makes models less reusable.
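A small but concrete piece of this problem is controlling randomness in data partitioning: the same code, run twice without a fixed seed, trains on different data. The sketch below shows how seeding makes a split reproducible, and how a different seed yields a different partition, which is one source of the run-to-run instability described above.

```python
import random

def train_test_split(data, test_fraction=0.25, seed=None):
    """Shuffle and split; a fixed seed makes the partition reproducible."""
    rng = random.Random(seed)
    shuffled = data[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

data = list(range(100))
train_a, test_a = train_test_split(data, seed=7)
train_b, test_b = train_test_split(data, seed=7)  # same seed: same split
train_c, test_c = train_test_split(data, seed=8)  # new seed: new split
```

Pinning seeds (and library versions, and data snapshots) does not solve reproducibility by itself, but it is the minimum needed to compare two training runs meaningfully.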

Concerns related to cost and customizations

Developing and maintaining ML solutions is often expensive, and even more so in the case of deep learning algorithms. Development costs come from employing highly skilled developers as well as from the infrastructure needed for data analytics and ML experimentation. Deep learning models usually require high-compute resources such as GPUs and TPUs for training and experimentation. Running a hyperparameter tuning job with such models is even more costly and time-consuming. Once a model is ready for production, it requires dedicated resources for deployment, monitoring, and maintenance. This cost increases further as you scale your deployment to serve a large number of customers, and more still if the application has strict latency requirements. Thus, it is very important to understand the value a solution will bring, and whether it is worth the investment, before jumping into the development phase.

Another concern with ML solutions is their limited customizability. ML models are often very difficult to customize, meaning it can be hard to change their parameters or adapt them to new datasets. Pre-built, general-purpose ML solutions often do not work well for specific business use cases, which leaves businesses with two choices: develop a solution from scratch, or customize a pre-built general-purpose solution. Though customizing a pre-built model seems like the better choice, even customization is not easy in the case of ML models. It requires a skilled team of data engineers and ML specialists with a deep understanding of technical concepts such as deep learning, predictive modeling, and transfer learning.

Ethical concerns and bias

ML is quite powerful and is adopted today by many organizations to guide their business strategies and decisions. As we know, some ML algorithms are black boxes; they may not provide reasons for their decisions. ML systems are trained on a finite set of data and may not generalize to some real-world scenarios; if those scenarios are encountered in the future, we cannot tell what decision the ML system will take. There can be ethical concerns related to such black-box decisions. For example, if a self-driving car is involved in a road accident, whom should you blame: the driver, the team that developed the AI system, or the car manufacturer? It is clear that current advancements in ML and AI are not suitable for ethical or moral decision-making, and we need frameworks for addressing ethical concerns involving ML and AI systems.

The accuracy and speed of ML solutions are often commendable, but these solutions cannot always be trusted to be fair and unbiased. Consider AI software that recognizes faces or objects in a given image: such a system may perform poorly on photos of people from demographic groups underrepresented in its training data, or it may classify a certain type of dog (one that looks somewhat similar to a cat) as a cat. This kind of bias often comes from a biased set of training or testing data used to develop the AI system. Data in the real world is usually collected and labeled by humans; thus, biases that exist in humans are transferred into AI systems. Avoiding bias completely is impossible, as we are all human and therefore biased, but there are measures that can be taken to reduce it. Establishing a culture of ethics and building teams from diverse backgrounds is a good step toward reducing bias to a certain extent.
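One of those measures is a bias audit: evaluating the model separately per demographic group and looking at the gap, rather than reporting a single overall accuracy. The sketch below illustrates the idea; the group names, labels, and predictions are entirely made up for the example.

```python
# A minimal bias audit: per-group accuracy and the gap between groups.
# The records (group, true_label, predicted_label) are hypothetical.

def accuracy_by_group(records):
    """Compute accuracy separately for each group in the evaluation set."""
    totals, correct = {}, {}
    for group, truth, pred in records:
        totals[group] = totals.get(group, 0) + 1
        if truth == pred:
            correct[group] = correct.get(group, 0) + 1
    return {g: correct.get(g, 0) / totals[g] for g in totals}

records = [
    ("group_a", "cat", "cat"), ("group_a", "dog", "dog"),
    ("group_a", "cat", "cat"), ("group_a", "dog", "dog"),
    ("group_b", "cat", "dog"), ("group_b", "dog", "dog"),
    ("group_b", "cat", "cat"), ("group_b", "dog", "cat"),
]
per_group = accuracy_by_group(records)
gap = max(per_group.values()) - min(per_group.values())
# A large gap signals that the model's errors concentrate in one group,
# even when the overall accuracy looks acceptable.
```

Here the overall accuracy is 75%, which hides the fact that one group sees perfect results while the other sees only 50%; per-group reporting surfaces exactly the kind of bias this section warns about.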