Google Cloud AI Services Quick Start Guide

By: Arvind Ravulavaru

Overview of this book

Cognitive services are a new way of adding intelligence to applications and services. Artificial intelligence can now be consumed as a service by any application or other service, adding smartness and making the end result more practical and useful.

Google Cloud AI lets you consume artificial intelligence within your applications through a REST API. Text, video, and speech analysis are among the powerful machine learning features available. This book is an easy way to get started with the Google Cloud AI services suite and open up the world of smarter applications.

This book will help you build Smart Exchange, a forum application that lets you upload videos and images and use text-to-speech conversion and translation services. You will use Google Cloud AI Services to make this simple forum application smart by validating the images, videos, and text that users provide, making sure uploaded content follows the forum standards without involving a human curator.

You will learn how to work with the Vision API, Video Intelligence API, Speech Recognition API, Cloud Natural Language, and Cloud Translation API services to make your application smarter.

By the end of this book, you will have a strong understanding of working with Google Cloud AI Services, and be well on the way to building smarter applications.

Google Cloud AI

Now that we understand what cognition/AI on the cloud is and why we need it, let's start learning about the various Google Cloud AI services on offer.

We were briefly introduced to Google Cloud AI services in the GCP services section. Now let's look at each offering in depth.

In the next few subsections, we will be going through each of the services under the Google Cloud AI vertical.

Cloud AutoML Alpha

As of April 2018, Cloud AutoML is in alpha and is only available on request, subject to GCP terms and conditions.

AutoML helps us develop custom machine learning models with minimal ML knowledge and experience, using the power of Google's transfer learning and Neural Architecture Search technology.

Under this service, the first custom service that Google is releasing is named AutoML Vision. This service will help users to train custom vision models for their own use cases.

There are other services that will follow.

Some of the key AutoML features are the following:

  • Integration with human labeling
  • Powered by Google's Transfer Learning and AutoML
  • Fully integrated with other services of Google Cloud

You can read more about AutoML here: https://cloud.google.com/automl/.

Cloud TPU Beta

As of this writing, this service is in beta, and we need to explicitly request a TPU quota for our processing needs.

Using Cloud TPUs, we can easily request large amounts of computing power to run our own machine learning algorithms. This service provides not only the required compute; because it works with Google's TensorFlow, it can also accelerate the complete setup.

This service can be used to perform heavy-duty machine learning, both training and prediction.

Some of the key Cloud TPU features are the following:

  • High performance
  • Utilizing the power of GCP
  • Referencing data models
  • Fully Integrated with other services of Google Cloud
  • Connecting Cloud TPUs to custom machine types

You can read more about Cloud TPU here: https://cloud.google.com/tpu/.

Cloud Machine Learning Engine

Cloud Machine Learning Engine helps us easily build machine learning models that work on any type of data, of any size. Cloud Machine Learning Engine can take any TensorFlow model and perform large-scale training on a managed cluster. Additionally, it can also manage the trained models for large-scale online and batch predictions.

Cloud Machine Learning Engine can seamlessly transition from training to prediction, using online and batch prediction services. Cloud Machine Learning Engine uses the same scalable and distributed infrastructure with GPU acceleration that powers Google ML products.

Some of the key Cloud Machine Learning Engine features are the following:

  • Fully integrated with other Google Cloud services
  • Discover and Share Samples
  • HyperTune your models
  • Managed and Scalable Service
  • Notebook Developer Experience
  • Portable Models

You can read more about Cloud Machine Learning Engine here: https://cloud.google.com/ml-engine/.
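To make the online prediction idea more concrete, here is a minimal sketch in Python that builds (but does not send) a prediction request against a deployed model. The project name, model name, input shape, and access token are hypothetical placeholders, and the sketch assumes the v1 REST endpoint and an OAuth bearer token for authentication; check the current documentation before relying on it.

```python
import json
import urllib.request


def build_predict_request(project, model, instances, access_token):
    """Build (but do not send) an online prediction request."""
    # Assumed v1 REST endpoint for online prediction on a deployed model.
    url = f"https://ml.googleapis.com/v1/projects/{project}/models/{model}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {access_token}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical project, model, input, and token values, for illustration only.
request = build_predict_request(
    project="my-project",
    model="my_model",
    instances=[{"x": [1.0, 2.0, 3.0]}],
    access_token="ACCESS_TOKEN",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send the request; the shape of each entry in `instances` depends entirely on the inputs your own model expects.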

Cloud Job Discovery Private Beta

Matching qualified people with the right positions doesn't have to be so hard; that is the premise of Cloud Job Discovery.

Today's job portals and career sites match people to job roles based on keywords. Most of the time, this approach results in a mismatch between the candidate and the role. This is where Cloud Job Discovery comes in, bridging the gap between employer and employee. Job Discovery provides plug-and-play access to Google's search and machine learning capabilities, enabling the entire recruiting ecosystem (company career sites, job boards, applicant-tracking systems, and staffing agencies) to improve job site engagement and candidate conversion.

Before we continue, you can navigate to https://cloud.google.com/job-discovery/ and try out the Job Discovery Demo. You should see results based on your selection, similar to the following screenshot:

The key takeaway from the demo is how Discovery relates a profile to a keyword.

This diagram explains how Cloud Job Discovery works:

Some of the key capabilities that set Cloud Job Discovery apart from a standard keyword search are the following:

  • Keyword matching
  • Company jargon recognition
  • Abbreviation recognition
  • Commute search
  • Spelling correction
  • Concept recognition
  • Title detection
  • Real-time query broadening
  • Employer recognition
  • Job enrichment
  • Advanced location mapping
  • Location expansion
  • Seniority alignment

Dialogflow Enterprise Edition Beta

Dialogflow is a development suite for building conversational interfaces for websites, mobile applications, popular messaging platforms, and IoT devices.

It is powered by machine learning to recognize the intent and context of what a user says, allowing your conversational interface to provide highly efficient and accurate responses. Natural language understanding recognizes a user's intent and extracts prebuilt entities such as time, date, and numbers. You can train your agent to identify custom entity types by providing a small dataset of examples.

This service offers cross-platform and multi-language support and can work well with the Google Cloud speech service.

You can read more about Dialogflow Enterprise Edition here: https://cloud.google.com/dialogflow-enterprise/.

Cloud Natural Language

Google's Cloud Natural Language service helps us better understand the structure and meaning of a piece of text by providing powerful machine learning models.

These models can be queried through a Representational State Transfer (REST) API. We can use it to understand sentiment about our product on social media, or to parse intent from customer conversations happening in a call center or through a messaging app.

Before we continue with Cloud Natural Language, I would recommend heading over to https://cloud.google.com/natural-language/ and trying out the API. Here is a quick glimpse of it:

As we can see from the previous screenshot, this service offers various insights regarding a piece of text.

Some of the key features are:

  • Syntax analysis
  • Entity recognition
  • Sentiment analysis
  • Content classification
  • Multi-language
  • Integrated REST API

You can read more about Cloud Natural Language service here: https://cloud.google.com/natural-language/.
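As a rough illustration of querying these models over REST, the following Python sketch builds a sentiment analysis request without sending it. The API key and sample sentence are placeholders, and the sketch assumes the v1 `documents:analyzeSentiment` endpoint with API-key authentication.

```python
import json
import urllib.parse
import urllib.request


def build_sentiment_request(text, api_key):
    """Build (but do not send) a sentiment analysis request for plain text."""
    query = urllib.parse.urlencode({"key": api_key})
    # Assumed v1 endpoint; the API key travels as a query parameter.
    url = f"https://language.googleapis.com/v1/documents:analyzeSentiment?{query}"
    body = {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder key and text, for illustration only.
request = build_sentiment_request("I really enjoyed this forum thread.", "YOUR_API_KEY")
print(request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request with `urllib.request.urlopen(request)` should return JSON describing the sentiment of the document as a whole and of each sentence.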

Cloud Speech API

Cloud Speech API uses powerful neural network models to convert audio to text in real time. This service is exposed as a REST API, as we have seen with the Google Cloud Natural Language API.

This API can recognize over 110 languages and users can use this service to convert speech to text in real time, recognize audio uploaded in the request, and integrate with our audio storage on Google Cloud Storage, by using the same technology Google uses to power its own products.

Before we continue with Cloud Speech API, I would recommend heading over to https://cloud.google.com/speech/ and trying out the API. Here is a quick glimpse of it:

I was actually playing a song in the background when I tried the speech-to-text. I was very impressed with the results, except for one part, where I said "with a song playing" and the API transcribed it as "with the song playing"; still, pretty good!

I think the accuracy of these services will only increase with time and continued use.

Some of the key features of Cloud Speech API are:

  • Automatic Speech Recognition (ASR)
  • Global vocabulary
  • Streaming recognition
  • Word hints
  • Real-time or prerecorded audio support
  • Noise robustness
  • Inappropriate content filtering
  • Integrated API

You can read more about Cloud Speech API here: https://cloud.google.com/speech/.
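The sketch below shows one way a recognition request for prerecorded audio stored on Google Cloud Storage might be assembled in Python. The bucket path, API key, hint phrases, and audio settings (16 kHz LINEAR16) are assumptions made for illustration; the sketch targets the v1 `speech:recognize` endpoint and only builds the request.

```python
import json
import urllib.parse
import urllib.request


def build_recognize_request(gcs_uri, api_key, language_code="en-US", hints=None):
    """Build (but do not send) a synchronous speech recognition request."""
    query = urllib.parse.urlencode({"key": api_key})
    url = f"https://speech.googleapis.com/v1/speech:recognize?{query}"
    config = {
        "encoding": "LINEAR16",      # assumed audio encoding
        "sampleRateHertz": 16000,    # assumed sample rate
        "languageCode": language_code,
        "profanityFilter": True,     # inappropriate content filtering
    }
    if hints:
        # Word hints: phrases the recognizer should favor.
        config["speechContexts"] = [{"phrases": list(hints)}]
    body = {"config": config, "audio": {"uri": gcs_uri}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical bucket, key, and hint phrase, for illustration only.
request = build_recognize_request(
    "gs://my-bucket/audio/sample.raw", "YOUR_API_KEY", hints=["Smart Exchange"]
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

For audio uploaded directly in the request, the `audio` object would carry base64-encoded bytes in a `content` field instead of a `uri`.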

Cloud Translation API

Using state-of-the-art Neural Machine Translation, the Cloud Translation service converts text from one language to another.

Translation API is highly responsive, so websites and applications can integrate with Translation API for fast, dynamic translation of source text from the source language to a target language.

Before we continue with Cloud Translation API, I would recommend heading over to https://cloud.google.com/translate/ and trying out the API. Here is a quick glimpse of it, as shown in the following screenshot:

Some of the key features of Cloud Translation API are as follows:

  • Programmatic access – REST API-driven
  • Text translation
  • Language detection
  • Continuous updates

You can read more about Cloud Translation API here: https://cloud.google.com/translate/.
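Here is a small Python sketch of what a translation request could look like. It assumes the v2 REST endpoint with API-key authentication; the key, sample text, and target language are placeholders. Leaving out the source language is intended to let the service detect it.

```python
import json
import urllib.parse
import urllib.request


def build_translate_request(text, target, api_key, source=None):
    """Build (but do not send) a text translation request."""
    query = urllib.parse.urlencode({"key": api_key})
    url = f"https://translation.googleapis.com/language/translate/v2?{query}"
    body = {"q": text, "target": target, "format": "text"}
    if source:
        # Omit "source" to rely on automatic language detection.
        body["source"] = source
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder key and text, for illustration only.
request = build_translate_request("Hello, world", "fr", "YOUR_API_KEY")
print(request.full_url)
print(request.data.decode("utf-8"))
```

The function only constructs the request; handing it to `urllib.request.urlopen` would perform the actual call.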

Cloud Vision API

Fred R. Barnard of Printers' Ink stated, "A picture is worth ten thousand words."

But no one really knows what those words are. Here comes the Google Cloud Vision API to decipher that for us.

Cloud Vision API takes an image as input and returns a description of its contents as text, because it can understand what the image contains. This service can be accessed through a REST API.

Before we continue with Cloud Vision API, I would recommend heading over to https://cloud.google.com/vision/ and trying out the API. Here is a quick glimpse of it as shown in the screenshot:

That is a photo of me when I was going through a trying-to-grow-long-hair phase, and after having fun at the beach. What is important is how the vision service was able to look at the image and detect my mood.

The same service can perform label detection as well as detect web entities related to this image among others.

Some of the key features of this service are:

  • Detecting explicit content
  • Detecting logos and labels
  • Landmark detection
  • Optical character recognition
  • Face detection
  • Image attributes
  • Integrated REST API

To find out more about Cloud Vision API, check this out: https://cloud.google.com/vision/.
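To show how an image might be submitted, the following Python sketch base64-encodes image bytes and builds an annotation request asking for labels, faces, and explicit content. It assumes the v1 `images:annotate` endpoint with API-key authentication; the image bytes and key are placeholders, not a real image.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_annotate_request(image_bytes, api_key, max_results=5):
    """Build (but do not send) an image annotation request."""
    query = urllib.parse.urlencode({"key": api_key})
    url = f"https://vision.googleapis.com/v1/images:annotate?{query}"
    body = {
        "requests": [
            {
                # Image content is sent inline as base64 text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_results},
                    {"type": "FACE_DETECTION", "maxResults": max_results},
                    {"type": "SAFE_SEARCH_DETECTION"},
                ],
            }
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder bytes standing in for a real image file's contents.
sample_bytes = b"not a real image"
request = build_annotate_request(sample_bytes, "YOUR_API_KEY")
print(request.full_url)
print(request.data.decode("utf-8")[:120])
```

In practice, `image_bytes` would come from reading an uploaded file in binary mode, and the response would contain one annotation result per entry in `requests`.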

Cloud Video Intelligence

Cloud Video Intelligence is one of the latest cognitive services released by Google. Cloud Video Intelligence API does almost all the things that the Cloud Vision API can do, but on videos.

This service extracts metadata from a video frame by frame, which lets us search for any moment in the video file.

Before we continue with Cloud Video Intelligence, I would recommend heading over to https://cloud.google.com/video-intelligence/ and trying out the API. Here is a quick glimpse of it, as shown in the screenshot:

I have selected the dinosaur and the bicycle video, and you can see the analysis.

Some of the key features of Cloud Video Intelligence are:

  • Label detection
  • Shot change detection
  • Explicit content detection
  • Video transcription (alpha)
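
The features listed above can be requested together in a single call. The Python sketch below builds such a request for a video stored on Google Cloud Storage; it assumes the v1 `videos:annotate` endpoint with API-key authentication, and the bucket path and key are placeholders.

```python
import json
import urllib.parse
import urllib.request

# Feature names assumed from the v1 REST API.
DEFAULT_FEATURES = (
    "LABEL_DETECTION",
    "SHOT_CHANGE_DETECTION",
    "EXPLICIT_CONTENT_DETECTION",
)


def build_video_request(gcs_uri, api_key, features=DEFAULT_FEATURES):
    """Build (but do not send) a video annotation request."""
    query = urllib.parse.urlencode({"key": api_key})
    url = f"https://videointelligence.googleapis.com/v1/videos:annotate?{query}"
    body = {"inputUri": gcs_uri, "features": list(features)}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical bucket path and key, for illustration only.
request = build_video_request("gs://my-bucket/videos/sample.mp4", "YOUR_API_KEY")
print(request.full_url)
print(request.data.decode("utf-8"))
```

Video analysis takes time, so the service is expected to answer with the name of a long-running operation that must be polled for the final annotations, rather than with the results themselves.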

This concludes the overview of the various services offered as part of the Cloud AI vertical.

In this book, we are going to use a few of these to make a simple web application smart.