Learning Microsoft Cognitive Services - Third Edition

By: Leif Larsen

Overview of this book

Microsoft Cognitive Services is a set of APIs for integrating artificial intelligence into your applications to solve logical business problems. If you’re new to developing applications with AI, Learning Microsoft Cognitive Services will give you a comprehensive introduction to Microsoft’s AI stack and get you up to speed in no time. The book introduces you to 24 APIs, including Emotion, Language, Vision, Speech, Knowledge, and Search. Using Visual Studio, you can develop applications with enhanced capabilities for image processing, speech recognition, text processing, and much more. Moving forward, you will work with datasets that enable your applications to process data in the form of images, video, or text. By the end of the book, you’ll be able to confidently explore Cognitive Services APIs for building intelligent applications that can be deployed for real-world business uses.

Knowing who is speaking


Using the Speaker Recognition API, we can identify who is speaking. By defining one or more speaker profiles, each with corresponding voice samples, we can determine whether any of those speakers is talking in a given piece of audio.

To be able to utilize this feature, we need to go through a few steps:

  1. We need to add one or more speaker profiles to the service.

  2. We need to enroll several spoken samples for each speaker profile.

  3. We call the service to identify a speaker based on audio input.
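
The order of these three steps can be sketched without touching the real service. The following is a minimal illustration only: `InMemorySpeakerService` and its methods are hypothetical names invented for this sketch, not part of the Microsoft.ProjectOxford.SpeakerRecognition package, and plain strings stand in for audio samples. Matching is done by exact string comparison, which is an assumption made purely to keep the example runnable.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the speaker identification service.
// It is NOT the Microsoft.ProjectOxford.SpeakerRecognition API; it only
// mirrors the order of the three steps listed above.
public class InMemorySpeakerService
{
    // Maps each profile ID to the samples enrolled for it.
    private readonly Dictionary<Guid, List<string>> _profiles =
        new Dictionary<Guid, List<string>>();

    // Step 1: add a speaker profile and return its ID.
    public Guid CreateProfile()
    {
        Guid id = Guid.NewGuid();
        _profiles[id] = new List<string>();
        return id;
    }

    // Step 2: enroll a spoken sample for an existing profile.
    // A string stands in for the audio (assumption for this sketch).
    public void Enroll(Guid profileId, string sample)
    {
        if (!_profiles.ContainsKey(profileId))
            throw new ArgumentException("Unknown profile.", nameof(profileId));
        _profiles[profileId].Add(sample);
    }

    // Step 3: identify the speaker among the candidate profiles.
    // Returns null when none of the candidates has a matching sample.
    public Guid? Identify(string audio, IEnumerable<Guid> candidateIds)
    {
        foreach (Guid id in candidateIds)
        {
            List<string> samples;
            if (_profiles.TryGetValue(id, out samples) && samples.Contains(audio))
                return id;
        }
        return null;
    }
}
```

A caller would create one profile per person, enroll a few samples for each, and then pass the incoming audio together with the candidate profile IDs to `Identify`. A `null` result means that none of the enrolled speakers was recognized.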

If you have not already done so, sign up for an API key for the Speaker Recognition API at https://portal.azure.com.

Start by adding a new NuGet package to your smart-house application. Search for and add Microsoft.ProjectOxford.SpeakerRecognition.

Add a new class called SpeakerIdentification to the Model folder of your project. This class will hold all of the functionality related to speaker identification.

Beneath this class, we will add another class that contains the EventArgs for status updates:

    public class SpeakerIdentificationStatusUpdateEventArgs...