Learning Microsoft Cognitive Services - Second Edition

By: Leif Larsen

Overview of this book

Microsoft has revamped Project Oxford and launched the new Cognitive Services platform, a set of 30 APIs that add speech, vision, language, and knowledge capabilities to apps. This book introduces 24 of the APIs released as part of the Cognitive Services platform and shows you how to use them. More importantly, you'll see how these APIs can be combined to build real-world apps with cognitive capabilities. The book is split into three sections: computer vision, speech recognition and language processing, and knowledge and search. It starts with the vision APIs, because they are highly visual and not too complex. The next part covers speech and language, which are somewhat connected. The last part is about adding real-world intelligence to apps by connecting them to the Knowledge and Search APIs. By the end of this book, you will understand what Microsoft Cognitive Services can offer and how to use the different APIs.

Knowing who is speaking


Using the Speaker Recognition API, we can identify who is speaking. We define one or more speaker profiles, each with corresponding voice samples, and the service can then determine whether any of those speakers is the one talking in a given piece of audio.

To be able to utilize this feature, we need to go through a few steps:

  1. We add one or more speaker profiles to the service.
  2. We enroll several spoken samples for each speaker profile.
  3. We call the service to identify a speaker based on audio input.
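
The order of these three steps can be sketched without calling the real service. The following is a minimal in-memory stand-in, not the API itself: the names InMemorySpeakerStore, CreateProfile, Enroll, Identify, and WorkflowDemo are hypothetical and do not come from the Microsoft.ProjectOxford.SpeakerRecognition package. It also assumes that a "sample" is a plain string matched exactly, whereas the real service compares voice characteristics in audio.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the service; NOT the real client.
public class InMemorySpeakerStore
{
    // Maps each profile ID to the samples enrolled for that profile.
    private readonly Dictionary<Guid, List<string>> _profiles =
        new Dictionary<Guid, List<string>>();

    // Step 1: add a speaker profile and return its ID.
    public Guid CreateProfile()
    {
        Guid id = Guid.NewGuid();
        _profiles[id] = new List<string>();
        return id;
    }

    // Step 2: enroll a spoken sample for an existing profile.
    public void Enroll(Guid profileId, string sample)
    {
        if (!_profiles.ContainsKey(profileId))
        {
            throw new ArgumentException("Unknown profile.", nameof(profileId));
        }
        _profiles[profileId].Add(sample);
    }

    // Step 3: identify the speaker among the candidate profiles.
    // Returns null when no candidate has a matching sample.
    // Assumption: exact string matching replaces real voice comparison.
    public Guid? Identify(string sample, IEnumerable<Guid> candidates)
    {
        foreach (Guid id in candidates)
        {
            List<string> samples;
            if (_profiles.TryGetValue(id, out samples) && samples.Contains(sample))
            {
                return id;
            }
        }
        return null;
    }
}

public static class WorkflowDemo
{
    // Runs the three steps in order; returns true if the enrolled
    // speaker is identified and an unknown sample is not.
    public static bool Run()
    {
        var store = new InMemorySpeakerStore();
        Guid alice = store.CreateProfile();            // step 1
        Guid bob = store.CreateProfile();
        store.Enroll(alice, "alice-sample");           // step 2
        store.Enroll(bob, "bob-sample");
        Guid[] candidates = { alice, bob };
        Guid? known = store.Identify("alice-sample", candidates);   // step 3
        Guid? unknown = store.Identify("stranger-sample", candidates);
        return known == alice && unknown == null;
    }
}
```

The point of the sketch is the sequencing: a profile must exist before samples can be enrolled, and identification is always performed against an explicit list of candidate profiles.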

Note

If you have not already done so, sign up for an API key for the Speaker Recognition API at https://portal.azure.com.

Start by adding a new NuGet package to your smart-house application. Search for and add Microsoft.ProjectOxford.SpeakerRecognition.

Add a new class called SpeakerIdentification to the Model folder of your project. This class will hold all the functionality related to speaker identification.

Beneath this class, we add a second class that holds the EventArgs for status updates:

    public class SpeakerIdentificationStatusUpdateEventArgs...
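
For readers unfamiliar with the pattern, here is a minimal sketch of how a custom EventArgs class is typically paired with an event in C#. It is not the book's listing: the class names StatusUpdateEventArgs and StatusPublisher, the Status and Message properties, and the Report method are all assumptions chosen for illustration.

```csharp
using System;

// Hypothetical payload for status updates; Status and Message are assumed names.
public class StatusUpdateEventArgs : EventArgs
{
    public string Status { get; }
    public string Message { get; }

    public StatusUpdateEventArgs(string status, string message)
    {
        Status = status;
        Message = message;
    }
}

// Hypothetical publisher that raises the event whenever it has news.
public class StatusPublisher
{
    public event EventHandler<StatusUpdateEventArgs> OnStatusUpdate;

    public void Report(string status, string message)
    {
        // Null-conditional invoke: does nothing when there are no subscribers.
        OnStatusUpdate?.Invoke(this, new StatusUpdateEventArgs(status, message));
    }
}
```

A subscriber, such as a view model, would attach a handler with `publisher.OnStatusUpdate += (sender, e) => ...;` and read `e.Status` and `e.Message` inside it. Keeping the payload in its own class means more fields can be added later without changing the event's signature.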