Kinect for Windows SDK Programming Guide

By: Abhijit Jana

Overview of this book

Kinect has been a game-changer in the world of motion games and applications since its first release. It has been touted as a controller for Microsoft Xbox, but it is much more than that. The Kinect for Windows SDK provides developers with the tools to build applications that run on Windows, including applications that make interaction with your computer hands-free. This book focuses on developing applications using the Kinect for Windows SDK. It works through the different features of the SDK end to end, with step-by-step guidance. The book will also help you develop motion-sensitive and speech-recognition-enabled applications, and you will learn about building applications using multiple Kinects.

The book begins by explaining the different components of Kinect and then moves on to setting up the device and getting the development environment ready. You will be surprised at how quickly the book takes you through the details of the Kinect APIs. You will use the NUI (natural user interface) APIs for natural inputs such as skeleton tracking, sensing, and speech recognition. You will capture different types of streams and images, handle stream events, and capture frames. The Kinect device contains a motorized tilt to control sensor angles, and you will learn how to adjust it automatically. The last part of the book teaches you how to build applications using multiple Kinects and discusses how Kinect can be integrated with other devices such as Windows Phone and microcontrollers.

Components of Kinect for Windows

Kinect is a horizontal device with depth sensors, a color camera, and a set of microphones, all secured inside a small, flat box. The flat box is attached to a small motor in the base that enables the device to be tilted up and down. The Kinect sensor includes the following key components:

Color camera
Infrared (IR) emitter
IR depth sensor
Tilt motor
Microphone array
LED

Apart from the previously mentioned components, the Kinect device also has a power adapter for external power supply and a USB adapter to connect with a computer. The following figure shows the different components of a Kinect sensor:

Inside the Kinect sensor

From the outside, the Kinect sensor appears to be a plastic case with three cameras visible, but it has very sophisticated components, circuits, and algorithms embedded. If you remove the black plastic cover from the Kinect device, you will see the hardware components that make the Kinect sensor work.

The following image shows a front view of a Kinect sensor that's been unwrapped from its black case. Take a look (from left to right) at its IR emitter, color camera, and IR depth sensor:

Let's move on and discuss each component.

The color camera

The color camera is responsible for capturing and streaming the color video data. Its function is to detect the red, green, and blue colors from the source. The stream of data returned by the camera is a succession of still image frames. The Kinect color stream supports 30 frames per second (FPS) at a resolution of 640 x 480 pixels, and a maximum resolution of 1280 x 960 pixels at up to 12 FPS. The frame rate can vary depending on the resolution used for the image frame.
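To get a feel for how resolution and frame rate trade off, the following is a minimal sketch that estimates the uncompressed data rate of the two color modes mentioned above. It assumes 4 bytes per pixel, which the text does not state, and the function names are illustrative only; they are not part of the Kinect for Windows SDK.

```python
# Rough data-rate estimate for the two color modes described above.
# ASSUMPTION: 4 bytes per pixel (a 32-bit pixel format); the text does
# not state the pixel format, so treat the results as illustrative.
BYTES_PER_PIXEL = 4


def frame_bytes(width: int, height: int, bytes_per_pixel: int = BYTES_PER_PIXEL) -> int:
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel


def stream_bytes_per_second(width: int, height: int, fps: int) -> int:
    """Uncompressed throughput of a stream of still frames."""
    return frame_bytes(width, height) * fps


if __name__ == "__main__":
    for width, height, fps in [(640, 480, 30), (1280, 960, 12)]:
        rate = stream_bytes_per_second(width, height, fps)
        print(f"{width} x {height} at {fps} FPS: {rate / 1_000_000:.1f} MB/s")
```

Under that assumption, the 1280 x 960 mode still moves more bytes per second than the 640 x 480 mode even at the lower frame rate, which is one reason to pick the smallest resolution an application can work with.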

The viewable range for the Kinect cameras is 43 degrees vertical by 57 degrees horizontal. The following figure shows an illustration of the viewable range of the Kinect camera:
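The viewing angles translate into a visible area that grows with distance. The sketch below works this out with basic trigonometry, assuming an idealized camera pointed straight at a flat wall; `visible_extent` is a hypothetical helper, not an SDK function.

```python
import math

# Viewing angles quoted in the text.
HORIZONTAL_FOV_DEG = 57.0
VERTICAL_FOV_DEG = 43.0


def visible_extent(distance_m: float, fov_deg: float) -> float:
    """Width (or height) of the visible area at a given distance.

    ASSUMPTION: an ideal camera facing a flat surface head-on, so the
    extent is 2 * d * tan(fov / 2).
    """
    return 2.0 * distance_m * math.tan(math.radians(fov_deg / 2.0))


if __name__ == "__main__":
    distance = 2.0  # meters, an arbitrary example distance
    width = visible_extent(distance, HORIZONTAL_FOV_DEG)
    height = visible_extent(distance, VERTICAL_FOV_DEG)
    print(f"At {distance} m: {width:.2f} m wide by {height:.2f} m tall")
```

At 2 meters this works out to an area roughly 2.2 meters wide by 1.6 meters tall, which gives a sense of how far back a person needs to stand to fit in the frame.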

The following image shows a color image that was captured using Kinect color sensors with a resolution of 640 x 480 pixels:

IR emitter and IR depth sensor

The Kinect depth sensor consists of an IR emitter and an IR depth sensor, which work together. The IR emitter may look like a camera from the outside, but it's an IR projector that constantly emits infrared light in a "pseudo-random dot" pattern over everything in front of it. These dots are normally invisible to us, but the IR depth sensor can capture them. The dotted light reflects off different objects, and the IR depth sensor reads the reflected dots and converts them into depth information by measuring the distance between the sensor and the object from which each IR dot was read. The following figure shows how the overall depth sensing looks:

Note

These infrared dots can actually be seen. All you need is a night vision camera or night vision goggles.

The depth data stream supports resolutions of 640 x 480 pixels, 320 x 240 pixels, and 80 x 60 pixels, and the sensor's viewable range is the same as that of the color camera.

The following image shows depth images that are captured from the depth image stream:

How depth data processing works

The Kinect sensor has the ability to capture a raw, 3D view of the objects in front of it, regardless of the lighting conditions of the room. It uses an infrared (IR) emitter and an IR depth sensor that is a monochrome CMOS (Complementary Metal-Oxide-Semiconductor) sensor. The technology behind this comes from PrimeSense, and the following diagram shows how it works:

The sequence explained in the diagram is as follows:

When there is a need to capture depth data, the PrimeSense chip sends a signal to the infrared emitter to turn on the infrared light (1), and sends another signal to the IR depth sensor to initiate depth data capture from the current viewable range of the sensor (2). The IR emitter meanwhile starts sending infrared light, invisible to human eyes (3), to the objects in front of the device. The IR depth sensor starts reading the infrared data from the object based on the distance of the individual light points of reflection (4) and passes it to the PrimeSense chip (5). The PrimeSense chip then analyzes the captured data, creates a per-frame depth image, and passes it to the output depth stream (6).
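The end result of the sequence above is a depth image: a grid in which each pixel holds a distance rather than a color. The following sketch models such a frame as a plain list of rows and finds the closest point in it. The units (millimeters), the use of 0 for "no reading", and all names are assumptions made for illustration; they do not describe the SDK's actual frame format.

```python
from typing import List, Optional, Tuple

# ASSUMPTION: a depth frame is modeled as rows of per-pixel distances
# in millimeters. This is an illustrative structure, not the SDK's format.
DepthFrame = List[List[int]]

# ASSUMPTION: 0 marks a pixel where no reflected IR dot was read.
NO_READING = 0


def nearest_point(frame: DepthFrame) -> Optional[Tuple[int, int, int]]:
    """Return (row, column, distance_mm) of the closest valid pixel, or None."""
    best: Optional[Tuple[int, int, int]] = None
    for row_index, row in enumerate(frame):
        for col_index, distance in enumerate(row):
            if distance == NO_READING:
                continue  # skip pixels without a depth reading
            if best is None or distance < best[2]:
                best = (row_index, col_index, distance)
    return best


if __name__ == "__main__":
    # A tiny hypothetical 3 x 3 frame; real frames are 640 x 480 and so on.
    sample = [
        [0, 2100, 2050],
        [1800, 0, 950],
        [1200, 1190, 3000],
    ]
    print(nearest_point(sample))
```

Treating the frame as a grid of distances is what makes tasks such as "find the closest object" a simple scan over pixels.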

Note

The IR emitter emits electromagnetic radiation. Its wavelength is longer than that of visible light, which makes the sensor's IR light invisible. The wavelength needs to be consistent to minimize the noise within the captured data. Heat generated by the laser diode when the Kinect sensor is running can affect the wavelength. The Kinect sensor has a small, inbuilt fan to normalize the temperature and ensure that the wavelength stays consistent.

Tilt motor

The base and the body of the sensor are connected by a tiny motor. It is used to change the camera and sensor angles, to get the correct position of the human skeleton within the room. The following image shows the motor along with three gears that enable the sensor to tilt within a specified range of angles:

The motor can tilt the sensor vertically by up to 27 degrees, which means that the Kinect sensor's angle can be shifted upwards or downwards by 27 degrees. The following figure shows an illustration of the angle being changed when the motor is tilted:

Note

Do not physically force the device into a specific angle. The Kinect for Windows SDK has a few specific APIs that can help us control the sensor's motor tilting. Do not tilt the Kinect motor frequently; use this as few times as possible and only when it's required.
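The two constraints above, a tilt range of 27 degrees up or down and moving the motor as rarely as possible, can be sketched as a small guard placed in front of whatever call actually moves the motor. The helper names and the `min_change_deg` threshold are hypothetical and are not part of the Kinect for Windows SDK.

```python
from typing import Optional

# Tilt limits taken from the text: 27 degrees up or down.
MAX_TILT_DEG = 27
MIN_TILT_DEG = -27


def clamp_tilt(requested_deg: int) -> int:
    """Limit a requested angle to the range the motor supports."""
    return max(MIN_TILT_DEG, min(MAX_TILT_DEG, requested_deg))


def next_tilt_command(current_deg: int, requested_deg: int,
                      min_change_deg: int = 2) -> Optional[int]:
    """Return the angle to send to the motor, or None if no move is needed.

    ASSUMPTION: min_change_deg is an illustrative threshold used to avoid
    moving the motor for negligible adjustments.
    """
    target = clamp_tilt(requested_deg)
    if abs(target - current_deg) < min_change_deg:
        return None  # close enough; leave the motor alone
    return target


if __name__ == "__main__":
    print(next_tilt_command(0, 45))   # request beyond the limit is clamped
    print(next_tilt_command(10, 11))  # tiny change is skipped
```

A guard like this keeps out-of-range requests away from the device and avoids wearing the motor with small, repeated adjustments.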

Microphone array

The Kinect device exhibits great support for audio with the help of a microphone array. The microphone array consists of four different microphones that are placed in a linear order (three of them are spread on the right side and the other one is placed on the left side, as shown in the following image) at the bottom of the Kinect sensor:

The purpose of the microphone array is not just to let the Kinect device capture sound but also to locate the direction of the audio wave. The main advantage of having an array of microphones over a single microphone is that capturing and recognizing the voice is done more effectively, with enhanced noise suppression, echo cancellation, and beam-forming technology. This enables Kinect to act as a highly directional microphone that can identify the source of the sound and recognize the voice irrespective of the noise and echo present in the environment.
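One general way a microphone array can estimate where a sound comes from is to compare when the same sound reaches two microphones. The sketch below illustrates that idea for a single pair of microphones and a distant source. It is a textbook time-difference-of-arrival calculation, not a description of Kinect's actual audio pipeline, and the microphone spacing used in the example is an assumed value.

```python
import math

# Approximate speed of sound in air at room temperature.
SPEED_OF_SOUND_M_S = 343.0


def arrival_angle_deg(delay_s: float, mic_spacing_m: float) -> float:
    """Estimate the direction of a distant sound source from two microphones.

    delay_s is the difference in arrival time between the two microphones;
    its sign indicates which microphone heard the sound first.
    Returns the angle in degrees away from straight ahead (0 = centered).
    """
    ratio = SPEED_OF_SOUND_M_S * delay_s / mic_spacing_m
    # Measurement noise can push the ratio slightly outside [-1, 1].
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    spacing = 0.2  # meters; ASSUMED spacing, not a Kinect specification
    for delay in (0.0, 0.0002, -0.0002):
        angle = arrival_angle_deg(delay, spacing)
        print(f"delay {delay * 1000:+.1f} ms -> {angle:+.1f} degrees")
```

With more than two microphones, several such pairwise estimates can be combined, which is part of why an array can locate a source better than a single microphone.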

LED

An LED is placed between the camera and the IR projector. It indicates the status of the Kinect device. A green LED indicates that the Kinect device drivers have loaded properly. When you plug Kinect into a computer, the LED turns green once your system detects the device; however, for full functionality, you also need to plug the device into an external power source.
