Analytics for the Internet of Things (IoT)

By: Andrew Minteer

Overview of this book

We start with the perplexing task of extracting value from huge amounts of barely intelligible data. The data takes a convoluted route just to be on the servers for analysis, but insights can emerge through visualization and statistical modeling techniques. You will learn to extract value from IoT big data using multiple analytic techniques. Next we review how IoT devices generate data and how the information travels over networks. You’ll get to know strategies to collect and store the data to optimize the potential for analytics, and strategies to handle data quality concerns. Cloud resources are a great match for IoT analytics, so Amazon Web Services, Microsoft Azure, and PTC ThingWorx are reviewed in detail next. Geospatial analytics is then introduced as a way to leverage location information. Combining IoT data with environmental data is also discussed as a way to enhance predictive capability. We’ll also review the economics of IoT analytics and you’ll discover ways to optimize business value. By the end of the book, you’ll know how to handle scale for both data storage and analytics, how Apache Spark can be leveraged to handle scalability, and how R and Python can be used for analytic modeling.

Defining IoT analytics


In order to understand IoT analytics, it is helpful to separate it out and define both analytics and the IoT. This will help frame the discussion for the rest of the book.

Defining analytics

If you ask a hundred people to define analytics, you are likely to get a hundred different answers. Each person tends to have their own definition in mind, ranging from static reports to advanced deep learning expert systems. All of them tend to label any effort within this wide-ranging territory as analytics, without much further explanation.

We will take a fairly broad definition in this book, as we are covering quite a bit of territory. In their best-selling book Competing on Analytics, Tom Davenport and Jeanne Harris created a scale, which they called Analytics Maturity. Companies progress to higher levels on the scale as their use of analytics matures, and they begin to compete with other companies by leveraging it.

When we use the word analytics, we will mean techniques that fall in the range from query/drill down to optimization, as shown in the following chart from Competing on Analytics:

We will also take a slightly different philosophy. Unlike the notion of a company progressing through each level to get to the peak of maturity at the upper right with optimization, we will strive to reach success at all levels in parallel.

The idea that a company is not analytically mature unless it is actively employing optimization models at every turn can be dangerous. It puts pressure on a company to focus time and resources where there may be no return on investment (ROI). Since resources are always limited, this could also cause the company to under-invest in projects in other areas that have a higher ROI.

The reason for the lack of ROI is often that a company simply does not have the right data to take full advantage of the more advanced techniques. This may be through no fault of its own, as the signal in the noise may be just too weak to tease out. It could also stem from the state of technology, which may not yet be at the point where the key predictive data can even be monitored. Or, even if monitoring is possible, the data may be far too expensive to justify capturing. We will talk about the limitations of available data quite a bit in this book. The goal will always be to maximize ROI at all levels of the maturity model.

We will also take the view that analytics maturity is about having the capability and knowing how to enable the full scale. It is not about what you are doing. It is about what you are capable of doing in order to maximize your sum total ROI across the full scale. Each level can be exploited if an opportunity is spotted. And we want there to be fertile ground for opportunities across the full scale. More about this will be covered throughout the book.

Defining the Internet of Things

Sensors have been tracking data for decades at manufacturing plants, retail stores, and remote oil and gas equipment. Why, all of a sudden, is there this IoT hype all over the media?

The dramatic decreases in sensor and bandwidth costs, the spread of cellular coverage, and the rise of cloud computing all combine to create fertile conditions for easily connecting devices over the internet. For example, as shown in the following graph, Goldman Sachs predicts an average sensor cost in 2020 of under $0.40 (USD), 30% of what it was in 2004. Whether all these devices should be connected or not is hotly debated:

Data source: Goldman Sachs, BI estimates
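
As a quick check on what those two figures imply, the arithmetic can be sketched in a few lines of Python. This is only an illustration: it takes the $0.40 and 30% values quoted above at face value (the cost is stated as an upper bound), and it assumes a constant year-over-year rate of decline, which is a simplification that does not come from the Goldman Sachs data. The function and variable names are hypothetical.

```python
# Figures as quoted in the text; the 2020 cost is stated as an upper bound.
COST_2020_USD = 0.40
SHARE_OF_2004_COST = 0.30
YEARS = 2020 - 2004


def implied_base_cost(cost_now: float, share_of_base: float) -> float:
    """Back out the earlier cost from the later cost and its share of the earlier one."""
    return cost_now / share_of_base


def implied_annual_decline(share_of_base: float, years: int) -> float:
    """Average yearly decline, ASSUMING the cost falls by the same fraction each year."""
    return 1.0 - share_of_base ** (1.0 / years)


cost_2004 = implied_base_cost(COST_2020_USD, SHARE_OF_2004_COST)
annual_decline = implied_annual_decline(SHARE_OF_2004_COST, YEARS)

print(f"Implied 2004 sensor cost: about ${cost_2004:.2f}")
print(f"Implied average annual decline: about {annual_decline:.1%}")
```

Under these assumptions, the quoted figures correspond to a 2004 cost of roughly $1.33 and a decline of a little over 7% per year, sustained for 16 years.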

The definitions of IoT seem to vary quite a bit; some include machine sensors only, others include RFID tags and smartphones.

We will use this definition from Forrest Stroud on Webopedia:

The IoT refers to the ever-growing network of physical objects that feature an IP address for internet connectivity, and the communication that occurs between these objects and other internet-enabled devices and systems.

Or to get even more basic: stuff that talks to other stuff over the internet without requiring you to do anything. This clears it up, right?

Even the number of things projected to be connected by 2020 varies widely. Some sources project 20.8 billion devices; others project up to 50 billion, more than twice as many.

For our purposes, we are more concerned with how to analyze the data generated than we are about the scope of devices that should be considered part of the IoT. If something sends data remotely by way of the internet, it is fair game for us, especially if it is machine-generated on one end and machine-consumed on the other.

We are more concerned with how to extract value from the data and adapt to the circumstances inherent to it. IoT is not really new, as elements of it have been developing for decades. Remote detection of oil well spills was happening in the 1970s. GPS-based vehicle telematics has been around for 20 years. IoT is also not a separate market; it blends into current products and processes. Although much of the media reports on it as if it were a different animal (perhaps even the author of this book: guilty as charged?), you should not think of it this way.

The concept of constrained

The term constrained is an important concept in understanding IoT devices, their data, and the impacts on analytics. It refers to the limited battery power, bandwidth, and hardware capability that have to be considered in the design of IoT devices. For many IoT use cases, one or more of these constraints has to be balanced against the need to record useful data.
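
To make the trade-off concrete, here is a minimal sketch in Python of how reporting frequency pulls against battery life and bandwidth. It is not taken from the book: the duty-cycle model is a common back-of-the-envelope approximation, and every name and number in it (`DeviceBudget`, the current draws, the payload size) is a hypothetical value chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class DeviceBudget:
    """Hypothetical constrained device; all values are illustrative assumptions."""
    battery_mah: float      # usable battery capacity, in milliamp-hours
    sleep_ma: float         # current draw while asleep, in milliamps
    active_ma: float        # current draw while sampling and transmitting
    active_seconds: float   # time spent awake for each report
    payload_bytes: int      # size of each report sent over the network


def battery_life_days(device: DeviceBudget, interval_s: float) -> float:
    """Estimate battery life from the average current of a simple duty cycle."""
    if interval_s < device.active_seconds:
        raise ValueError("reporting interval is shorter than the time awake")
    duty = device.active_seconds / interval_s
    average_ma = device.active_ma * duty + device.sleep_ma * (1.0 - duty)
    return device.battery_mah / average_ma / 24.0


def daily_bytes(device: DeviceBudget, interval_s: float) -> float:
    """Bandwidth used per day at a given reporting interval."""
    return device.payload_bytes * (86_400 / interval_s)


if __name__ == "__main__":
    sensor = DeviceBudget(battery_mah=2000, sleep_ma=0.01, active_ma=100,
                          active_seconds=2, payload_bytes=50)
    # Compare reporting every minute, every 10 minutes, and every hour.
    for interval in (60, 600, 3600):
        print(f"every {interval:>4} s: "
              f"{battery_life_days(sensor, interval):8.1f} days of battery, "
              f"{daily_bytes(sensor, interval):9.0f} bytes/day")
```

In this simplified model, reporting less often extends battery life and reduces bandwidth, but it also yields fewer data points for analysis, which is the balance the term constrained describes.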