Analytics for the Internet of Things (IoT)

By: Andrew Minteer

Overview of this book

We start with the perplexing task of extracting value from huge amounts of barely intelligible data. The data takes a convoluted route just to reach the servers for analysis, but insights can emerge through visualization and statistical modeling techniques. You will learn to extract value from IoT big data using multiple analytic techniques. Next, we review how IoT devices generate data and how the information travels over networks. You’ll get to know strategies to collect and store the data to optimize the potential for analytics, and strategies to handle data quality concerns. Cloud resources are a great match for IoT analytics, so Amazon Web Services, Microsoft Azure, and PTC ThingWorx are reviewed in detail next. Geospatial analytics is then introduced as a way to leverage location information. Combining IoT data with environmental data is also discussed as a way to enhance predictive capability. We’ll also review the economics of IoT analytics, and you’ll discover ways to optimize business value. By the end of the book, you’ll know how to handle scale for both data storage and analytics, how Apache Spark can be leveraged to handle scalability, and how R and Python can be used for analytic modeling.
Preface

The situation


The tense white-yellow of the fluorescent ceiling lights presses down on you while you sit in your cubicle and stare at the monitors on your desk. You sense it is now night outside but can't see over the fabric walls to know for sure. You stare at the long list of filenames on one screen and the plain text rows of opaque sensor data on the other screen.

Your boss has just left to brood angrily somewhere in the office, and you are not sure where. He had been glowering over your shoulder.

"We spent $20 million in telecommunication and consulting fees last year just to get this data! The hardware costs $20 per unit. We've been getting data, and it has been piling up, costing us $10,000 a month. There are 20 TB of files - that's big data, isn't it? And we can't seem to do anything with it?" he had said.

"This is ridiculous!" he continued. "It was supposed to generate $100 million in new revenue. Where is our first dollar? Why can't you do anything with it? I have five consultants a week calling me to tell me they can handle it - they'll even automate it. Maybe we should just pick one and hope they aren't selling us snake oil."

You know he does not really blame you. You were a whiz with Excel and knew how to query databases. A lot of analytics requests went to you. When the CEO decided the company needed a big data guy, they hired a VP out of Silicon Valley. But the new VP ended up taking a position with a different Silicon Valley company the day before he was supposed to start at your company.

You were hastily moved into the new analytics group. A group of one - you. It was to be a temporary shift until another VP was found. That was six months ago. The company is freezing funds for outside training and revenues are looking tight. So, no training for you.

Although many know the terms, no one in the company actually understands what Hadoop is or how to even start using this thing called machine learning. But more and more, others seem to expect you not only to know it but to be doing it already.

Executives have been reading articles in HBR and Forbes about the huge potential of the IoT combined with artificial intelligence (AI). They feel like the company will be left behind, and soon, if it does not have its own IoT big data solution incorporating AI. Your boss is feeling the pressure. Executives have several ideas for him about where AI can be used. They seem to think that getting the idea is the hard part and that implementation should be easy. Your boss is worried about his job, and that worry rolls downhill to you.

Your screen on the left looks like this:

The list goes on and on for several pages. You have been able to combine several files and do some pivot tables and charting in Excel. But it takes a lot of your time, and you can only realistically handle a month or two's worth of data. The questions are coming in faster than you can answer them. You and your boss have been talking about bringing in temps to do the work - they don't really need to understand it, just follow the steps that you outline for them.

Your screen on the right looks like this:

Your IT department has been consolidating lots of little files into several very large ones. The filesystem was being overloaded by the number of files, so the solution was to consolidate. Unfortunately for you, many files are now too large to open in Excel, which limits what you can do with them. You end up doing more analytics on recent data simply because it is much easier (the files are still small).

Looking at the data rows, it is not obvious what you can do with them beyond sums and averages. The files are too big to do a VLOOKUP in Excel against something like your production records - which are stored in files that are often too big to even open in Excel.

At this point, you can't begin to think how you would apply machine learning to this data. You are not quite sure what it even means. You know the data is difficult to manipulate for anything beyond recent datasets. Surely, data covering long periods of time would be needed to extract value out of it.

You hear a cough from behind you. Your boss is back.

He says quietly and stiffly, "I'm sorry. We're going to have to hire a consultant to take this over. I know how hard you've been working. You've done some amazing things considering the limitations, and nobody appreciates that enough. But I have to show results. It will probably take a month or two to fully bring someone on board. In the meantime, just keep at it - maybe we can make a breakthrough before then."

Your heart sinks. You are convinced there is huge value in the connected device data. You feel like you could make a career out of IoT analytics if you could just figure out how to get there. But you are not a quitter.

You decide you will not go down without a fight; you will find a way.