
Designing Production-Grade and Large-Scale IoT Solutions

By: Mohamed Abdelaziz
Overview of this book

With the rising demand for and recent enhancements in IoT, a developer with sound knowledge of IoT is the need of the hour. This book will help you design, build, and operate large-scale E2E IoT solutions to transform your business and products, increase revenue, and reduce operational costs. Starting with an overview of how IoT technologies can help you solve your business problems, this book will be a useful guide to helping you implement end-to-end IoT solution architecture. You'll learn to select IoT devices; real-time operating systems; IoT Edge covering Edge location, software, and hardware; and the best IoT connectivity for your IoT solution. As you progress, you'll work with IoT device management, IoT data analytics, IoT platforms, and put these components to work as part of your IoT solution. You'll also be able to build IoT backend cloud from scratch by leveraging the modern app architecture paradigms and cloud-native technologies such as containers and microservices. Finally, you'll discover best practices for different operational excellence pillars, including high availability, resiliency, reliability, security, cost optimization, and high performance, which should be applied for large-scale production-grade IoT solutions. By the end of this IoT book, you'll be confident in designing, building, and operating IoT solutions.
Table of Contents (15 chapters)
Section 1: Anatomy of IoT
Section 2: The IoT Backend (aka the IoT Cloud)
Section 3: IoT Application Architecture Paradigms and IoT Operational Excellence

IoT solution design patterns

If you have written software, you have probably heard of software design patterns, in particular the patterns introduced by the famous Gang of Four (GoF) book, Design Patterns: Elements of Reusable Object-Oriented Software.

The idea behind design patterns, in any domain, is to provide a solution that different experts have tested and proven against specific recurring problems in that domain. For example, in the software domain, patterns such as Factory and Singleton solve common coding challenges. In the enterprise integration domain, you will see patterns such as Publish-Subscribe, Event-Driven, Content-Based Router, and Message Filter. In the cloud, you will see patterns such as Ambassador, Anti-Corruption Layer, Cache-Aside, and Command Query Responsibility Segregation (CQRS).

In the IoT domain, we do have some design patterns and design principles that are commonly used in different IoT solutions. Let's look at some of these.


Telemetry

This is the best-known and most common pattern in IoT solutions. IoT devices sense the physical world and send the related data to the IoT Cloud for further processing. The data sent from devices is called telemetry, or telematics data.

To send the data to the IoT Cloud, you need connectivity and a communication protocol. Different protocols may be employed, including HTTP(S), WebSocket, and MQTT.

For telemetry, MQTT is the most common protocol, and usually a good fit, for sending device data to the IoT Cloud or, to be more specific, to the IoT Cloud message broker or an Edge device gateway running an MQTT server.
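To make the telemetry flow concrete, here is a minimal sketch of how a device might package a reading and hand it to a publish function. The topic layout (devices/<id>/telemetry), the JSON field names, and the publish callable are assumptions made for illustration; a real device would pass in the publish method of whichever MQTT client library it uses, and each IoT platform defines its own topic scheme.

```python
import json
import time
from typing import Callable, Optional


def telemetry_topic(device_id: str) -> str:
    # Hypothetical topic layout; real platforms define their own.
    return f"devices/{device_id}/telemetry"


def build_telemetry(device_id: str, readings: dict,
                    ts: Optional[float] = None) -> bytes:
    # Wrap the sensor readings with the device identity and a timestamp,
    # then serialize to JSON bytes ready to be sent as a message payload.
    message = {
        "device_id": device_id,
        "ts": time.time() if ts is None else ts,
        "readings": readings,
    }
    return json.dumps(message).encode("utf-8")


def send_telemetry(publish: Callable[[str, bytes], None], device_id: str,
                   readings: dict, ts: Optional[float] = None):
    # `publish` stands in for an MQTT client's publish call (an assumption).
    topic = telemetry_topic(device_id)
    payload = build_telemetry(device_id, readings, ts)
    publish(topic, payload)
    return topic, payload


if __name__ == "__main__":
    outbox = []  # in-memory stand-in for a broker connection
    send_telemetry(lambda t, p: outbox.append((t, p)),
                   "pump-17", {"temp_c": 21.5})
    print(outbox)
```

Keeping the payload builder separate from the transport means the same message could be sent over HTTP(S) or WebSocket instead, if the solution calls for it.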

This telemetry pattern lets many companies build and offer IoT SaaS or ready-made solutions, which enterprises buy to get their products connected quickly and start gaining useful insights from them. The idea is simple: connect the device to an IoT platform and start sending telemetry, and the platform provides dashboards, monitoring, alerting, reporting, and analytics out of the box, based on the telemetry collected. Microsoft Azure IoT Central is one such IoT application platform, offering ready-made IoT solutions that require little development skill.


Command

This pattern is also very common in IoT solutions and usually comes side by side with the telemetry pattern. The command, or action, pattern means running commands on a remote IoT device, such as reboot, reset, switch on, and switch off. Upgrading the device firmware is also considered part of the command pattern.

It is usually best practice to have control and management of the remote IoT device for monitoring, diagnostics, and security.
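On the device side, the command pattern often boils down to mapping a command name to a handler and reporting the result. The following is a minimal sketch under assumed names: the RemoteDevice class, the command names, and the message shape ({"name": ..., "args": ...}) are all hypothetical and not taken from any specific IoT platform.

```python
class RemoteDevice:
    """Toy device that executes named commands (illustrative only)."""

    def __init__(self) -> None:
        self.powered_on = False
        self.firmware = "1.0.0"
        self.reboots = 0

    def _switch_on(self, args: dict) -> None:
        self.powered_on = True

    def _switch_off(self, args: dict) -> None:
        self.powered_on = False

    def _reboot(self, args: dict) -> None:
        self.reboots += 1

    def _upgrade_firmware(self, args: dict) -> None:
        # A real upgrade would download and verify an image first.
        self.firmware = args["version"]

    def handle_command(self, command: dict) -> dict:
        # Look up the handler by command name; reject anything unknown
        # rather than guessing, since commands change device behavior.
        handlers = {
            "switch_on": self._switch_on,
            "switch_off": self._switch_off,
            "reboot": self._reboot,
            "upgrade_firmware": self._upgrade_firmware,
        }
        name = command.get("name")
        handler = handlers.get(name)
        if handler is None:
            return {"name": name, "status": "rejected",
                    "reason": "unknown command"}
        handler(command.get("args", {}))
        return {"name": name, "status": "ok"}


if __name__ == "__main__":
    device = RemoteDevice()
    print(device.handle_command({"name": "switch_on"}))
    print(device.handle_command({"name": "upgrade_firmware",
                                 "args": {"version": "1.1.0"}}))
    print(device.handle_command({"name": "self_destruct"}))
```

Returning an explicit status for every command gives the backend something to monitor, which supports the diagnostics goal mentioned above.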

Outbound connectivity only

IoT devices are not like traditional web or proxy servers, where you can open inbound ports such as 80 or 443 for incoming traffic from the internet or from a private network. Opening such inbound ports on traditional, powerful servers can be acceptable, as there are usually firewalls, Distributed Denial-of-Service (DDoS) protections, and Web Application Firewall (WAF) solutions in place to protect those servers, or the whole server farm, from attacks.

In the IoT, it is slightly different. IoT endpoint devices often do not run an IP stack, meaning no IP address is assigned to them, because they are low-power, constrained, low-cost devices. Even for powerful IoT endpoint devices that do run an IP stack, it is recommended not to open inbound ports, in order to reduce the attack surface. A security breach in an IoT solution can be more dangerous than a breach in an e-commerce application or another IT solution, because it can severely affect physical systems and people as well.

So, the most common IoT device communication mode is the outbound communication mode (aka device to cloud), where the device will make an outbound call to the external network instead of receiving inbound calls (aka cloud to device) from external networks.

You might wonder, then, how the command design pattern mentioned above works. As explained, a command means sending instructions to an IoT device to perform some action, which implies inbound traffic from your application or system toward the IoT device. So how can we say that IoT device connectivity is usually, or is recommended to be, outbound?

This is a very good question, and the answer is simple. The IoT device makes an outbound call to ask whether any actions or instructions are waiting for it. As we will learn through the book, IoT devices might be in deep sleep mode most of the time; when they wake up, they tell the IoT Cloud or the IoT Edge, "Hey, I am up now. Any jobs for me?" The IoT device then subscribes to a specific topic for such jobs, retrieves the instructions, and executes them.

The next pattern is closely related to this point.
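The "any jobs for me?" exchange can be sketched as a device-initiated fetch against a per-device queue held in the cloud. This is only an in-memory illustration: CloudJobs, fetch_pending, and wake_cycle are hypothetical names, and in a real solution the fetch would be an outbound MQTT subscription or HTTPS request rather than a direct method call.

```python
from collections import defaultdict, deque
from typing import Callable, List


class CloudJobs:
    """In-memory stand-in for the cloud side that queues jobs per device."""

    def __init__(self) -> None:
        self._pending = defaultdict(deque)

    def enqueue(self, device_id: str, job: str) -> None:
        # The application queues a job; nothing is pushed to the device.
        self._pending[device_id].append(job)

    def fetch_pending(self, device_id: str) -> List[str]:
        # Called BY the device over its outbound connection.
        jobs = list(self._pending[device_id])
        self._pending[device_id].clear()
        return jobs


def wake_cycle(device_id: str, cloud: CloudJobs,
               execute: Callable[[str], None]) -> int:
    # The device wakes up, asks for its jobs, runs them in order,
    # and can then go back to sleep. Returns the number of jobs run.
    jobs = cloud.fetch_pending(device_id)
    for job in jobs:
        execute(job)
    return len(jobs)


if __name__ == "__main__":
    cloud = CloudJobs()
    cloud.enqueue("sensor-1", "reboot")
    cloud.enqueue("sensor-1", "upgrade_firmware")
    print(wake_cycle("sensor-1", cloud, print))
```

Note that the cloud never opens a connection toward the device: commands simply wait until the device asks for them.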

Device twin or device shadow

Connectivity will certainly drop at some point, and you should always account for that when designing an IoT solution. IoT devices might be deployed in remote locations with no or weak network coverage, or they might experience intermittent connectivity for many reasons, such as electronic or radio interference from surrounding devices, or the device being in deep sleep mode to save battery and therefore not connected all the time.

The solution to this intermittent connectivity challenge is an intermediate system, or buffer, that holds and manages the communication between the IoT devices on one end and the IoT application on the other. In other words, IoT applications should not talk directly to IoT devices, and vice versa. Rather, IoT applications talk to the intermediate system and tell it what needs to be done on the IoT devices. If the devices are online, which is the best-case scenario, the intermediate system passes the message on straight away. If they are offline, it holds the message until the devices wake up or reconnect; the devices then receive the messages that were buffered while they were offline and act accordingly.

Most IoT platforms implement this intermediate system concept. AWS calls it the Device Shadow service, Microsoft Azure calls it Device Twin, and so on.
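One common way to picture a shadow or twin is as two sets of properties: the state the application wants (desired) and the state the device last reported, with the difference between them being what the device still has to apply. The sketch below is a deliberately simplified, in-memory model; the DeviceShadow class and its method names are hypothetical and are not the AWS or Azure APIs, which work with versioned JSON documents.

```python
from typing import Any, Callable, Dict


class DeviceShadow:
    """Simplified desired/reported state holder (illustrative only)."""

    def __init__(self) -> None:
        self.desired: Dict[str, Any] = {}
        self.reported: Dict[str, Any] = {}

    def set_desired(self, **changes: Any) -> None:
        # Called by the application, whether or not the device is online.
        self.desired.update(changes)

    def report(self, **state: Any) -> None:
        # Called by the device to record its actual state.
        self.reported.update(state)

    def delta(self) -> Dict[str, Any]:
        # Desired properties whose reported value is missing or different.
        return {key: value for key, value in self.desired.items()
                if self.reported.get(key) != value}


def device_sync(shadow: DeviceShadow,
                apply: Callable[[str, Any], None]) -> Dict[str, Any]:
    # Run by the device when it reconnects: apply every pending change,
    # then report the new state so the delta becomes empty.
    pending = shadow.delta()
    for key, value in pending.items():
        apply(key, value)
    shadow.report(**pending)
    return pending


if __name__ == "__main__":
    shadow = DeviceShadow()
    shadow.report(led="off")
    shadow.set_desired(led="on", interval_s=60)  # device may be asleep
    print(device_sync(shadow, lambda k, v: print("apply", k, v)))
    print(shadow.delta())
```

Because the application only ever writes to the shadow, it does not need to know whether the device is awake at that moment.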

Device bootstrapping or device provisioning

When you buy a smart consumer device, there is usually a setup step to configure some settings before the device works: which Wi-Fi network to use for Wi-Fi-enabled devices, which Access Point Name (APN) to use for cellular LTE connectivity, and which Bluetooth peer to pair with. This process is called device setup, device provisioning, or device bootstrapping.

In industrial IoT (IIoT), provisioning or bootstrapping IoT devices is different and challenging. In IIoT, we are talking about a massive number of IoT devices, so how can we bootstrap devices at such a scale?

To give an example, think about a connected product company that produces thousands or even millions of connected devices and sells them across the globe. Such a company needs a way to streamline the bootstrapping process for all of those IoT devices. If you choose cellular connectivity for your connected products, how will you bootstrap or provision the SIM inside the device in different countries with different mobile network operators? It is not a straightforward process.

Usually, two topics emerge in the provisioning and bootstrapping process.

Device identity

As a manufacturer or owner of IoT devices, how can I make sure that only my IoT devices – not any other IoT or non-IoT devices – talk to my IoT Cloud or my customer's IoT Cloud? There's a need to have some sort of attestation or credentials that are stored securely in IoT devices in order to connect to the IoT Cloud. The device will send such credentials to the IoT Cloud during the connectivity handshake, which can then be validated and authenticated by the IoT Cloud.
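As a rough illustration of the handshake described above, the sketch below uses a per-device shared secret to sign a short-lived token that the cloud can verify. The token format (device_id:expiry:signature), the function names, and the in-memory secret store are assumptions for this example only; they do not reproduce any platform's actual token scheme, and production systems typically keep the secret in secure hardware.

```python
import hashlib
import hmac
from typing import Dict


def _sign(secret: bytes, device_id: str, expires_at: int) -> str:
    # HMAC-SHA256 over the device identity and the expiry time.
    message = f"{device_id}:{expires_at}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def make_token(device_id: str, secret: bytes, expires_at: int) -> str:
    # Device side: build a token from the secret stored on the device.
    return f"{device_id}:{expires_at}:{_sign(secret, device_id, expires_at)}"


def verify_token(token: str, secrets: Dict[str, bytes], now: int) -> bool:
    # Cloud side: accept only known devices with a valid, unexpired token.
    try:
        device_id, expires_text, signature = token.rsplit(":", 2)
        expires_at = int(expires_text)
    except ValueError:
        return False  # malformed token
    secret = secrets.get(device_id)
    if secret is None or now > expires_at:
        return False
    expected = _sign(secret, device_id, expires_at)
    # Constant-time comparison to avoid leaking signature prefixes.
    return hmac.compare_digest(signature.encode("utf-8"),
                               expected.encode("utf-8"))


if __name__ == "__main__":
    registry = {"sensor-1": b"per-device-secret"}  # hypothetical registry
    token = make_token("sensor-1", b"per-device-secret", expires_at=2000)
    print(verify_token(token, registry, now=1000))
    print(verify_token(token, registry, now=3000))
```

A device that is not in the registry, or one that signs with the wrong secret, is rejected, which is the "only my devices" guarantee the paragraph above asks for.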

IoT device credentials could be in the following forms:

  • A password or token
  • An X.509 certificate
  • A Trusted Platform Module (TPM)

There are pros and cons of each option, which we will cover in detail later in the book. But for now, and regarding bootstrapping, the question is how those credentials are stored and used by the IoT device.

To answer that question, several options are available, including the following:

  • During the IoT device manufacturing process
  • Over the air, that is, pushed to devices after manufacturing
  • Just in Time (JIT) provisioning, that is, when the device connects for the first time to the IoT Cloud
  • Dependent on a third-party solution, for example, a telecommunication network backbone

We will cover these options later in the book.

Device home endpoints

Where should the IoT device connect to send its telemetry? Will that endpoint be statically configured in the device, or will it be pushed to the device over the air? Is there a global endpoint that an IoT device must contact when it connects for the first time, which then directs the device to the endpoint it should use from then on, for example, the IoT hub or IoT message broker of a specific region or country?

All those questions should be answered and clarified as part of the device bootstrapping process.
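The "global point that redirects the device" option can be sketched as a small lookup performed once, on first connection, with the result cached on the device. Everything named here is hypothetical: the example.com hostnames, the country-to-region table, the default region, and the BootstrappedDevice class are illustrative assumptions rather than any platform's real provisioning service.

```python
from typing import Callable, Optional

# Hypothetical regional brokers and routing table.
REGIONAL_ENDPOINTS = {
    "eu": "mqtts://eu.iot.example.com:8883",
    "us": "mqtts://us.iot.example.com:8883",
}
COUNTRY_TO_REGION = {"DE": "eu", "FR": "eu", "US": "us"}
DEFAULT_REGION = "us"  # assumed fallback for unlisted countries


def resolve_home_endpoint(country: str) -> str:
    # Stand-in for the global bootstrap service: map the device's
    # country to the regional endpoint it should use from now on.
    region = COUNTRY_TO_REGION.get(country.upper(), DEFAULT_REGION)
    return REGIONAL_ENDPOINTS[region]


class BootstrappedDevice:
    def __init__(self, device_id: str, country: str,
                 bootstrap: Callable[[str], str] = resolve_home_endpoint):
        self.device_id = device_id
        self.country = country
        self._bootstrap = bootstrap
        self.home_endpoint: Optional[str] = None

    def connect(self) -> str:
        # First connection: ask the global endpoint where to go.
        # Later connections: reuse the cached home endpoint.
        if self.home_endpoint is None:
            self.home_endpoint = self._bootstrap(self.country)
        return self.home_endpoint


if __name__ == "__main__":
    meter = BootstrappedDevice("meter-9", "de")
    print(meter.connect())
```

Only the global endpoint has to be baked into the device at manufacturing time; regional endpoints can then change without touching devices that have not yet been deployed.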

Now that we have covered IoT solution design patterns in detail, it's time to discuss some use cases to broaden our horizons.