Hands-On Machine Learning with Azure

By: Thomas K Abraham, Parashar Shah, Jen Stirrup, Lauri Lehman, Anindita Basak

Overview of this book

Implementing machine learning (ML) and artificial intelligence (AI) in the cloud was long impractical because of limited processing power and storage. Azure now offers ML and AI services that are easy to implement in the cloud. Hands-On Machine Learning with Azure teaches you how to carry out advanced ML projects in the cloud cost-effectively. The book begins by covering the benefits of ML and AI in the cloud. You will then explore Microsoft’s Team Data Science Process to establish a repeatable process for successful AI development and implementation. You will also gain an understanding of the AI technologies available in Azure and of the Cognitive Services APIs, and learn to integrate them into bot applications. The book lets you explore prebuilt templates in Azure Machine Learning Studio and build a model using canned algorithms that can be deployed as a web service. It then takes you through a series of preconfigured virtual machines in Azure targeted at AI development scenarios. You will get to grips with ML Server and its capabilities in SQL and HDInsight. In the concluding chapters, you’ll integrate patterns with other non-AI services in Azure. By the end of this book, you will be fully equipped to implement smart cognitive actions in your models.
Table of Contents (14 chapters)

HDInsight and Spark

Apache Spark is an open-source parallel processing framework that supports in-memory processing to boost the performance of big data analytics applications. Apache Spark clusters on HDInsight are compatible with Azure Storage (WASB) as well as Azure Data Lake Store.

When a developer creates a Spark cluster on HDInsight, the Azure compute resources are provisioned with Spark already installed and configured. Creating a Spark cluster in HDInsight takes only about 10 minutes. The data to be processed is stored in Azure Storage or Azure Data Lake Storage.
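
From Spark, data in these stores is addressed by URI. As a minimal sketch, the helpers below build such URIs, assuming the commonly used `wasb[s]://` scheme for Azure Storage and the `adl://` scheme for Azure Data Lake Store (Gen1). The function names and the account, container, and file names are hypothetical and chosen only for illustration.

```python
def wasb_uri(container: str, account: str, path: str, secure: bool = True) -> str:
    """Build a URI for a blob in Azure Storage (WASB)."""
    scheme = "wasbs" if secure else "wasb"  # wasbs uses TLS
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


def adl_uri(account: str, path: str) -> str:
    """Build a URI for a file in Azure Data Lake Store (Gen1)."""
    return f"adl://{account}.azuredatalakestore.net/{path.lstrip('/')}"


# Hypothetical account, container, and file names, for illustration only.
print(wasb_uri("mycontainer", "mystorageaccount", "data/trips.csv"))
print(adl_uri("mydatalake", "/clusters/data/trips.csv"))
```

On a cluster, a URI built this way could be passed to a reader call such as `spark.read.csv(uri, header=True)`, provided the cluster has been granted access to that storage account.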

Apache Spark provides primitives for in-memory cluster computing, which makes it a natural fit for HDInsight. An Apache Spark job can load and cache data in memory and query it repeatedly, so it produces results much more quickly than disk-based systems. In addition to this, Apache...
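
The load, cache, and query-repeatedly pattern described here can be sketched with the PySpark DataFrame API. This is an illustrative sketch, not code from the book: the helper names `count_above` and `demo`, the column name `value`, and the thresholds are all assumptions.

```python
def count_above(df, thresholds, column="value"):
    """Cache `df` once, then run one count query per threshold against it."""
    cached = df.cache()  # lazily marks the DataFrame for in-memory storage
    try:
        # The first action populates the cache; later queries reuse it
        # instead of re-reading the source data.
        return {t: cached.filter(f"{column} > {t}").count() for t in thresholds}
    finally:
        cached.unpersist()  # release the cached data once the queries finish


def demo():
    """Run the helper on a local Spark session (requires pyspark and Java)."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[*]").appName("cache-demo").getOrCreate()
    )
    # Hypothetical data: a single numeric column named "value".
    df = spark.range(1000).withColumnRenamed("id", "value")
    result = count_above(df, [100, 500, 900])
    spark.stop()
    return result
```

Calling `demo()` runs the three queries against one cached copy of the data. On an HDInsight cluster the session is normally already available, so only `count_above` would be needed.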