Book Image

Practical Machine Learning

By : Sunila Gollapudi
Book Image

Practical Machine Learning

By: Sunila Gollapudi

Overview of this book

This book explores an extensive range of machine learning techniques uncovering hidden tricks and tips for several types of data using practical and real-world examples. While machine learning can be highly theoretical, this book offers a refreshing hands-on approach without losing sight of the underlying principles. Inside, a full exploration of the various algorithms gives you high-quality guidance so you can begin to see just how effective machine learning is at tackling contemporary challenges of big data This is the only book you need to implement a whole suite of open source tools, frameworks, and languages in machine learning. We will cover the leading data science languages, Python and R, and the underrated but powerful Julia, as well as a range of other big data platforms including Spark, Hadoop, and Mahout. Practical Machine Learning is an essential resource for the modern data scientists who want to get to grips with its real-world application. With this book, you will not only learn the fundamentals of machine learning but dive deep into the complexities of real world data before moving on to using Hadoop and its wider ecosystem of tools to process and manage your structured and unstructured data. You will explore different machine learning techniques for both supervised and unsupervised learning; from decision trees to Naïve Bayes classifiers and linear and clustering methods, you will learn strategies for a truly advanced approach to the statistical analysis of data. The book also explores the cutting-edge advancements in machine learning, with worked examples and guidance on deep learning and reinforcement learning, providing you with practical demonstrations and samples that help take the theory–and mystery–out of even the most advanced machine learning methodologies.
Table of Contents (23 chapters)
Practical Machine Learning
Credits
Foreword
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Preface
Index

Machine learning tools and frameworks


Machine learning adoption is rapidly increasing among technology and business organizations. Every organization is actively strategizing on how to capitalize on their data and use it to augment their client's experiences and build new businesses. When it comes to tools or frameworks for Machine learning, there are many open source and commercial options on the market. The new age tools are all built to support big data, distributed storage, and parallel processing. In the next chapter, we will cover some aspects of handling large scale data in the context of Machine learning.

At a very high level, there are three generations of Machine learning tools.

The first generation of Machine learning tools is focused on providing a richness of the Machine learning algorithms and supporting deep analytics. These tools haven't been built to focus on handling large scale data or for supporting distributed storage and parallel processing. Some of them still handle volumes as a result of their support for vertical scalability. Some of the tools that come under this category are SAS, SPSS, Weka, R, and more. Having said that, most of these tools are now being upgraded to support big data requirements too.

The second generation tools are focused on supporting big data requirements, most of them work on the Hadoop platform, and they provide capabilities to run machine learning algorithms in a MapReduce paradigm. Some of the tools that are categorized here are Mahout, RapidMiner, Pentaho, and MADlib. Some of these tools do not support all the machine learning algorithms.

The third generations tools are the smart kids on the road, breaking the traditional norms of operating in batch mode, supporting real-time analytics, providing support for advanced data types of big data, and at the same time supporting deeper analytics. Some of the tools that are categorized under this are Spark, HaLoop, and Pregel.

In Chapter 4, Machine Learning Tools, Libraries, and Frameworks, we will cover some of the key machine learning tools and demonstrate how they can be used based on the problem's context. Implementation details for tools such as R, Julia, Python, Mahout, and Spark will be covered in depth. Required technology primers and installation or setup-related guidance will be provided.