Practical Machine Learning

By: Sunila Gollapudi

Overview of this book

This book explores an extensive range of machine learning techniques, uncovering hidden tricks and tips for several types of data using practical, real-world examples. While machine learning can be highly theoretical, this book offers a refreshing hands-on approach without losing sight of the underlying principles. Inside, a full exploration of the various algorithms gives you high-quality guidance so you can begin to see just how effective machine learning is at tackling the contemporary challenges of big data. This is the only book you need to implement a whole suite of open source tools, frameworks, and languages in machine learning. We will cover the leading data science languages, Python and R, and the underrated but powerful Julia, as well as a range of big data platforms including Spark, Hadoop, and Mahout. Practical Machine Learning is an essential resource for modern data scientists who want to get to grips with the real-world application of machine learning. With this book, you will not only learn the fundamentals of machine learning but also dive deep into the complexities of real-world data before moving on to using Hadoop and its wider ecosystem of tools to process and manage your structured and unstructured data. You will explore different machine learning techniques for both supervised and unsupervised learning; from decision trees to Naïve Bayes classifiers and linear and clustering methods, you will learn strategies for a truly advanced approach to the statistical analysis of data. The book also explores the cutting-edge advancements in machine learning, with worked examples and guidance on deep learning and reinforcement learning, providing you with practical demonstrations and samples that help take the theory, and the mystery, out of even the most advanced machine learning methodologies.

Some complementing fields of Machine learning


Machine learning is closely related to many fields, including artificial intelligence, data mining, statistics, data science, and others discussed shortly. In that sense, Machine learning is a multi-disciplinary field that is linked in some way to each of them.

In this section, we will define some of these fields, draw parallels between them and Machine learning, and examine the similarities and dissimilarities, if any. Throughout, we will start from the core definition of Machine learning as a field of science concerned with developing self-learning algorithms. Most of the fields we are about to discuss either use Machine learning techniques or work with a superset or subset of them.

Data mining

Data mining is a process of analyzing data and deriving insights from a (large) dataset by applying business rules to it. The focus here is on the data and the domain of the data. Machine learning techniques are adopted in the process of identifying which rules are relevant and which aren't.

Machine learning versus Data mining

Similarities with Machine learning:

  • Both Machine learning and data mining look at data with the goal of extracting value from it.

  • Most of the tools used for Machine learning and data mining are shared, for example, R and Weka, among others.

Dissimilarities with Machine learning:

  • While Machine learning focuses on using known knowledge or experience, data mining focuses on discovering unknown knowledge, such as the existence of a specific structure in the data that will help in analyzing it.

  • In Machine learning, the intelligence derived is meant to be consumed by machines, whereas in data mining the target consumers are humans.

Relationship with Machine learning:

  • The fields of Machine learning and data mining are intertwined, and there is a significant overlap in the underlying principles and methodologies.
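The contrast between discovering unknown structure and using known experience can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`two_means_1d`, `fit_centroids`, `predict_label`) and the purchase amounts are hypothetical and do not come from the book. The first function looks for two groups in unlabeled values, which is closer to the data mining view; the other two learn from values whose labels are already known and then predict a label for a new value, which is closer to the Machine learning view.

```python
from statistics import mean


def two_means_1d(values, iterations=10):
    """Discover two groups in unlabeled 1-D data (a tiny k-means with k=2)."""
    lo, hi = min(values), max(values)  # initial centroids: the two extremes
    for _ in range(iterations):
        # Assign every value to its nearest centroid (ties go to the left group).
        left = [v for v in values if abs(v - lo) <= abs(v - hi)]
        right = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not left or not right:  # no second group to be found
            break
        lo, hi = mean(left), mean(right)
    return lo, hi


def fit_centroids(labeled):
    """Learn one centroid per known label from (value, label) pairs."""
    by_label = {}
    for value, label in labeled:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def predict_label(centroids, value):
    """Predict the label whose centroid lies closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


# Hypothetical purchase amounts, invented for illustration only.
amounts = [12, 15, 14, 98, 102, 95]
print(two_means_1d(amounts))  # two group centers found without any labels

labeled = [(12, "small"), (15, "small"), (98, "large"), (102, "large")]
model = fit_centroids(labeled)
print(predict_label(model, 20))  # → small
```

In the first call a person still has to interpret what the two groups mean, while the prediction in the second call can be consumed directly by another program.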

Artificial intelligence (AI)

Artificial intelligence focuses on building systems that can mimic human behavior. It has been around for a while, and modern AI has been evolving continuously and now includes specialized data requirements. Among many other capabilities, AI should demonstrate the following:

  • Knowledge storage and representation to hold all the data that is subject to interrogation and investigation

  • Natural Language Processing (NLP) capabilities to be able to process text

  • Reasoning capabilities to be able to answer questions and facilitate conclusions

  • The ability to plan, schedule, and automate

  • Machine learning to be able to build self-learning algorithms

  • Robotics and more

Machine learning is a subfield of artificial intelligence.

Machine learning versus Artificial Intelligence

Similarities with Machine learning:

  • Both Machine learning and artificial intelligence employ learning algorithms and focus on automation in reasoning or decision-making.

Dissimilarities with Machine learning:

  • Though Machine learning is considered to fall within AI's range of interests, its primary focus is to improve a machine's performance of a task, and the experience it builds on need not always be human behavior. Artificial intelligence, by contrast, employs human-inspired algorithms.

Relationship with Machine learning:

  • Machine learning is often considered a subfield of artificial intelligence.

Statistical learning

In statistical learning, predictive functions are derived primarily from samples of data. How the data is collected, cleansed, and managed in this process is therefore of great importance. Statistics is closely related to mathematics, as it is about quantifying data and operating on numbers.
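As a minimal sketch of a predictive function derived from a sample, the following Python fits a straight line by ordinary least squares after dropping incomplete observations. The technique is chosen here purely for illustration and is not taken from the book; the helper names (`drop_missing`, `fit_line`, `predict_y`) and the sample values are hypothetical.

```python
def drop_missing(pairs):
    """Cleansing step: keep only observations where both values are present."""
    return [(x, y) for x, y in pairs if x is not None and y is not None]


def fit_line(pairs):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx  # assumes the x values are not all identical
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_y(slope, intercept, x):
    """The predictive function obtained from the sample."""
    return slope * x + intercept


# Hypothetical sample (hours studied, test score); one record is incomplete.
sample = [(1, 52), (2, 54), (3, None), (3, 56), (4, 58), (5, 60)]
slope, intercept = fit_line(drop_missing(sample))
print(slope, intercept)                 # → 2.0 50.0
print(predict_y(slope, intercept, 6))   # → 62.0
```

Skipping the cleansing step would make the fit fail on the missing score, which is one small way of seeing why collection and cleansing matter before any function is derived.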

Machine learning versus Statistical learning

Similarities with Machine learning:

  • Just like Machine learning, statistical learning is about building the ability to infer from data that, in some cases, represents experience.

Dissimilarities with Machine learning:

  • Statistical learning focuses on arriving at valid conclusions, while Machine learning is about predictions.

  • Statistical learning works with, and allows, assumptions about the data, in contrast to Machine learning.

  • Machine learning and statistics are practiced by different groups.

  • Machine learning is a relatively new field when compared to statistics.

Relationship with Machine learning:

  • Machine learning implements statistical techniques.

Data science

Data science is all about turning data into products. It is analytics and machine learning put into action to draw inferences and insights out of data. Data science is perceived as a first step beyond traditional data analysis and knowledge systems, such as Data Warehouses (DW) and Business Intelligence (BI), and it considers all aspects of big data.

The data science lifecycle includes steps from data availability/loading to deriving and communicating data insights up to operationalizing the process, and Machine learning often forms a subset of this process.
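The lifecycle described above can be sketched as a chain of small Python functions, with Machine learning appearing as just one step among several. Everything in this sketch is a hypothetical illustration: the step names, the hard-coded records, and the trivial per-region mean used as the "model" are assumptions and not part of the book.

```python
from statistics import mean


def load_data():
    """Step 1: data availability/loading (hard-coded, hypothetical records)."""
    return [
        {"region": "north", "sales": 120.0},
        {"region": "north", "sales": None},  # incomplete record
        {"region": "south", "sales": 80.0},
        {"region": "south", "sales": 100.0},
    ]


def clean(records):
    """Step 2: prepare the data by dropping incomplete records."""
    return [r for r in records if r["sales"] is not None]


def train(records):
    """Step 3: the Machine learning step, here a trivial per-region mean."""
    by_region = {}
    for record in records:
        by_region.setdefault(record["region"], []).append(record["sales"])
    return {region: mean(values) for region, values in by_region.items()}


def communicate(model):
    """Step 4: turn the model into an insight a person can read."""
    best = max(model, key=model.get)
    return f"Highest expected sales: {best} ({model[best]:.1f})"


def operationalize(model):
    """Step 5: expose the model as a reusable prediction function."""
    def predict_sales(region):
        return model[region]
    return predict_sales


model = train(clean(load_data()))
print(communicate(model))      # → Highest expected sales: north (120.0)
predict_sales = operationalize(model)
print(predict_sales("south"))  # → 90.0
```

Only `train` involves learning from data; the other steps are the loading, preparation, communication, and deployment work that surrounds it.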

Machine learning versus Data science

Similarities with Machine learning:

  • Machine learning and data science share prediction as a common outcome, given the problem's context.

Dissimilarities with Machine learning:

  • One of the important differences between Machine learning and data science is the need for domain expertise. Data science focuses on solving domain-specific problems, while Machine learning focuses on building models that can generically fit a problem context.

Relationship with Machine learning:

  • Data science is a superset of Machine learning, data mining, and related subjects. It covers the complete process, from data loading through to production.