Book Image

Mastering Machine Learning with R, Second Edition - Second Edition

Book Image

Mastering Machine Learning with R, Second Edition - Second Edition

Overview of this book

This book will teach you advanced techniques in machine learning with the latest code in R 3.3.2. You will delve into statistical learning theory and supervised learning; design efficient algorithms; learn about creating Recommendation Engines; use multi-class classification and deep learning; and more. You will explore, in depth, topics such as data mining, classification, clustering, regression, predictive modeling, anomaly detection, boosted trees with XGBOOST, and more. More than just knowing the outcome, you’ll understand how these concepts work and what they do. With a slow learning curve on topics such as neural networks, you will explore deep learning, and more. By the end of this book, you will be able to perform machine learning with R in the cloud using AWS in various scenarios with different datasets.
Table of Contents (23 chapters)
Title Page
Credits
About the Author
About the Reviewers
Packt Upsell
Customer Feedback
Preface
16
Sources

Chapter 1. A Process for Success

"If you don't know where you are going, any road will get you there."                                                                                                              - Robert Carrol

"If you can't describe what you are doing as a process, you don't know what you're doing."                                                                                                              - W. Edwards Deming

At first glance, this chapter may seem to have nothing to do with machine learning, but it has everything to do with machine learning (specifically, its implementation and making change happen). The smartest people, best software, and best algorithms do not guarantee success, no matter how well it is defined.

In most, if not all, projects, the key to successfully solving problems or improving decision-making is not the algorithm, but the softer, more qualitative skills of communication and influence. The problem many of us have with this is that it is hard to quantify how effective one is around these skills. It is probably safe to say that many of us ended up in this position because of a desire to avoid it. After all, the highly successful TV comedy The Big Bang Theory was built on this premise. Therefore, the goal of this chapter is to set you up for success. The intent is to provide a process, a flexible process no less, where you can become a change agent: a person who can influence and turn their insights into action without positional power. We will focus on Cross-Industry Standard Process for Data Mining (CRISP-DM). It is probably the most well-known and respected of all processes for analytical projects. Even if you use another industry process or something proprietary, there should still be a few gems in this chapter that you can take away.

I will not hesitate to say that this all is easier said than done; without question, I'm guilty of every sin (both commission and omission) that will be discussed in this chapter. With skill and some luck, you can avoid the many physical and emotional scars I've picked up over the last 12 years.

Finally, we will also have a look at a flow chart (a cheat sheet) that you can use to help you identify what methodologies to apply to the problem at hand.