Practical Machine Learning Cookbook

By: Atul Tripathi

Overview of this book

Machine learning has become the new black. The challenge today is the explosion of data, both existing legacy data and incoming structured and unstructured data. Discovering, understanding, analyzing, and predicting outcomes from this data with machine learning algorithms is complex. This cookbook will help you solve the everyday challenges you face as a data scientist. Applying a range of data science techniques to multiple data sets drawn from real-world problems will help you appreciate which techniques suit which situations. The first half of the book provides recipes on fairly complex machine learning systems, where you'll learn to explore new application areas for machine learning and improve its efficiency. It includes recipes on classification, neural networks, unsupervised and supervised learning, deep learning, reinforcement learning, and more. The second half of the book focuses on three machine learning case studies, all based on real-world data, and offers solutions to specific machine learning issues in each one.

Poisson regression - understanding species present in Galapagos Islands


The Galapagos Islands are situated in the Pacific Ocean about 1000 km from the Ecuadorian coast. The archipelago consists of 13 islands, five of which are inhabited. The islands are rich in flora and fauna. Scientists are still perplexed by the fact that such a diverse set of species can flourish in such a small and remote group of islands.

Getting ready

To complete this recipe, we will use the species dataset. The first step is collecting the data.

Step 1 - collecting and describing the data

We will use the gala dataset (available at https://github.com/burakbayramli/kod/blob/master/books/Practical_Regression_Anove_Using_R_Faraway/gala.txt), which records the number of species.

The dataset contains 30 cases and seven variables, all numeric:

  • Species
  • Endemics
  • Area
  • Elevation
  • Nearest
  • Scruz
  • Adjacent
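
Before exploring the data, it can help to confirm that a local copy of the file matches the description above: 30 cases and the seven named variables. The following is a minimal Python sketch, not the recipe's own code. It assumes the file is whitespace-delimited, that its first line is a header with the seven variable names, and that each data row may begin with one extra field (an island name). The names `parse_gala` and `check_shape` are hypothetical helpers introduced only for illustration.

```python
# Column names as listed in the dataset description
# (assumed to match the header row of the file).
EXPECTED_COLUMNS = ["Species", "Endemics", "Area", "Elevation",
                    "Nearest", "Scruz", "Adjacent"]


def parse_gala(text):
    """Parse whitespace-delimited text into (header, {row_label: {column: value}}).

    Assumption: the first non-empty line is the header, and a data row may
    carry one leading label field (for example an island name) before the
    numeric values.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = {}
    for i, line in enumerate(lines[1:], start=1):
        fields = line.split()
        if len(fields) == len(header) + 1:
            # Row starts with a label followed by the numeric values.
            label, values = fields[0], fields[1:]
        elif len(fields) == len(header):
            # No label present: fall back to the row number.
            label, values = str(i), fields
        else:
            raise ValueError(f"unexpected field count on data row {i}")
        rows[label] = {name: float(v) for name, v in zip(header, values)}
    return header, rows


def check_shape(header, rows, n_cases=30):
    """Return True when the columns and the number of cases match the description."""
    return header == EXPECTED_COLUMNS and len(rows) == n_cases
```

A possible use, assuming the file has been saved locally as `gala.txt` (a hypothetical path): read the file's contents, pass them to `parse_gala`, and call `check_shape` on the result. A `False` return suggests the copy differs from the layout described here, for instance because the page's HTML was saved instead of the raw text.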

How to do it...

Let's get into the details.

Step 2 - exploring the data...