Practical Data Analysis - Second Edition

By: Hector Cuesta, Dr. Sampath Kumar

Overview of this book

Beyond buzzwords like Big Data or Data Science, there are great opportunities to innovate in many businesses by using data analysis to build data-driven products. Data analysis involves asking many questions about data in order to discover insights and generate value for a product or a service. This book explains the basic data algorithms without the theoretical jargon, and you’ll get hands-on experience turning data into insights using machine learning techniques. We will work through data-driven processing for several types of data, such as text, images, social network graphs, documents, and time series, showing you how to implement large-scale data processing with MongoDB and Apache Spark.

Linear regression


If we want to predict a quantitative value, regression is a great tool because it uses one or more independent variables to explain the behavior of a phenomenon such as temperature, asset prices, house prices, and so on. Linear regression finds the straight line that best fits the data.

We use regression, or forecasting, all the time in our daily lives, for example, when we estimate the gas or the time required for a car trip based on previous data (distance, traffic, weather, and so on). In its simplest form, you can think of it in this way: first, get previous data about the phenomenon, for example, how much time was spent on previous trips and what the distance was. Then, look at how those values relate to each other, and try to find a rule that lets you forecast the next value.
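The trip example can be sketched in a few lines of plain Python. This is only an illustration: the `trips` values are made up, and the helper names `fit_line` and `predict_time` are hypothetical. The function applies ordinary least squares for a single independent variable, which is one common way to find the best-fitting straight line.

```python
# Hypothetical past trips as (distance in km, time in minutes).
# These numbers are invented for illustration only.
trips = [(10, 15), (20, 27), (30, 41), (40, 52), (50, 66)]


def fit_line(points):
    """Return (slope, intercept) of the least-squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor,
    # which cancels out in the ratio).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(trips)


def predict_time(distance_km):
    """Forecast the trip time in minutes for a new distance."""
    return intercept + slope * distance_km


print(f"minutes per km: {slope:.2f}, fixed overhead: {intercept:.2f} min")
print(f"estimated time for a 35 km trip: {predict_time(35):.1f} min")
```

The slope plays the role of the "rule" found from previous data: it says how many extra minutes each additional kilometer tends to cost, and the intercept captures the time that does not depend on distance.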

In this section, we will program a very simple example of linear regression using scikit-learn, which is a machine-learning library for Python. For this concrete example, we will use the Boston Housing dataset, which represents the data of 506 neighborhoods...
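As a rough preview of the scikit-learn workflow, here is a minimal sketch that fits a `LinearRegression` model. To stay self-contained it does not load the Boston Housing dataset; it generates 506 synthetic rows instead. The variable names `rooms` and `price`, and the slope and intercept used to generate the data (9.0 and -34.0), are assumptions chosen for illustration and are not values taken from the real dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (NOT the Boston Housing dataset): one feature
# per row, with a linear relationship plus Gaussian noise.
rng = np.random.default_rng(0)
rooms = rng.uniform(3, 9, size=(506, 1))   # feature matrix, shape (506, 1)
noise = rng.normal(0, 2.0, size=506)
price = 9.0 * rooms[:, 0] - 34.0 + noise   # target vector, shape (506,)

# Fit the best-fitting straight line by ordinary least squares.
model = LinearRegression()
model.fit(rooms, price)

print("slope:", model.coef_[0])
print("intercept:", model.intercept_)
print("R^2 on the training data:", model.score(rooms, price))

# Forecast the target for a new, unseen feature value.
print("prediction for 6 rooms:", model.predict([[6.0]])[0])
```

Because the data was generated from a known line, the fitted slope and intercept should land close to the values used to create it, which makes this a convenient sanity check before moving to real data.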