Welcome to this presentation on the workhorse of data science, linear regression, and its related family of linear models.
Nowadays, interconnectivity and data explosion are realities that open a world of new opportunities for every business that can read and interpret data in real time. Everything is facilitating the production and diffusion of data: the omnipresent Internet diffused both at home and at work, an army of electronic devices in the pockets of large portions of the population, and the pervasive presence of software producing data about every process and event. So much data is generated daily that humans cannot deal with it because of its volume, velocity, and variety. Thus, machine learning and AI are on the rise.
Coming from a long and glorious past in the field of statistics and econometrics, linear regression, and its derived methods, can provide you with a simple, reliable, and effective tool to learn from data and act on it. If carefully trained with the right data, linear methods can compete well against the most complex and fresh AI technologies, offering you unbeatable ease of implementation and scalability for increasingly large problems.
In this chapter, we will explain:
Why linear models can be helpful as models to be evaluated in a data science pipeline or as a shortcut for the immediate development of a scalable minimum viable product
Some quick indications for installing Python and setting it up for data science tasks
The necessary modules for implementing linear models in Python