Sign In Start Free Trial
Account

Add to playlist

Create a Playlist

Modal Close icon
You need to login to use this feature.
  • Book Overview & Buying Principles of Data Science
  • Table Of Contents Toc
Principles of Data Science

Principles of Data Science - Second Edition

By : Sinan Ozdemir, Kakade, Tibaldeschi
close
close
Principles of Data Science

Principles of Data Science

By: Sinan Ozdemir, Kakade, Tibaldeschi

Overview of this book

Need to turn programming skills into effective data science skills? This book helps you connect mathematics, programming, and business analysis. You’ll feel confident asking—and answering—complex, sophisticated questions of your data, making abstract and raw statistics into actionable ideas. Going through the data science pipeline, you'll clean and prepare data and learn effective data mining strategies and techniques to gain a comprehensive view of how the data science puzzle fits together. You’ll learn fundamentals of computational mathematics and statistics and pseudo-code used by data scientists and analysts. You’ll learn machine learning, discovering statistical models that help control and navigate even the densest datasets, and learn powerful visualizations that communicate what your data means.
Table of Contents (17 chapters)
close
close
16
Index

Summary

At the beginning of this chapter, I posed a simple question: what's the catch of data science? Well, there is one. It isn't all fun, games, and modeling. There must be a price for our quest to create ever-smarter machines and algorithms. As we seek new and innovative ways to discover data trends, a beast lurks in the shadows. I'm not talking about the learning curve of mathematics or programming, nor am I referring to the surplus of data. The Industrial Age left us with an ongoing battle against pollution. The subsequent Information Age left behind a trail of big data. So, what dangers might the Data Age bring us?

The Data Age can lead to something much more sinister — the dehumanization of the individual through mass data.

More and more people are jumping head-first into the field of data science, most with no prior experience of math or CS, which, on the surface, is great. Average data scientists have access to millions of dating profiles' data, tweets, online reviews, and much more in order to jump start their education.

However, if you jump into data science without the proper exposure to theory or coding practices, and without respect for the domain you are working in, you face the risk of oversimplifying the very phenomenon you are trying to model.

For example, let's say you want to automate your sales pipeline by building a simplistic program that looks at LinkedIn for very specific keywords in a person's LinkedIn profile. You could use the following code to do this:

keywords = ["Saas", "Sales", "Enterprise"] 

Great. Now you can scan LinkedIn quickly to find people who match your criteria. But what about that person who spells out "Software as a Service", instead of "SaaS," or misspells "enterprise" (it happens to the best of us; I bet someone will find a typo in my book). How will your model figure out that these people are also a good match? They should not be left behind just because the cut-corners data scientist has overgeneralized people in such an easy way.

The programmer chose to simplify their search for another human by looking for three basic keywords and ended up with a lot of missed opportunities left on the table.

In the next chapter, we will explore the different types of data that exist in the world, ranging from free-form text to highly structured row/column files. We will also look at the mathematical operations that are allowed for different types of data, as well as deduce insights based on the form of the data that comes in.

CONTINUE READING
83
Tech Concepts
36
Programming languages
73
Tech Tools
Icon Unlimited access to the largest independent learning library in tech of over 8,000 expert-authored tech books and videos.
Icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Icon 50+ new titles added per month and exclusive early access to books as they are being written.
Principles of Data Science
notes
bookmark Notes and Bookmarks search Search in title playlist Add to playlist download Download options font-size Font size

Change the font size

margin-width Margin width

Change margin width

day-mode Day/Sepia/Night Modes

Change background colour

Close icon Search
Country selected

Close icon Your notes and bookmarks

Confirmation

Modal Close icon
claim successful

Buy this book with your credits?

Modal Close icon
Are you sure you want to buy this book with one of your credits?
Close
YES, BUY

Submit Your Feedback

Modal Close icon
Modal Close icon
Modal Close icon