Sign In Start Free Trial
Account

Add to playlist

Create a Playlist

Modal Close icon
You need to login to use this feature.
  • Book Overview & Buying Mastering Data Mining with Python - Find patterns hidden in your data
  • Table Of Contents Toc
Mastering Data Mining with Python - Find patterns hidden in your data

Mastering Data Mining with Python - Find patterns hidden in your data

By : Megan Squire
2.7 (3)
close
close
Mastering Data Mining with Python - Find patterns hidden in your data

Mastering Data Mining with Python - Find patterns hidden in your data

2.7 (3)
By: Megan Squire

Overview of this book

Data mining is an integral part of the data science pipeline. It is the foundation of any successful data-driven strategy – without it, you'll never be able to uncover truly transformative insights. Since data is vital to just about every modern organization, it is worth taking the next step to unlock even greater value and more meaningful understanding. If you already know the fundamentals of data mining with Python, you are now ready to experiment with more interesting, advanced data analytics techniques using Python's easy-to-use interface and extensive range of libraries. In this book, you'll go deeper into many often overlooked areas of data mining, including association rule mining, entity matching, network mining, sentiment analysis, named entity recognition, text summarization, topic modeling, and anomaly detection. For each data mining technique, we'll review the state-of-the-art and current best practices before comparing a wide variety of strategies for solving each problem. We will then implement example solutions using real-world data from the domain of software engineering, and we will spend time learning how to understand and interpret the results we get. By the end of this book, you will have solid experience implementing some of the most interesting and relevant data mining techniques available today, and you will have achieved a greater fluency in the important field of Python data analytics.
Table of Contents (11 chapters)
close
close
10
Index

Summary

In this chapter, we learned what it would take to expand our data mining toolbox to the master level. First we took a long view of the field as a whole, starting with the history of data mining as a piece of the knowledge discovery in databases (KDD) process. We also compared the field of data mining to other similar fields such as data science, machine learning, and big data.

Next, we outlined the common tools and techniques that most experts consider to be most important to the KDD process, paying special attention to the techniques that are used most frequently in the mining and analysis steps. To really master data mining, it is important that we work on problems that are different than simple textbook examples. For this reason, we will be working on more exotic data mining techniques such as generating summaries and finding outliers, and focusing on more unusual data types, such as text and networks.

Finally, in this chapter we put together a robust data mining system for ourselves. Our workspace centers around the powerful, general-purpose programming language, Python, and its many useful data mining packages, such as NLTK, Gensim, Numpy, Networkx, and Scikit-learn, and it is complemented by an easy-to-use and free MySQL database.

Now, all this discussion of software packages has got me thinking: Have you ever wondered what packages are used most frequently together? Is the combination of NLTK and Networkx a common thing to see, or is this a rather unusual pairing of libraries? In the next chapter, we will work on solving exactly that type of problem. In Chapter 2, Association Rule Mining, we will learn how to generate a list of frequently-found pairs, triples, quadruples, and more, and then we will attempt to make predictions based on the patterns we found.

CONTINUE READING
83
Tech Concepts
36
Programming languages
73
Tech Tools
Icon Unlimited access to the largest independent learning library in tech of over 8,000 expert-authored tech books and videos.
Icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Icon 50+ new titles added per month and exclusive early access to books as they are being written.
Mastering Data Mining with Python - Find patterns hidden in your data
notes
bookmark Notes and Bookmarks search Search in title playlist Add to playlist font-size Font size

Change the font size

margin-width Margin width

Change margin width

day-mode Day/Sepia/Night Modes

Change background colour

Close icon Search
Country selected

Close icon Your notes and bookmarks

Confirmation

Modal Close icon
claim successful

Buy this book with your credits?

Modal Close icon
Are you sure you want to buy this book with one of your credits?
Close
YES, BUY

Submit Your Feedback

Modal Close icon
Modal Close icon
Modal Close icon