Book Image

Python Machine Learning, Second Edition - Second Edition

By : Sebastian Raschka, Vahid Mirjalili
Book Image

Python Machine Learning, Second Edition - Second Edition

By: Sebastian Raschka, Vahid Mirjalili

Overview of this book

Publisher's Note: This edition from 2017 is outdated and is not compatible with TensorFlow 2 or any of the most recent updates to Python libraries. A new third edition, updated for 2020 and featuring TensorFlow 2 and the latest in scikit-learn, reinforcement learning, and GANs, has now been published. Machine learning is eating the software world, and now deep learning is extending machine learning. Understand and work at the cutting edge of machine learning, neural networks, and deep learning with this second edition of Sebastian Raschka’s bestselling book, Python Machine Learning. Using Python's open source libraries, this book offers the practical knowledge and techniques you need to create and contribute to machine learning, deep learning, and modern data analysis. Fully extended and modernized, Python Machine Learning Second Edition now includes the popular TensorFlow 1.x deep learning library. The scikit-learn code has also been fully updated to v0.18.1 to include improvements and additions to this versatile machine learning library. Sebastian Raschka and Vahid Mirjalili’s unique insight and expertise introduce you to machine learning and deep learning algorithms from scratch, and show you how to apply them to practical industry challenges using realistic and interesting examples. By the end of the book, you’ll be ready to meet the new data analysis opportunities. If you’ve read the first edition of this book, you’ll be delighted to find a balance of classical ideas and modern insights into machine learning. Every chapter has been critically updated, and there are new chapters on key technologies. You’ll be able to learn and work with TensorFlow 1.x more deeply than ever before, and get essential coverage of the Keras neural network library, along with updates to scikit-learn 0.18.1.
Table of Contents (24 chapters)
Python Machine Learning Second Edition
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Packt is Searching for Authors Like You
Preface
Index

Preparing the IMDb movie review data for text processing


Sentiment analysis, sometimes also called opinion mining, is a popular subdiscipline of the broader field of NLP; it is concerned with analyzing the polarity of documents. A popular task in sentiment analysis is the classification of documents based on the expressed opinions or emotions of the authors with regard to a particular topic.

In this chapter, we will be working with a large dataset of movie reviews from the IMDb that has been collected by Maas and others (Learning Word Vectors for Sentiment Analysis, A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 142–150, Portland, Oregon, USA, Association for Computational Linguistics, June 2011). The movie review dataset consists of 50,000 polar movie reviews that are labeled as either positive or negative; here, positive means that a movie was...