Filtering with Boolean arrays
Both Series and DataFrame can be filtered with Boolean arrays. You can pass the array to the indexing operator directly on the object or through the .loc attribute.
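As a minimal sketch of the two indexing styles, the following uses a small made-up DataFrame rather than the movie dataset. The column names mirror those used later in the recipe, but the titles, the values, and the variable names (df, high_score, and so on) are assumptions for illustration only.

```python
import pandas as pd

# Toy data; the titles and values are invented for illustration.
df = pd.DataFrame(
    {
        "imdb_score": [8.5, 4.2, 7.1],
        "content_rating": ["PG-13", "R", "PG"],
    },
    index=["Movie A", "Movie B", "Movie C"],
)

# A comparison yields a Boolean Series that shares the DataFrame's index.
high_score = df["imdb_score"] > 8

# Index directly on the object ...
direct = df[high_score]

# ... or through .loc, which can select columns in the same call.
via_loc = df.loc[high_score]
scores_only = df.loc[high_score, "imdb_score"]

# The same Boolean Series filters a Series as well.
filtered_series = df["imdb_score"][high_score]
```

Both forms return the same rows here. The .loc form is convenient when you also want to pick columns at the same time.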
This recipe constructs two complex filters that select different rows of movies. The first selects movies with an imdb_score greater than 8, a content_rating of PG-13, and a title_year either before 2000 or after 2009. The second selects movies with an imdb_score less than 5, a content_rating of R, and a title_year between 2000 and 2010. Finally, we will combine these filters.
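The logic of the two filters can be sketched on toy data before working with the real dataset. Everything in this sketch is an assumption made for illustration: the rows are invented, "between 2000 and 2010" is treated as inclusive, the two filters are combined with a logical OR, and the names crit_b1 through crit_b3, final_crit_a, final_crit_b, and final_crit_all are hypothetical.

```python
import pandas as pd

# Invented rows chosen so that each filter matches exactly one movie.
movies = pd.DataFrame(
    {
        "imdb_score": [8.8, 8.3, 4.1, 4.5, 6.0],
        "content_rating": ["PG-13", "PG-13", "R", "R", "PG-13"],
        "title_year": [2010, 2005, 2004, 1995, 2012],
    },
    index=["A", "B", "C", "D", "E"],
)

# First filter: score above 8, rated PG-13, released before 2000 or after 2009.
crit_a1 = movies.imdb_score > 8
crit_a2 = movies.content_rating == "PG-13"
crit_a3 = (movies.title_year < 2000) | (movies.title_year > 2009)
final_crit_a = crit_a1 & crit_a2 & crit_a3

# Second filter: score below 5, rated R, released from 2000 through 2010
# (inclusive bounds are an assumption of this sketch).
crit_b1 = movies.imdb_score < 5
crit_b2 = movies.content_rating == "R"
crit_b3 = (movies.title_year >= 2000) & (movies.title_year <= 2010)
final_crit_b = crit_b1 & crit_b2 & crit_b3

# Combine the two filters: keep a row if either one matches (assumed OR).
final_crit_all = final_crit_a | final_crit_b
selected = movies[final_crit_all]
```

The parentheses around each comparison matter because & and | bind more tightly than <, >, and == in Python. With this toy data, selected contains rows A and C.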
How to do it…
- Read in the movie dataset, set the index to movie_title, and create the first set of criteria:
  >>> movie = pd.read_csv(
  ...     "data/movie.csv", index_col="movie_title"
  ... )
  >>> crit_a1 = movie.imdb_score > 8
  >>> crit_a2 = movie.content_rating == "PG-13"
  >>> crit_a3 = (movie.title_year < 2000) | (
  ...     movie.title_year...