In this chapter, we discussed how to join dataframes, how to determine the data we will lose for each type of join using set operations, and how to query dataframes as we would a database. We then went over some more involved transformations on our columns, such as binning and ranking, and how to do so efficiently with the apply() method. We also learned the importance of vectorized operations in writing efficient pandas code. Then, we explored window calculations and using pipes for cleaner code. Our discussion of window calculations served as a primer for aggregating across whole dataframes and by groups. We also went over how to generate pivot tables and crosstabs. Finally, we looked at some time series-specific functionality in pandas for everything from selection and aggregation to merging.
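As a quick refresher on the first of these points, the following is a minimal sketch of using set operations on the join keys to see which rows each type of join would drop. The weather and stations dataframes, their column names, and their values are hypothetical and made up purely for illustration; they are not the chapter's datasets.

```python
import pandas as pd

# Hypothetical toy data (not from the chapter): the join key is 'station'.
weather = pd.DataFrame({'station': ['A', 'B', 'C'], 'temp': [20.1, 18.4, 22.3]})
stations = pd.DataFrame({'station': ['B', 'C', 'D'], 'name': ['Bay', 'Cove', 'Dune']})

left_keys = set(weather['station'])
right_keys = set(stations['station'])

# Keys present on only one side are dropped by an inner join.
lost_in_inner = left_keys ^ right_keys
# A left join keeps every left key, so only right-only keys are dropped.
lost_in_left = right_keys - left_keys
# A right join keeps every right key, so only left-only keys are dropped.
lost_in_right = left_keys - right_keys

# Cross-check with an outer join: indicator=True adds a '_merge' column
# that records whether each row came from the left, the right, or both.
outer = weather.merge(stations, on='station', how='outer', indicator=True)
unmatched = set(outer.loc[outer['_merge'] != 'both', 'station'])
```

Here, unmatched holds the same keys as lost_in_inner, since the rows an inner join drops are exactly those that the outer join does not label as both.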
In the next chapter, we will cover visualization, which pandas implements by providing a wrapper around matplotlib. Data wrangling will play a key role in prepping our data for visualization...