In this chapter, we'll first discuss sources of open data, including the University of California at Irvine (UCI) Machine Learning Repository, the Bureau of Labor Statistics, the Census Bureau, Professor French's Data Library, and the Federal Reserve's Data Library. Then, we will show you several ways of inputting data, dealing with missing values, sorting, choosing a subset, merging different datasets, and outputting data. Several relevant data manipulation packages for different languages, such as Python, R, and Julia, will be introduced as well. In particular, the Python pandas package will be discussed.
The following topics will be covered:
- Sources of data
- Introduction to the Python pandas package
- Several ways of inputting data
- Introduction to the Quandl data delivery platform
- Dealing with missing data
- Sorting data...