Summary
Finding suitable data can be a challenge, but there are ways to mitigate it. For example, many repositories offer comprehensive search features that help you find relevant datasets.
In this chapter, we started by looking at public data sources and went through some of the most popular ones. We saw that many datasets are free, but access to some requires a subscription to the repository. Even with these repositories available, there is still sometimes a need to “roll your own” dataset, so we looked at the benefit of doing that and some ways in which we might collect our own data and create our own datasets. We then discussed some niche places to find datasets specific to the emotion analysis problem, such as competition websites. Datasets often contain sensitive information about individuals, such as their personal beliefs, behaviors, and mental health status; hence, we noted that it...