The annotation task
The annotation platform used for the competency task can be reused for the annotation task. However, in the competency task the tweets were presented in sequence, whereas in the annotation task the tweets are selected using some additional logic rather than simply shown in order. Without going into too much detail, it is a good idea to use a database such as Access or MySQL. This allows questions to be partitioned, answers to be tracked, and the results to be analyzed. It also makes it possible to determine how much of the dataset has been labeled, which tweets have been labeled, and, importantly, the distribution of those answers (we’ll explain why this is important shortly).
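To make the tracking idea concrete, here is a minimal sketch of the three queries mentioned above: how much has been labeled, which tweets have been labeled, and how the answers are distributed. It assumes Python's built-in sqlite3 module as a stand-in for Access or MySQL, and the table names (tweets, answers), column names, helper functions, and sample labels are all hypothetical, chosen only for illustration.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for Access or MySQL.
conn = sqlite3.connect(":memory:")

# Hypothetical tables: one row per tweet, one row per answer given by an annotator.
conn.executescript("""
CREATE TABLE tweets (
    tweet_id INTEGER PRIMARY KEY,
    text     TEXT NOT NULL
);
CREATE TABLE answers (
    answer_id INTEGER PRIMARY KEY,
    tweet_id  INTEGER NOT NULL REFERENCES tweets(tweet_id),
    user_id   INTEGER NOT NULL,
    label     TEXT NOT NULL
);
""")

# Made-up sample data: four tweets, three of which have at least one answer.
conn.executemany(
    "INSERT INTO tweets (tweet_id, text) VALUES (?, ?)",
    [(1, "tweet one"), (2, "tweet two"), (3, "tweet three"), (4, "tweet four")],
)
conn.executemany(
    "INSERT INTO answers (tweet_id, user_id, label) VALUES (?, ?, ?)",
    [(1, 1, "joy"), (1, 2, "joy"), (2, 1, "anger"), (3, 2, "joy")],
)


def labeling_progress(conn):
    """Return (number of tweets with at least one answer, total number of tweets)."""
    total = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
    labeled = conn.execute(
        "SELECT COUNT(DISTINCT tweet_id) FROM answers"
    ).fetchone()[0]
    return labeled, total


def labeled_tweet_ids(conn):
    """Return the ids of tweets that have received at least one answer."""
    rows = conn.execute(
        "SELECT DISTINCT tweet_id FROM answers ORDER BY tweet_id"
    ).fetchall()
    return [row[0] for row in rows]


def label_distribution(conn):
    """Return a dict mapping each label to the number of answers that used it."""
    rows = conn.execute(
        "SELECT label, COUNT(*) FROM answers GROUP BY label ORDER BY label"
    ).fetchall()
    return dict(rows)


print(labeling_progress(conn))   # (3, 4)
print(labeled_tweet_ids(conn))   # [1, 2, 3]
print(label_distribution(conn))  # {'anger': 1, 'joy': 3}
```

Keeping the answers in their own table, one row per answer, means that progress and distribution can be recomputed at any point during the annotation exercise without touching the tweets themselves.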
A schema describes how data is organized and connected; it includes tables, relationships, and other elements. Figure 3.5 shows a simple example of a database schema.
Figure 3.5 – Database schema diagram
The USERS table contains a record for each person who will be labeling. Note...