The C++ Programmer's Mindset
The next challenge is to read the free-form text responses held in .txt files. To do so, we need a new implementation of the FileReader interface defined in the previous chapter, along with some way to extract the date and coordinate data from this unstructured text. Finding strings that match a particular pattern is a job for regular expressions, so we have to think carefully about the patterns we might encounter, such as common date and coordinate formats. As with all regular expressions, this needs to be a methodical process with ample testing.
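To make the idea concrete, here is a minimal sketch of pattern-based extraction using the standard library's std::regex. It is not the chapter's implementation: the Coordinate type and the function names find_iso_dates and find_decimal_coordinate are hypothetical, and the sketch assumes only ISO-style dates (YYYY-MM-DD) and signed decimal-degree pairs, whereas real responses will likely need more patterns.

```cpp
#include <optional>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper type for illustration; not taken from the book.
struct Coordinate {
    double latitude;
    double longitude;
};

// Collects every ISO-style date (YYYY-MM-DD) found in the text.
// Assumption: only this single date format is handled, and the digits
// are not checked for being a valid calendar date.
inline std::vector<std::string> find_iso_dates(const std::string& text) {
    static const std::regex date_pattern(R"(\b(\d{4})-(\d{2})-(\d{2})\b)");
    std::vector<std::string> dates;
    for (std::sregex_iterator it(text.begin(), text.end(), date_pattern), end;
         it != end; ++it) {
        dates.push_back(it->str());
    }
    return dates;
}

// Returns the first signed decimal-degree pair, such as "51.5074, -0.1278".
// Assumption: latitude comes first and the two values are comma-separated.
inline std::optional<Coordinate> find_decimal_coordinate(const std::string& text) {
    static const std::regex coord_pattern(
        R"((-?\d{1,2}\.\d+)\s*,\s*(-?\d{1,3}\.\d+))");
    std::smatch match;
    if (!std::regex_search(text, match, coord_pattern)) {
        return std::nullopt;
    }
    return Coordinate{std::stod(match[1].str()), std::stod(match[2].str())};
}
```

Calling find_iso_dates on a sentence containing two such dates returns both as strings in order of appearance, while find_decimal_coordinate returns std::nullopt when no pair is present. Neither pattern validates ranges (a month of 13 or a latitude of 95 would still match), which is one reason ample testing matters.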
In this chapter, we will construct the regular expressions needed to extract dates and coordinates from unstructured text and devise a mechanism for separating the individual entries held within a single file. Together, these two pieces will define a new FileReader class, which will finish our file ingestion...