So far, we've only been parsing noun phrases. But RegexpParser supports grammars with multiple phrase types, such as verb phrases and prepositional phrases. We can put the rules we've learned to use and define a grammar that can be evaluated against the conll2000 corpus, which has NP, VP, and PP phrases.
Now, we will define a grammar to parse three phrase types. For noun phrases, we have a ChunkRule class that looks for an optional determiner followed by one or more nouns. We then have a MergeRule class for adding an adjective to the front of a noun chunk. For prepositional phrases, we simply chunk any IN word, such as "in" or "on". For verb phrases, we chunk an optional modal word (such as "should") followed by a verb.
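The grammar described above can be sketched roughly as follows. This is a minimal illustration, not necessarily the exact grammar the recipe uses: the tag patterns are one plausible reading of the description, and the sample sentence, the labels variable, and the score_on_conll2000 helper are hypothetical names added for the example. The helper assumes the conll2000 corpus has already been downloaded, so it is defined but not called here.

```python
from nltk.chunk import RegexpParser
from nltk.tree import Tree

# One stage per phrase type. Comments inside the grammar avoid colons,
# because the grammar reader treats a colon as the start of a new stage.
grammar = r'''
NP:
    {<DT>?<NN.*>+}    # chunk an optional determiner followed by nouns
    <JJ>{}<NN.*>      # merge rule with adjective on the left, noun on the right
PP:
    {<IN>}            # chunk any preposition
VP:
    {<MD>?<VB.*>}     # chunk an optional modal followed by a verb
'''
chunker = RegexpParser(grammar)

# A hand-tagged sample sentence (hypothetical, for illustration only).
tagged = [('the', 'DT'), ('dog', 'NN'), ('should', 'MD'), ('sleep', 'VB'),
          ('on', 'IN'), ('the', 'DT'), ('mat', 'NN')]
tree = chunker.parse(tagged)

# Collect the label of each top-level chunk, in sentence order.
labels = [subtree.label() for subtree in tree if isinstance(subtree, Tree)]
print(labels)


def score_on_conll2000(parser):
    """Hypothetical helper that scores a chunker against conll2000.

    Assumes the corpus is installed, for example via nltk.download('conll2000').
    """
    from nltk.corpus import conll2000
    return parser.evaluate(conll2000.chunked_sents())
```

On the sample sentence, the top-level chunk labels should come out as NP, VP, PP, NP. Each stage runs in the order it is listed, so the PP and VP rules only see tokens that the NP stage left unchunked.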