Now that we've seen how to build a few different types of named-entity recognizers (NERs), we can look at how to combine them. In this recipe, we will take a regular expression chunker, a dictionary-based chunker, and an HMM-based chunker, combine their outputs, and look at the overlaps.
We will initialize a few chunkers in the same way we did in the past few recipes and then pass the same text through each of them. The simplest case is that each chunker returns a unique output. For example, consider a sentence such as "President Obama was scheduled to give a speech at the G-8 conference this evening". If we have a person chunker and an organization chunker, we might only get two unique chunks out. However, if we add a Presidents of USA chunker, we will get three chunks: PERSON, ORGANIZATION, and PRESIDENT. This very simple recipe will show us one way to handle these cases.
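To make the overlap idea concrete before the recipe itself, here is a minimal plain-Java sketch that does not use any chunker library. The `Span` class, the `spanOf` and `overlappingPairs` helpers, and the exact text each chunker is assumed to mark (in particular, that the PRESIDENT chunk covers "President Obama") are all hypothetical and only for illustration. The recipe's real chunkers may return different spans.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkOverlapSketch {

    // Hypothetical stand-in for one chunk: a typed character span [start, end).
    static final class Span {
        final int start;
        final int end;
        final String type;

        Span(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        // Two half-open spans overlap when each starts before the other ends.
        boolean overlaps(Span other) {
            return start < other.end && other.start < end;
        }

        @Override
        public String toString() {
            return type + "[" + start + "," + end + ")";
        }
    }

    // Builds a span for the first occurrence of phrase in text.
    static Span spanOf(String text, String phrase, String type) {
        int start = text.indexOf(phrase);
        if (start < 0) {
            throw new IllegalArgumentException("Phrase not found: " + phrase);
        }
        return new Span(start, start + phrase.length(), type);
    }

    // Assumed outputs of three separate chunkers over the same sentence.
    static List<Span> sampleSpans(String text) {
        List<Span> spans = new ArrayList<>();
        spans.add(spanOf(text, "Obama", "PERSON"));
        spans.add(spanOf(text, "G-8", "ORGANIZATION"));
        spans.add(spanOf(text, "President Obama", "PRESIDENT")); // assumption
        return spans;
    }

    // Returns every pair of spans that share at least one character.
    static List<Span[]> overlappingPairs(List<Span> spans) {
        List<Span[]> pairs = new ArrayList<>();
        for (int i = 0; i < spans.size(); i++) {
            for (int j = i + 1; j < spans.size(); j++) {
                if (spans.get(i).overlaps(spans.get(j))) {
                    pairs.add(new Span[] {spans.get(i), spans.get(j)});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        String text = "President Obama was scheduled to give a speech "
                + "at the G-8 conference this evening";
        for (Span[] pair : overlappingPairs(sampleSpans(text))) {
            System.out.println(pair[0] + " overlaps " + pair[1]);
        }
    }
}
```

Under these assumed spans, only the PERSON and PRESIDENT chunks are reported as overlapping, while the ORGANIZATION chunk stands alone. The pairwise comparison is quadratic in the number of chunks, which is fine for a single sentence.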