In this example, we will create a Knowledge Base for movies with two domains: director and title of the movie. Then we will create a matching policy based on these two domains to find duplicate records in an Excel file that contains movie information. Perform the following steps to create a matching policy:
In the DQ client, create a new Knowledge Base and name it PacktPub_Movies_KB. Create two domains in this Knowledge Base: Director and Title. Leave both at their default configuration, with String as the data type. Click on Finish and publish the Knowledge Base.
Now, in the DQ client, under recent Knowledge Bases, click on PacktPub_Movies_KB, and from the pop-up menu, choose Matching Policy.
In the Map step, set the source file type as Excel, choose MoviesSampleData.xlsx, and check Use first row as header. Map Title and Director from the input columns to the Knowledge Base domains and click on Next.
In the Matching Policy tab, create a matching rule. Name it...