Using SSIS fuzzy components
SSIS includes two really sophisticated matching transformations in the data flow. The fuzzy lookup transformation is used for mapping the identities. The fuzzy grouping transformation is used for de-duplicating. Both of them use the same algorithm for comparing the strings and other data.
Identity mapping and de-duplication are actually the same problem. For example, instead for mapping the identities of entities in two tables, you can union all of the data in a single table and then do the de-duplication. Or vice versa, you can join a table to itself and then do identity mapping instead of de-duplication. This recipe shows how to use the fuzzy lookup transformation for identity mapping.
Getting ready
This recipe assumes that you have successfully finished the previous recipe.
How to do it...
- In SSMS, create a new table in the
DQS_STAGING_DATA
database in thedbo
schema and name itdbo.FuzzyMatchingResults
. Use the following code:
CREATE TABLE dbo.FuzzyMatchingResults...