SQL Server Integration Services (SSIS) - Fuzzy Lookup Transformation in SSIS

Posted by Karthikeyan Anbarasan Articles | SQL Server April 24, 2011
In this article we are going to see how to use the Fuzzy Lookup transformation in SSIS. This is part 46 of the series of articles on SSIS.

Introduction:


In this article we are going to see how to use the Fuzzy Lookup transformation in SSIS. Unlike the regular Lookup transformation, which uses an equi-join and finds only exact matches, the Fuzzy Lookup transformation uses fuzzy matching to find the closest matching records in a reference table. Fuzzy Lookup is useful when a large amount of the data is corrupted (misspelled, truncated, or inconsistently entered) and needs to be cleaned up before it is made available to other systems.
Take, for example, a package that fetches details from the customer table and passes them on to other systems. If some customer names are slightly mismatched, we still need to process those records. Fuzzy Lookup matches such values against the reference data according to a similarity threshold, so records that an exact lookup would miss are still picked up and the result is more accurate. Let's jump into how to use this transformation and walk through the steps to configure it.
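To make the difference between an exact lookup and a fuzzy lookup concrete, here is a minimal Python sketch. It is not how SSIS works internally: the scoring uses Python's standard difflib.SequenceMatcher as a stand-in for the SSIS similarity calculation, and the reference names, function names and the 0.8 threshold are assumptions made up for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical reference data standing in for a clean customer table.
REFERENCE_NAMES = ["Karthikeyan Anbarasan", "John Smith", "Maria Garcia"]


def similarity(a, b):
    """Return a score between 0 and 1 (a stand-in for the SSIS scoring)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def exact_lookup(name):
    """Equi-join style behaviour: only an identical value matches."""
    return name if name in REFERENCE_NAMES else None


def fuzzy_lookup(name, threshold=0.8):
    """Return the closest reference name if its score reaches the threshold."""
    best = max(REFERENCE_NAMES, key=lambda ref: similarity(name, ref))
    return best if similarity(name, best) >= threshold else None


if __name__ == "__main__":
    print(exact_lookup("Jhon Smith"))   # the misspelled name finds no exact match
    print(fuzzy_lookup("Jhon Smith"))   # the fuzzy lookup still finds a candidate
```

The misspelled name is rejected by the exact lookup but is still resolved by the fuzzy lookup, which is the behaviour we want when cleaning up customer data.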
You can find my series of articles on SSIS at the URL - http://f5debug.net/all-articles/

Steps:


Follow steps 1 to 3 in my first article to open BIDS and create an Integration Services project. Once the project is created, we will see how to use the Fuzzy Lookup transformation. Open the project, then drag and drop a source and the Fuzzy Lookup transformation onto the data flow, as shown in the image below.
SSISFuzLook1.jpg

The red cross icons on the components indicate that they are not configured yet. Let's configure them in the coming sections. First, configure the source as shown in the image below.
SSISFuzLook2.jpg

Now the source is configured, which means we have data to process in our package. This is the data we want to check for corruption, such as repeated records or values that go against the business rules. Now let's configure the Fuzzy Lookup transformation.
Configure each tab as shown below:
SSISFuzLook3.jpg

Here we have the option to generate a new index or use an existing index. Normally, Fuzzy Lookup builds a match index on the reference table and uses it to find the candidate matches for each input row. If a match index was already stored by an earlier run, we can reuse it instead of creating a new one, which avoids the cost of rebuilding it every time the package runs.
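The idea of building an index once and reusing it can be sketched as follows. This is only an illustration under assumptions: the actual structure of the SSIS match index is internal to the product, and the 3-character grams, the function names and the sample data here are all hypothetical.

```python
from collections import defaultdict


def qgrams(text, q=3):
    """Split text into overlapping lowercase substrings of length q."""
    text = text.lower()
    return {text[i:i + q] for i in range(max(len(text) - q + 1, 1))}


def build_index(reference):
    """Map each q-gram to the set of reference row ids that contain it."""
    index = defaultdict(set)
    for row_id, value in enumerate(reference):
        for gram in qgrams(value):
            index[gram].add(row_id)
    return index


def candidates(value, index):
    """Return the row ids that share at least one q-gram with value."""
    found = set()
    for gram in qgrams(value):
        found |= index.get(gram, set())
    return found


if __name__ == "__main__":
    reference = ["John Smith", "Maria Garcia"]
    index = build_index(reference)           # built once, the expensive step
    print(candidates("Jhon Smith", index))   # reused for every incoming row
```

Building the index is the costly part, so keeping it and reusing it across lookups is what makes the "use an existing index" option attractive for a large reference table.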
SSISFuzLook4.jpg

The above image shows the column mapping: which input column is mapped to which reference table column, and therefore which columns are used for the fuzzy match.
SSISFuzLook5.jpg

The above screen shows the advanced settings for the Fuzzy Lookup transformation, such as the similarity threshold and the maximum number of matches to return for each lookup.
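The effect of these two settings can be sketched in Python. As before, this is an assumption-laden illustration and not the SSIS implementation: difflib.SequenceMatcher stands in for the real similarity score, and the function name, parameter names and sample values are hypothetical.

```python
from difflib import SequenceMatcher


def top_matches(value, reference, threshold=0.8, max_matches=1):
    """Return up to max_matches (candidate, score) pairs scoring >= threshold."""
    scored = [
        (ref, SequenceMatcher(None, value.lower(), ref.lower()).ratio())
        for ref in reference
    ]
    # Drop candidates below the similarity threshold.
    kept = [pair for pair in scored if pair[1] >= threshold]
    # Best score first, then cap the number of matches returned.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:max_matches]


if __name__ == "__main__":
    reference = ["John Smith", "Jon Smith", "Maria Garcia"]
    for name, score in top_matches("Jhon Smith", reference, 0.8, 2):
        print(name, round(score, 2))
```

A higher threshold returns fewer but closer matches, while a lower threshold returns more candidates that then need to be reviewed.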
After finishing the configuration, your screen looks like the image below:
SSISFuzLook6.jpg

When you execute the package (press F5), your screen looks like the image below. This indicates that the package executed successfully.
SSISFuzLook7.jpg

Conclusion:


So in this article we have seen how to use the Fuzzy Lookup transformation and the key configurations needed to put it to use.
