Row Sampling Transformation in SSIS


Introduction:

In this article we are going to see how to use a Row Sampling transformation in SSIS Packaging. Row sampling is used to randomly select some rows and move them as output to the required process as and when required to divide the rows. An example of using this transformation is say example if we want to select some randomly 10 users of a community for a random prize then we can use this transformation. In this process we are going to see an example of how to use this process for the unselected output values.  Let's jump start to see this sample of how to set the properties of the control.

Steps:

Follow steps 1 to 3 in my first article to open the BIDS project and select the right project to work on an integration services project. Once the project is created, we will see how to use the Percentage sampling to see the flow. Now once the project has been opened, drag and drop a source and a Percentage sampling task as shown in the screen below:



Now we can see the red mark on the control which indicates that the controls are yet to be configured. Now let us configure the source (refer to my previous articles on how to configure OLEDB source). Now your screen will look like below:




Now we need to configure the Row Sampling task; double-click the task to open the screen as below:




Here we can specify the number of random rows to be selected and the name for the Sample output selected and unselected. Since in our example we are going to see the Selected and Unselected Output we will see how to use it. Now after specifying the number of rows just click on the OK button. Now drag and drop the Flat File Destination and an Excel destination as shown in the screen below:




Now drag the green line from the Row Sampling to the Flat File destination and the Excel destination task. It will open a window as shown in the screens below:




Here we have selected the Selected output for Excel destination and Unselected output for Flat file destination as the Output to get the data as we can see in the screen above. Now click on OK to do the further configuration of the destinations as shown in the screens below:






Once the configuration is done we can see the screen as below:



Now our package is ready to execute. Press F5 and execute the project; we can see the screen as below which indicates that the package is executed successfully:




We can see out of 91 rows 10 moved randomly to one destination and 81 moved randomly to one destination as we selected the selected rows and the unselected rows as per our requirement.


You can see the result as shown in the screen below.

Unselected Output:


Selected Output:


Conclusion:


So in this article we have seen how to use the Row Sampling (Selected and Unselected Output) to execute dataset and split based on the number of rows and use it across the requirement.


Similar Articles