Listing Unique Records Within an Array in Azure Data Factory

Problem Statement

Is it possible to remove duplicate values from an Array in Azure Data Factory?

Prerequisites

  1. Azure Data Factory

Solution

  1. The “union()” function in ADF returns a collection that has all the items from the specified collections, with each distinct item appearing only once. Passing the same Array as both arguments therefore returns the unique list of values in that Array.
  2. Let’s say we have a list of values containing duplicates in an Array variable named DuplicateArray, with the default value ["A1", "B2", "C3", "A1", "A5", "B2"].
  3. Using a Set Variable activity, we can assign the unique list to a second Array variable, UniqueArray, with this expression:
    @union(variables('DuplicateArray'),variables('DuplicateArray'))
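
The effect of a union of an Array with itself can be sketched outside ADF. The Python snippet below is only an illustrative model, not ADF's implementation: the helper name `union` and the first-occurrence ordering are assumptions made for this sketch, and the sample values are the ones used in this article.

```python
def union(*collections):
    """Model of a union: every distinct item from the inputs, kept once.

    Assumption for this sketch: items keep the order of their first
    occurrence. Items must be hashable (strings and numbers are).
    """
    seen = {}  # dicts preserve insertion order, so this acts as an ordered set
    for collection in collections:
        for item in collection:
            seen.setdefault(item, None)  # later repeats are ignored
    return list(seen)


duplicate_array = ["A1", "B2", "C3", "A1", "A5", "B2"]

# Same idea as @union(variables('DuplicateArray'), variables('DuplicateArray'))
unique_array = union(duplicate_array, duplicate_array)
print(unique_array)  # ['A1', 'B2', 'C3', 'A5']
```

Because a union keeps each distinct item only once, supplying the same collection for both arguments adds nothing new and simply collapses the repeats.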

Output

With the sample values used here, UniqueArray contains the four distinct values A1, B2, C3 and A5.

ADF JSON

{
    "name": "ReturnUnique",
    "properties": {
        "activities": [
            {
                "name": "Remove Duplicates",
                "type": "SetVariable",
                "dependsOn": [],
                "userProperties": [],
                "typeProperties": {
                    "variableName": "UniqueArray",
                    "value": {
                        "value": "@union(variables('DuplicateArray'),variables('DuplicateArray'))",
                        "type": "Expression"
                    }
                }
            }
        ],
        "variables": {
            "DuplicateArray": {
                "type": "Array",
                "defaultValue": [
                    "A1",
                    "B2",
                    "C3",
                    "A1",
                    "A5",
                    "B2"
                ]
            },
            "UniqueArray": {
                "type": "Array"
            }
        },
        "annotations": []
    }
}