Find Duplicates In SharePoint Metadata Database Using PowerShell

Welcome to a blog where we will find the duplicates in SharePoint metadata database, using PowerShell in the central admin.

Sometimes, we need to clean the metadata as the duplicate values are created when the multiple users update the metadata terms in central admin.

Here, I have a simple script for you to run with the minimal inputs from your end to identify the duplicates on your term store.

Let’s see how can we do it.

Steps

  1. Open Windows PowerShell Modules as an administrator.



  2. Paste the code, mentioned below as .ps1 file and execute it. 
  1. #Add the PowerShell snap in code  
  2. Add-PSSnapIn Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue | Out-Null  
  3. #Provide the site collection for the script to refer the Metadata database from the central admin  
  4. $siteCollectionUrl = "http://devtest/"  
  5. $web = "http://devtest/"  
  6. #Read the metadata from SharePoint  
  7. $site =new-object Microsoft.SharePoint.SPSite($siteCollectionUrl)  
  8. $session = New-Object Microsoft.SharePoint.Taxonomy.TaxonomySession($site)  
  9. $termStore = $session.TermStores[0]  
  10. #Load the Term group  
  11. $group = $termStore.Groups["Testgroup"]  
  12. #Load the Term set  
  13. $termSet = $group.TermSets["Testset"]  
  14. $terms = $termSet.GetAllTerms()  
  15. $Array = @()  
  16. $newarray = @()  
  17. $emailarray = @()  
  18. foreach ($term in $terms)  
  19. {  
  20.   $varval = $term.Name  
  21.   $lbl = $term.Labels[1].Value;  
  22.   $Array += $varval        
  23.    }  
  24. #Check for duplicate Test set in SharePoint Metadata database       
  25.   
  26. $ht = @{}  
  27. $Array | foreach {$ht["$_"] += 1}  
  28. $ht.keys | where {$ht["$_"] -gt 1} | foreach {#write-host "Duplicate Sites found $_"  
  29. $newarray += $_  
  30. }   
  31. foreach($obj in $newarray)  
  32. {  
  33. foreach ($term in $terms)  
  34. {  
  35.   if ($term.Labels[1].Value -eq $obj)  
  36.   {  
  37.   $emailarray += $term.Name    
  38. write-host $emailarray  
  39.   }  
  40. }  
  41. }  
Pre-Requisites 
  1. Get the site collection URL.
  2. Get the Web.
  3. Get the term group name
  4. Get the term set, where you want to find the duplicates.

The script, mentioned above, the first gets the group “Test group” and the term “Testset”, where it reads the terms of “Testset” and assigns it to an array, where the array compares and finds the duplicates and results in the $emailarray as an output.

The script is so quick that it will hardly take any effort and time.