Automation of Archiving Large Libraries - Part Two

Introduction 

 
Hi guys, let's improve how my CSOM + PnP PowerShell scripts run so that they execute seamlessly, and then move them to a weekly scheduled automation that can handle all kinds of large libraries.
 
The main focus of this blog is to spare the user from entering login credentials while the PS scripts run, which makes higher-level automation possible. So let's jump into the topic! This is a slightly better approach than the one in my previous blog, so please follow all the prerequisites mentioned there!
 
My auto-archival process includes 4 steps.
 
Document Library Scan of all the Files to be Archived
 
The library scan covers all nested folders and files. It picks up every file whose Archival Flag has been set manually (the query below checks the ArchivalFlag field for 'Yes') OR that is older than the age limit in your archival strategy, such as 1 year or n days; the script below uses 180 days [LibScan.ps1].
    # Scan nested folders and files for the archival process and write them to a CSV file
    Write-Host -f Yellow "Scanning Large Library with Nested Folder and Files Started"

    # Load SharePoint CSOM assemblies
    Add-Type -Path "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.dll"
    Add-Type -Path "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.Runtime.dll"

    # Config parameters
    $SiteURL = "https://sampleharenet.sharepoint.com/sites/classictest"
    $ListName = "PnPCopytoLib"
    $CSVPath = "D:\LibraryDocumentsInventory.csv"

    # Config login details with password protection
    $User = "[email protected]"
    $PWord = ConvertTo-SecureString -String "Contra#0987" -AsPlainText -Force
    $Credential = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $User, $PWord

    Try {
        # Set up the context
        $Ctx = New-Object Microsoft.SharePoint.Client.ClientContext($SiteURL)
        $Ctx.Credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials($Credential.UserName, $Credential.Password)

        # Get the document library
        $List = $Ctx.Web.Lists.GetByTitle($ListName)

        # Cutoff date: files created before it qualify for archival (180 days here)
        $Cutoff = (Get-Date).AddDays(-180).ToString("yyyy-MM-ddTHH:mm:ssZ")

        # CAML query: files only (FSObjType = 0) that are flagged OR older than the cutoff
        $Query = New-Object Microsoft.SharePoint.Client.CamlQuery
        $Query.ViewXml = "<View Scope='RecursiveAll'><Query><Where><And>" +
            "<Eq><FieldRef Name='FSObjType' /><Value Type='Integer'>0</Value></Eq>" +
            "<Or>" +
            "<Eq><FieldRef Name='ArchivalFlag' /><Value Type='Choice'>Yes</Value></Eq>" +
            "<Lt><FieldRef Name='Created' /><Value Type='DateTime' IncludeTimeValue='True'>$Cutoff</Value></Lt>" +
            "</Or></And></Where></Query></View>"

        # Get all matching documents from the library
        $ListItems = $List.GetItems($Query)
        $Ctx.Load($ListItems)
        $Ctx.ExecuteQuery()

        $DataCollection = @()
        # Iterate through each document in the library
        ForEach ($ListItem in $ListItems) {
            # Collect data
            $Data = New-Object PSObject -Property ([Ordered] @{
                FileName    = $ListItem.FieldValues["FileLeafRef"]
                RelativeURL = $ListItem.FieldValues["FileRef"]
                CreatedBy   = $ListItem.FieldValues["Created_x0020_By"]
                CreatedOn   = $ListItem.FieldValues["Created"]
                ModifiedBy  = $ListItem.FieldValues["Modified_x0020_By"]
                ModifiedOn  = $ListItem.FieldValues["Modified"]
                FileSize    = $ListItem.FieldValues["File_x0020_Size"]
            })
            $DataCollection += $Data
        }
        $DataCollection

        # Export documents data to CSV
        $DataCollection | Export-Csv -Path $CSVPath -Force -NoTypeInformation
        Write-Host -f Green "Documents Data Exported to CSV!"
    }
    Catch {
        Write-Host -f Red "Error:"
        $_.Exception.Message
    }
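The query string is the part of the scan that is easiest to get wrong, so here is a minimal sketch that builds it separately and checks that it is well-formed XML before it is handed to CSOM. The function name New-ArchiveCamlQuery and its parameters are my own, hypothetical names; the field names and the 180-day default are taken from the script above.

```powershell
# Hypothetical helper (not part of LibScan.ps1): builds the view XML for a given age limit.
function New-ArchiveCamlQuery {
    param(
        [int] $DaysOld = 180,                    # same cutoff as the scan script
        [string] $FlagField = 'ArchivalFlag',    # field name used by the scan script
        [datetime] $Now = (Get-Date)             # injectable so the result can be checked
    )
    # Quote T and Z so they are emitted literally, and fix the culture so ':' stays ':'
    $cutoff = $Now.AddDays(-$DaysOld).ToString(
        "yyyy-MM-dd'T'HH:mm:ss'Z'", [System.Globalization.CultureInfo]::InvariantCulture)

    # Files only (FSObjType = 0) AND (flagged OR created before the cutoff)
    return "<View Scope='RecursiveAll'><Query><Where><And>" +
        "<Eq><FieldRef Name='FSObjType' /><Value Type='Integer'>0</Value></Eq>" +
        "<Or>" +
        "<Eq><FieldRef Name='$FlagField' /><Value Type='Choice'>Yes</Value></Eq>" +
        "<Lt><FieldRef Name='Created' /><Value Type='DateTime' IncludeTimeValue='True'>$cutoff</Value></Lt>" +
        "</Or></And></Where></Query></View>"
}

# Usage: the [xml] cast throws if the string is not well-formed
$viewXml = New-ArchiveCamlQuery -DaysOld 180
[xml] $parsed = $viewXml
Write-Host "Cutoff used in the query:" $parsed.SelectSingleNode('//Lt/Value').InnerText
```

Taking the current date as a parameter is only there to make the cutoff easy to verify; in the scan script the default is what you would use.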
Output
 
 
The script writes the list of all files to be archived, selected by the Where clause conditions above, to the local path D:\LibraryDocumentsInventory.csv
 
It should contain the following columns:
  • FileName
  • RelativeURL
  • CreatedBy
  • CreatedOn
  • ModifiedBy
  • ModifiedOn
  • FileSize
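Because the later steps read this report by column name, it can help to check the header before copying or deleting anything. The sketch below is an assumption-laden example: the helper Get-MissingInventoryColumn and the sample file are hypothetical, and the sample values are made up.

```powershell
# Columns written by the scan script, as listed above
$ExpectedColumns = 'FileName', 'RelativeURL', 'CreatedBy', 'CreatedOn', 'ModifiedBy', 'ModifiedOn', 'FileSize'

# Hypothetical helper: returns the expected columns that a CSV report does not have.
function Get-MissingInventoryColumn {
    param([string] $CsvPath, [string[]] $Expected)
    $firstRow = Import-Csv -Path $CsvPath | Select-Object -First 1
    if ($null -eq $firstRow) { return $Expected }    # empty report: treat every column as missing
    $actual = $firstRow.PSObject.Properties.Name
    return @($Expected | Where-Object { $_ -notin $actual })
}

# Usage with a throwaway sample report (made-up values)
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'LibraryDocumentsInventory.sample.csv'
[PSCustomObject]@{
    FileName = 'a.docx'; RelativeURL = '/sites/classictest/PnPCopytoLib/a.docx'
    CreatedBy = 'User A'; CreatedOn = '2023-01-01'; ModifiedBy = 'User A'
    ModifiedOn = '2023-01-02'; FileSize = '2048'
} | Export-Csv -Path $sample -Force -NoTypeInformation

$missing = @(Get-MissingInventoryColumn -CsvPath $sample -Expected $ExpectedColumns)
if ($missing.Count -eq 0) { Write-Host -f Green "Report has all expected columns" }
else { Write-Host -f Red "Missing columns:" ($missing -join ', ') }
```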
Copy the Files Marked for Archival Whose Details are Captured in the CSV Report
 
This step copies all the scanned files, with their relative folder paths, from the source library to the archive target library [CopyFiles.ps1].
    # Copy the scanned files, with their folder paths, to the archive library
    Write-Host -f Yellow "Nested Folder and Files Copying Started"

    # Load SharePoint CSOM assemblies
    Add-Type -Path "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.dll"
    Add-Type -Path "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\16\ISAPI\Microsoft.SharePoint.Client.Runtime.dll"

    # Config login details with password protection
    $User = "[email protected]"
    $PWord = ConvertTo-SecureString -String "Contra#0987" -AsPlainText -Force
    $Credential = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $User, $PWord

    # Function to copy a file
    Function Copy-SPOFile([String] $SourceSiteURL, [String] $SourceFileURL, [String] $TargetFileURL) {
        Try {
            # Set up the context against the source site passed in by the caller
            $Ctx = New-Object Microsoft.SharePoint.Client.ClientContext($SourceSiteURL)
            $Ctx.Credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials($Credential.UserName, $Credential.Password)

            # Copy the file, overwriting any existing copy at the target
            $MoveCopyOpt = New-Object Microsoft.SharePoint.Client.MoveCopyOptions
            $Overwrite = $True
            [Microsoft.SharePoint.Client.MoveCopyUtil]::CopyFile($Ctx, $SourceFileURL, $TargetFileURL, $Overwrite, $MoveCopyOpt)
            $Ctx.ExecuteQuery()
            Write-Host -f Green $TargetFileURL " - File Copied Successfully!"
        }
        Catch {
            Write-Host -f Red "Error Copying the File!"
            $_.Exception.Message
        }
    }

    $RootSiteURL = "https://crosssharenet.sharepoint.com"
    $SourceSiteURL = "https://crosssharenet.sharepoint.com/sites/classictest"
    $TargetSiteURL = "https://crosssharenet.sharepoint.com/sites/testsitearchival"
    $SourceSitePath = "/sites/classictest/"
    $TargetSitePath = "/sites/testsitearchival/"
    $SourceDocLibURL = "/PnPCopytoLib"
    $TargetDocLibURL = "/FlowArchiveLib2"
    #$TargetDocLibURL = "/FlowArchiveLib2/April_3rdWeek"

    Connect-PnPOnline -Url $TargetSiteURL -Credentials $Credential

    $CSVPath = "D:\LibraryDocumentsInventory.csv"
    Import-Csv $CSVPath | ForEach-Object {
        $SourceFileURL = $RootSiteURL + $_.RelativeURL

        # Map the source path to the target library and site
        $temp = ($_.RelativeURL).Replace($SourceDocLibURL, $TargetDocLibURL)
        $temp = ($temp).Replace($SourceSitePath, $TargetSitePath)
        $TargetFileURL = $RootSiteURL + $temp

        # Reduce the path to the site-relative folder that holds the file
        $temp = ($temp).Replace($TargetSitePath, "")
        $temp = ($temp).Replace("/" + $_.FileName, "")

        # Create the folder on the target unless the file sits at the library root
        if ($TargetDocLibURL -ne "/" + $temp) {
            Resolve-PnPFolder -SiteRelativePath $temp
        }

        # Call the function to copy the file
        Copy-SPOFile $SourceSiteURL $SourceFileURL $TargetFileURL
    }
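To see what the chain of Replace calls in the loop produces, here is the same path mapping pulled out into a small function that can be run without connecting to SharePoint. Get-ArchiveTarget is a hypothetical name of mine; the site and library values are the ones configured in the script, and the file under Folder1 is a made-up example row.

```powershell
# Hypothetical helper: the same string mapping as the ForEach-Object loop, with no SharePoint calls.
function Get-ArchiveTarget {
    param(
        [string] $RelativeURL, [string] $FileName,
        [string] $RootSiteURL, [string] $SourceSitePath, [string] $TargetSitePath,
        [string] $SourceDocLibURL, [string] $TargetDocLibURL
    )
    # Swap the library name, then the site path
    $path = $RelativeURL.Replace($SourceDocLibURL, $TargetDocLibURL)
    $path = $path.Replace($SourceSitePath, $TargetSitePath)

    # Strip the site path and the file name to get the site-relative folder
    $folder = $path.Replace($TargetSitePath, "")
    $folder = $folder.Replace("/" + $FileName, "")

    [PSCustomObject]@{
        SourceFileURL = $RootSiteURL + $RelativeURL
        TargetFileURL = $RootSiteURL + $path
        TargetFolder  = $folder
        NeedsFolder   = ($TargetDocLibURL -ne ("/" + $folder))   # false for files at the library root
    }
}

# Usage with the values from the script and a made-up nested file
$target = Get-ArchiveTarget -RelativeURL "/sites/classictest/PnPCopytoLib/Folder1/a.docx" -FileName "a.docx" `
    -RootSiteURL "https://crosssharenet.sharepoint.com" `
    -SourceSitePath "/sites/classictest/" -TargetSitePath "/sites/testsitearchival/" `
    -SourceDocLibURL "/PnPCopytoLib" -TargetDocLibURL "/FlowArchiveLib2"

$target.TargetFileURL   # https://crosssharenet.sharepoint.com/sites/testsitearchival/FlowArchiveLib2/Folder1/a.docx
$target.TargetFolder    # FlowArchiveLib2/Folder1
```

Note that Replace swaps every occurrence of the text, so a folder or file name that happens to contain the library name would be rewritten as well.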
Precautions
 
Use only Site Collection Admin/Global Admin login details for the account under which the scripts run.
 
Hard-code the login in the scripts above as a password-protected secure string, as shown, so that no end user has to enter anything manually at run time.
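A variation on the same idea, which is not what the scripts above do, is to save the credential once with Export-Clixml and load it in each script, so the password no longer sits in the script body. This is only a sketch under assumptions: the account name, password placeholder and file location below are hypothetical. On Windows the saved password is encrypted for the current user on the current machine; on other platforms Export-Clixml does not encrypt it.

```powershell
# Hypothetical location; for real use pick a folder only the scheduled account can read
$CredPath = Join-Path ([System.IO.Path]::GetTempPath()) 'archive-credential.xml'

# One-time setup, run as the account that will later run the scheduled scripts
$User = "archive.admin@contoso.onmicrosoft.com"    # hypothetical account
$PWord = ConvertTo-SecureString -String "<your-password>" -AsPlainText -Force
$Credential = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $User, $PWord
$Credential | Export-Clixml -Path $CredPath

# In each scheduled script: load the saved credential instead of hard-coding it
$Credential = Import-Clixml -Path $CredPath
Write-Host "Loaded credential for" $Credential.UserName
```

The loaded $Credential object can be used exactly where the scripts above use theirs, for example with Connect-PnPOnline -Credentials.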
 
Output
 
 
On the target archive library you will find every file that the CSV report lists for archival, created with its metadata properties preserved.
 
Remove all the Archived Files from Source Library after the above copying process is finished:
    # Config login details with password protection
    $User = "[email protected]"
    $PWord = ConvertTo-SecureString -String "Contra#0987" -AsPlainText -Force
    $Credential = New-Object -TypeName System.Management.Automation.PSCredential -ArgumentList $User, $PWord

    $SourceSiteURL = "https://crosssharenet.sharepoint.com/sites/classictest"
    Connect-PnPOnline -Url $SourceSiteURL -Credentials $Credential

    # Remove every file listed in the CSV report from the source library
    $CSVPath = "D:\LibraryDocumentsInventory.csv"
    Import-Csv $CSVPath | ForEach-Object {
        Remove-PnPFile -ServerRelativeUrl $_.RelativeURL -Force
    }
Unify all the above 3 Scripts to Automatically run them on a Scheduled Power Automate Flow/Azure Web Job/Azure Functions
 
For the Flow Approach >> Go to flow.microsoft.com >> Build a flow scheduled to run weekly/monthly that runs the Azure Automation job [created in Azure, with all the above PS code tested in Runbooks] >> Deploy it once so that it runs as per the business needs.
 
For the Azure Web Job Approach, follow here.
 
For the Azure Function Approach, follow here.
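Whichever host runs the job, the three scripts need to run in order, and the delete step should not run if the copy step failed. A minimal wrapper sketch follows; the function name Invoke-ArchivePipeline, the folder D:\Scripts and the file name RemoveFiles.ps1 are hypothetical (only LibScan.ps1 and CopyFiles.ps1 are named in this blog).

```powershell
# Hypothetical wrapper: runs the scripts in order and stops at the first one that fails.
function Invoke-ArchivePipeline {
    param([string[]] $ScriptPaths)
    $ErrorActionPreference = 'Stop'    # make errors inside the called scripts terminating
    foreach ($path in $ScriptPaths) {
        Write-Host -f Yellow "Running $path"
        try {
            # Send the script's output to the host so it does not mix with the return value
            & $path | Out-Host
        }
        catch {
            Write-Host -f Red "Stopped at $($path): $($_.Exception.Message)"
            return $false
        }
    }
    return $true
}

# Usage (paths are assumptions; adjust to where the scripts are saved):
# $ok = Invoke-ArchivePipeline -ScriptPaths 'D:\Scripts\LibScan.ps1', 'D:\Scripts\CopyFiles.ps1', 'D:\Scripts\RemoveFiles.ps1'
# if (-not $ok) { exit 1 }
```

The wrapper can only stop on errors that escape a script. The Catch blocks in the scripts above print the error and carry on, so they would need a throw inside Catch for a failed scan or copy to stop the pipeline.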
 
You can apply the same seamless automated process to large lists that need to be archived, too.
 
Note
Since a completely Flow-based copy of files has a buffer size limit of 100 MB, we use PnP + CSOM PowerShell instead: it can make any number of calls with no pricing, no size limitation, metadata preserved, and so on.
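To get a feel for how many files in a library would run into that 100 MB limit, the FileSize column of the CSV report can be checked. This is a small sketch that assumes FileSize holds the size in bytes; Get-OversizedFile and the two sample rows are hypothetical.

```powershell
# Hypothetical helper: returns the report rows whose FileSize is above the limit.
function Get-OversizedFile {
    param([object[]] $Rows, [long] $LimitBytes = 100MB)
    # CSV values are strings, so cast before comparing (assumption: FileSize is in bytes)
    return @($Rows | Where-Object { [long] $_.FileSize -gt $LimitBytes })
}

# Made-up rows shaped like the CSV report; with a real report use: $rows = Import-Csv $CSVPath
$rows = @(
    [PSCustomObject]@{ FileName = 'small.docx'; FileSize = '2048' }
    [PSCustomObject]@{ FileName = 'video.mp4'; FileSize = '157286400' }
)

$oversized = @(Get-OversizedFile -Rows $rows)
Write-Host "Files above 100 MB:" $oversized.Count
$oversized | ForEach-Object { Write-Host " " $_.FileName }
```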
 
Happy Modern SharePoint Large Library Automated Archiving for your business needs and inside your respective organizations! 
 
Cheers!