SharePoint Online is a cloud-based service that allows users to create, share, and collaborate on documents, websites, and various other content. One of the features of SharePoint Online is version history, a functionality that tracks and stores the changes made to a document or list item over time. Version history can be useful for restoring previous versions, comparing changes, or auditing the document history.
However, version history can also consume a lot of storage space and make the document library slow to load. If you have a large number of documents with many versions, you may want to clean up the version history and keep only the most recent or important versions. This can help you save storage space, improve performance, and reduce clutter.
In this article, we will show you how to use PowerShell to clean up the version history in a SharePoint Online document library. We will use the PnP PowerShell module, which is a set of PowerShell commands that can perform common tasks on SharePoint Online. You can install the PnP PowerShell module from the PowerShell Gallery.
Prerequisites
Before you run the script, you need to have the following:
- A SharePoint Online site with a document library that contains documents with version history. In this example, we will use a site with the URL “https://XXXXXX.sharepoint.com/sites/test” and a document library named “Documents”.
- A user account with permissions to access and modify the SharePoint Online site and the document library. In this example, we will use an account with an email address.
- The PnP PowerShell module installed on your computer. You can install it by running the command Install-Module -Name PnP.PowerShell -Force in a PowerShell window.
- A new column in the document library named “Delete old version”. The column should be of type “Choice” with the options “Yes” and “No”. The script removes old versions only from files marked “Yes” in this column.
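If you would rather create the column from PowerShell than through the browser, the following is a minimal sketch using the Add-PnPField cmdlet. It assumes you have already connected with Connect-PnPOnline. The function name Add-DeleteOldVersionColumn is a hypothetical helper introduced here for illustration, and the internal name Deleteoldversion is chosen to match the field name the script reads later.

```powershell
# Hypothetical helper, not part of the original script.
# Assumes a connection has already been opened with Connect-PnPOnline.
function Add-DeleteOldVersionColumn {
    param(
        [string]$ListName = "Documents"   # library name used in this article
    )

    # Create a Choice column whose internal name matches what the script reads.
    Add-PnPField -List $ListName `
        -DisplayName "Delete old version" `
        -InternalName "Deleteoldversion" `
        -Type Choice `
        -Choices "Yes", "No" `
        -AddToDefaultView
}

# Usage, after connecting to the site:
# Add-DeleteOldVersionColumn -ListName "Documents"
```

Setting the internal name explicitly avoids depending on how SharePoint derives it from the display name.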
The Script
The script below will perform the following steps:
- Define parameters for the site URL, the list name, and the number of previous versions to retain for each file. The current version is always kept in addition to this number.
- Use the Connect-PnPOnline cmdlet to connect to the SharePoint site with interactive authentication. The -Interactive switch opens a browser sign-in and works when MFA is required. -UseWebLogin is an older, cookie-based alternative.
- Use the Get-PnPList cmdlet to get the document library object by its name.
- Use the Get-PnPContext cmdlet to get the current client context object. The script uses it to load objects and execute queries against the SharePoint site (for example, to retrieve items, update metadata, or delete versions).
- Use the Get-PnPListItem cmdlet to get all file items from the document library, excluding folders. The cmdlet also specifies the fields to retrieve, sets a page size for batching, and incorporates a script block to show operation progress.
- Filter items based on file type and the custom “Delete old version” column, which the script reads by the field name Deleteoldversion and which indicates whether old file versions should be deleted. The script stores the filtered items in a variable named $DocumentItems.
- Loop through each item in $DocumentItems and perform the following actions:
  - The script gets the file object and the versions collection from the item object.
  - The script loads both the file and the versions objects into the context object and executes the query to retrieve their properties.
  - The script computes the number of versions to delete by subtracting the number of versions to keep from the total number of versions.
  - If the calculated number of versions to delete is greater than zero, the script proceeds to delete the versions. The default sorting order for versions is from oldest to newest.
  - The script loops through the versions collection and checks whether each version is the current version. If it is, the script logs a message indicating that the current version is being retained, increments the version counter ($VersionCounter), and skips the deletion. Otherwise, it logs the version label and deletes that version.
  - The script executes a query to commit the changes to the SharePoint site.
  - The script displays a message on the console indicating that the version history is cleaned for the file.
  - The script increments the file counter ($Counter) by one.
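The filtering and version-count steps above can be tried without a SharePoint connection. The sketch below uses made-up sample rows in place of real list items, and Get-VersionsToDeleteCount is a hypothetical helper introduced only for this illustration. It applies the same filter and subtraction that the script uses.

```powershell
# Hypothetical helper: how many old versions to remove for one file.
function Get-VersionsToDeleteCount {
    param(
        [int]$VersionsCount,   # previous versions stored for the file
        [int]$VersionsToKeep   # previous versions to retain
    )
    # Never return a negative number when the file has fewer versions than the limit.
    return [Math]::Max(0, $VersionsCount - $VersionsToKeep)
}

# Sample rows standing in for list items (made-up data, not from SharePoint).
$SampleItems = @(
    @{ FileLeafRef = 'Plan.docx';   File_x0020_Type = 'docx'; Deleteoldversion = 'Yes'; Versions = 10 }
    @{ FileLeafRef = 'Budget.xlsx'; File_x0020_Type = 'xlsx'; Deleteoldversion = 'No';  Versions = 8 }
    @{ FileLeafRef = 'Notes.txt';   File_x0020_Type = 'txt';  Deleteoldversion = 'Yes'; Versions = 5 }
    @{ FileLeafRef = 'Deck.pptx';   File_x0020_Type = 'pptx'; Deleteoldversion = 'Yes'; Versions = 2 }
)

# Same filter as the script: Office file types that are marked "Yes".
$Selected = @($SampleItems | Where-Object {
    $_['File_x0020_Type'] -in @('docx', 'xlsx', 'pptx') -and $_['Deleteoldversion'] -eq 'Yes'
})

foreach ($Item in $Selected) {
    $ToDelete = Get-VersionsToDeleteCount -VersionsCount $Item['Versions'] -VersionsToKeep 3
    Write-Host "$($Item['FileLeafRef']): delete $ToDelete of $($Item['Versions']) versions"
}
```

With these sample rows, only Plan.docx and Deck.pptx pass the filter, and only Plan.docx has more than three previous versions, so it is the only file that would lose any.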
The script is as follows:
# Parameters
# Replace "https://xxxxxx.sharepoint.com/sites/test" with your site URL.
$SiteURL = "https://xxxxxx.sharepoint.com/sites/test"
# Replace "Documents" with your library name.
$ListName = "Documents"
# Replace 3 with the number of previous versions you want to keep.
$VersionsToKeep = 3

# Connect to PnP Online
Connect-PnPOnline -Url $SiteURL -Interactive

# Get the document library
$List = Get-PnPList -Identity $ListName

# Get the context
$Ctx = Get-PnPContext

# Get all items from the list in pages, show progress, and keep only files (no folders).
# FileRef is requested as well because the loop below prints it.
$global:Counter = 0
$ListItems = Get-PnPListItem -List $ListName -Fields "FileLeafRef", "FileRef", "File_x0020_Type", "Deleteoldversion" -PageSize 2000 -ScriptBlock {
        Param($items)
        $global:Counter += $items.Count
        Write-Progress -PercentComplete ($global:Counter / ($List.ItemCount) * 100) `
            -Activity "Getting Files of '$($List.Title)'" `
            -Status "Processing Files $global:Counter of $($List.ItemCount)"
    } | Where-Object { $_.FileSystemObjectType -eq "File" }

# Filter for specific file types and for files marked "Yes" in the Delete old version column
$DocumentItems = @($ListItems | Where-Object { $_["File_x0020_Type"] -in ("docx", "xlsx", "pptx") -and $_["Deleteoldversion"] -eq "Yes" })
$TotalFiles = $DocumentItems.Count
$Counter = 1
ForEach ($Item in $DocumentItems)
{
    # Get the file and its versions
    $File = $Item.File
    $Versions = $File.Versions
    $Ctx.Load($File)
    $Ctx.Load($Versions)
    $Ctx.ExecuteQuery()

    Write-Host -f Yellow "Scanning File ($Counter of $TotalFiles):" $Item.FieldValues.FileRef
    $VersionsCount = $Versions.Count
    $VersionsToDelete = $VersionsCount - $VersionsToKeep
    If ($VersionsToDelete -gt 0)
    {
        Write-Host -f Cyan "`t Total Number of Versions of the File:" $VersionsCount
        $VersionCounter = 0
        # Delete versions, oldest first
        For ($i = 0; $i -lt $VersionsToDelete; $i++)
        {
            If ($Versions[$VersionCounter].IsCurrentVersion)
            {
                # Log the retained version before moving past it
                Write-Host -f Magenta "`t`t Retaining Current Major Version:" $Versions[$VersionCounter].VersionLabel
                $VersionCounter++
                Continue
            }
            Write-Host -f Cyan "`t Deleting Version:" $Versions[$VersionCounter].VersionLabel
            $Versions[$VersionCounter].DeleteObject()
        }
        # Commit the deletions to SharePoint
        $Ctx.ExecuteQuery()
        Write-Host -f Green "`t Version History is cleaned for the File:" $File.Name
    }
    $Counter++
}
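One detail of the loop is easy to miss: the index stays at 0 while versions are deleted. The sketch below imitates that behavior with an in-memory list and made-up version labels. It assumes that the client-side versions collection drops an entry as soon as DeleteObject() is called on it, which is the behavior the script appears to rely on. RemoveAt(0) stands in for that call here.

```powershell
# Stand-in for the versions collection, ordered oldest to newest (made-up labels).
$SampleVersions = [System.Collections.Generic.List[string]]::new()
'1.0', '2.0', '3.0', '4.0', '5.0' | ForEach-Object { $SampleVersions.Add($_) }

$SampleVersionsToKeep   = 3
$SampleVersionsToDelete = $SampleVersions.Count - $SampleVersionsToKeep   # 5 - 3 = 2

$Deleted = @()
for ($i = 0; $i -lt $SampleVersionsToDelete; $i++) {
    # Index 0 is always the oldest remaining entry because the list shrinks.
    $Deleted += $SampleVersions[0]
    $SampleVersions.RemoveAt(0)   # assumed to mirror the effect of DeleteObject()
}

Write-Host "Deleted: $($Deleted -join ', ')"
Write-Host "Kept:    $($SampleVersions -join ', ')"
```

Under that assumption, the two oldest labels are removed and the three newest remain, matching a $VersionsToKeep value of 3.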
How to Run the Script
To run the script successfully, follow these steps:
- Open a PowerShell window and navigate to the folder where you saved the script file. In this example, we saved the script file as CleanUpVersionHistory.ps1 in the C:\Scripts folder.
- Run the script by typing .\CleanUpVersionHistory.ps1 and pressing Enter.
- When prompted, sign in with the user account's credentials to connect to the SharePoint Online site. In this example, we enter the username (email) and the password.
- Wait for the script to complete and check the output. You should see the progress of the script and the messages indicating which files and versions are scanned, deleted, or retained.
- Verify the results by opening the document library in SharePoint Online and checking the version history of the files. You should see that only the specified number of versions are kept, and the rest are deleted.
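The final verification step can also be done from PowerShell. The following is a minimal sketch that assumes an open connection and uses the Get-PnPFileVersion cmdlet to count the previous versions still stored for one file. Test-VersionLimit and the file URL in the usage comment are hypothetical names chosen for illustration.

```powershell
# Hypothetical helper, not part of the original script.
# Assumes a connection has already been opened with Connect-PnPOnline.
function Test-VersionLimit {
    param(
        [Parameter(Mandatory)][string]$FileUrl,   # server-relative URL of the file
        [int]$VersionsToKeep = 3                  # same limit as in the script
    )

    # Collect the previous versions that remain for the file.
    $Remaining = @(Get-PnPFileVersion -Url $FileUrl)
    Write-Host "$FileUrl has $($Remaining.Count) previous version(s)"

    # True when the file is at or below the limit.
    return ($Remaining.Count -le $VersionsToKeep)
}

# Usage, after connecting to the site (example path is an assumption):
# Test-VersionLimit -FileUrl "/sites/test/Shared Documents/Plan.docx" -VersionsToKeep 3
```

A result of True for a file that was marked “Yes” suggests the cleanup worked for that file.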
Conclusion
In this article, we showed you how to use PowerShell to clean up the version history in a SharePoint Online document library. This can help you save storage space, improve performance, and declutter your library. You can modify this user-friendly script to meet your needs and preferences, such as changing the parameters, filtering the files, or logging the results.
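As one example of the logging modification mentioned above, the sketch below collects one row per file and writes the rows to a CSV file. The sample rows, the column names, and the output location are assumptions made for illustration. In the real script, the rows would be added inside the ForEach loop.

```powershell
# Hypothetical log rows; in the script these would be added inside the ForEach loop.
$Log = [System.Collections.Generic.List[object]]::new()
$Log.Add([PSCustomObject]@{ File = '/sites/test/Shared Documents/Plan.docx'; VersionsBefore = 10; VersionsDeleted = 7 })
$Log.Add([PSCustomObject]@{ File = '/sites/test/Shared Documents/Deck.pptx'; VersionsBefore = 2;  VersionsDeleted = 0 })

# Assumed output location: the current user's temp folder.
$LogPath = Join-Path ([System.IO.Path]::GetTempPath()) 'VersionCleanupLog.csv'

# Write the rows as CSV without the type header line.
$Log | Export-Csv -Path $LogPath -NoTypeInformation
Write-Host "Log written to $LogPath"
```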