Removing Unneeded Sitecore Versions with SPE

versions-header

It’s been a while since I had to look how best to manage Sitecore versions, but after seeing some performance issues related to lots of versions for some items on the CM this week It was time to revisit removing unneeded Sitecore versions.

Research

Amusingly my search turned up an SSE post with an answer from myself from about 5 years ago: https://sitecore.stackexchange.com/questions/507/limiting-version-numbers-copying-old-versions-to-archive-for-easy-access/525

Screenshot 2022-05-24 231555

Further Googling shows not much seems to have changed in that time. Some modules are however no longer supported or are compatible with Sitecore 9. I did find some links that gave me a few ideas though.

Our requirements were to find any items with more than 10 versions and then remove any versions other than the most recent ten unless an version is the currently published version or the version has been updated on the past 3 months. The idea behind this is that we wanted to remove only versions that were old and not updated recently but leave versions that are recent even if there are a lot of them for actively edited pieces of content.

The Script

Given the various modules available for managing versions are out of date and we only needed to this on an adhoc basis I decided to turn to trusty SPE to write a report which initially reported on items with 10 or more versions and then listed how many items to keep and how many to remove.

I was aware that Out of the box SPE has an report for item versions, so I used this as a basis for mine but added the check for if the version is the published version or has been edited within the past 3 months.

I then expanded on this to generate detailed reports of all the versions to keep and versions to remove.

Lastly added some options within the report to allow the item versions to be removed after the report has run an also if to archive the items to disk and/or the archive database table.

report-versions

The result of this is that I could just run this in report mode first for analysis and then with the ‘Archive Versions & Generate Reports’ option selected to also remove the un-needed items.

Here is the script:

If you wish to change the number of months to a longer or shorter period you can update this variable value: $months = -3

When running the report with ‘Archive Versions & Generate Reports’ option selected it will ask you to confirm before removing items to be on the safe side:

confirm-remove

Note that in newer versions of SPE the Remove-ItemVersion command archives items by default. However in the version we were using (5.1) this doesn’t happen so I added code to archive to the database before removing the item version.

Results

After the report has run an summary of results is shown and they can be downloaded in excel or csv format etc.

report-results

2 Other Spreadsheets are also automatically created and downloaded:

1. Versions to Keep – all the versions that will be kept for each item (these should correspond with the report above)

2. Versions to Remove – all the versions that should be removed for each item (these should correspond with the report above)

Running this in our production environment (after extensive testing) resulted in over 6,000 un-needed item versions being removed and over 7,000 being kept. This took around 40 minutes to run in total and the logs show info on what is happening.

Nearly halving our item versions means we should see some performance benefits and also it tidies up our database.

This also provides the added benefit of 3 reports we can reference if needed and the ability to restore versions from the archive if we need to in future for whatever reason. We plan to empty the archive in 6 months or so when we are sure these versions are not needed.

Hopefully someone else will find these scripts useful for managing item versions in Sitecore too. Feel free to edit it for your own use-case, please test carefully in a non-production environment though before running it.

Useful Links

There were some useful links I found regarding managing versions, including an SitecoreVersionPruner SPE script that I found after I’d written mine which looks really good:

https://github.com/danielrolfe/SitecoreVersionPruner

https://thesitecorist.net/2019/05/11/removing-old-versions-using-sitecore-scheduled-tasks/

https://briancaos.wordpress.com/2019/03/01/sitecore-delete-old-item-versions/

https://www.logicalfeed.com/posts/1195/sitecore-powershell-script-to-remove-all-language-versions-except-one

https://robearlam.com/blog/extending-sitecore-revolver-to-remove-old-versions

Removing Unneeded Tags from RTE fields with Sitecore PowerShell Extensions

bacon-575334_1280

This week I had an issue where some items imported from another system into Sitecore had invalid HTML in some of their RTE fields. These tags were breaking the SXA page content component that was being used.

I needed to a quick way to remove these tags without having to re-import the content. As usual SPE came to the rescue again and I was able to correct 100s of items in seconds.

The script has a function which takes an item and field name as a parameter:

Strip-InvalidTags -item $item -fieldName "Details"

I’m getting a number of profile items and calling this function in a loop. The function uses HTMLAgilityPack to look for ‘Script’ tag nodes (but can easily be modified to pick up different tags or multiple tags). If the script finds the tags is strips them and then saves the modified html back to the field.

Here is the script that I created:

Hopefully it’s useful for others who might need to do this.

Thanks again to the community for some useful ideas: https://sitecore.stackexchange.com/questions/20795/using-powershell-extensions-to-remove-empty-p-tags-from-all-rich-text-fields/20802