When we implemented the versioning feature, we anticipated that many operations on versioned objects would be based on previous versions. For example, a COPY operation on a previous version creates a new version. At that time, we chose to allow EC segments to be reused between versions. A COPY of an EC version therefore creates a new version, but only a new manifest is written; the existing segments are shared by both versions. For many use cases, this segment reuse saves space in the cluster when EC is used with versioning.
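To make the segment reuse concrete, here is a minimal sketch in Python. It is an illustration only, not the product's implementation: the `Manifest` class, `copy_version`, `stored_segments`, and the segment IDs are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Set, Tuple


@dataclass(frozen=True)
class Manifest:
    """Hypothetical stand-in for an EC manifest: it only lists segment IDs."""
    version_id: str
    segment_ids: Tuple[str, ...]


def copy_version(source: Manifest, new_version_id: str) -> Manifest:
    """COPY creates a new version by writing a new manifest only.

    The new manifest points at the same segments as the source version,
    so no segment data is duplicated.
    """
    return Manifest(version_id=new_version_id, segment_ids=source.segment_ids)


def stored_segments(manifests: Iterable[Manifest]) -> Set[str]:
    """Distinct segments the cluster must hold for the given manifests."""
    return {seg for m in manifests for seg in m.segment_ids}


v1 = Manifest("v1", ("seg-a", "seg-b", "seg-c"))
v2 = copy_version(v1, "v2")

# Two versions exist, but they share one set of three segments.
segment_count = len(stored_segments([v1, v2]))
print(segment_count)
```

The design consequence is that a segment can only be removed once no manifest, in any version, still lists it.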

Everything comes at a cost. In this case, the health processor (HP) needs to do more work to determine whether any manifest version still references a particular segment. Without versioning, the presence of a “final” delete marker is a sufficient signal that no manifest references the segment. With versioning, there is some doubt: a version of the manifest may not have been linked properly in the versioning chain, may have been offline at the wrong time, or may exist but could not be found because of a network error. To guard against this, we age out segments: a segment is deleted only after some number of consecutive examinations find no manifest that references it. As a result, segment reclamation can take 5 or more HP cycles in a versioned bucket. For many clusters, this may take a year or more.
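The age-out behavior can be sketched as a per-segment counter of consecutive examinations that found no referencing manifest. This is a simplified model under stated assumptions, not the health processor's actual logic: `run_cycle`, `miss_counts`, and the reset-on-reference rule are hypothetical, and the threshold of 5 is taken from the "5 or more HP cycles" figure above purely for illustration.

```python
from typing import Dict, Iterable, Set

# Assumed value for illustration, based on the "5 or more HP cycles" figure.
AGE_OUT_THRESHOLD = 5


def run_cycle(
    segments: Iterable[str],
    referenced: Set[str],
    miss_counts: Dict[str, int],
    threshold: int = AGE_OUT_THRESHOLD,
) -> Set[str]:
    """One hypothetical HP pass; returns the segments that may be deleted.

    A segment is reclaimed only after `threshold` consecutive passes in
    which no manifest was found that references it.
    """
    reclaimed = set()
    for seg in segments:
        if seg in referenced:
            # A manifest was found: the streak of misses starts over.
            miss_counts[seg] = 0
        else:
            miss_counts[seg] = miss_counts.get(seg, 0) + 1
            if miss_counts[seg] >= threshold:
                reclaimed.add(seg)
    return reclaimed


# Simulated history: no manifest is found for 3 cycles, a manifest turns up
# in cycle 4 (for example, a node came back online), then nothing is found.
history = [set(), set(), set(), {"seg-a"}, set(), set(), set(), set(), set()]
miss_counts: Dict[str, int] = {}
reclaimed_at = None
for cycle, referenced in enumerate(history, start=1):
    if run_cycle(["seg-a"], referenced, miss_counts):
        reclaimed_at = cycle
        break

print(reclaimed_at)  # 9: the streak restarted at cycle 4, so 5 more cycles were needed
```

In this model, a single cycle in which a manifest is found resets the count, which is why reclamation time is measured in multiples of the HP cycle length rather than in a fixed interval.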

This segment cleanup issue disproportionately impacts clusters with a high turnover of objects in one or more versioned buckets. Most Veeam buckets fit this profile.

We don’t have a silver bullet solution here, but there are a number of things to consider and monitor.