In SWAR-9430, we discovered that versions of Swarm prior to 11.1 (when we moved to Python 3) URL-encode certain characters differently. Tilde (~) is one of the impacted characters. As a result, objects with tildes in their names that were written prior to 11.1 may not be accessible by name in releases 11.1 or later. For those interested in the technical details, Python moved from RFC 2396 to RFC 3986 for quoting URL strings, so a lookup of a name containing a tilde will not find objects whose names were stored with the older encoding.
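The quoting difference can be illustrated with the Python standard library. This is a minimal sketch, assuming Python 3.7 or later; the object name and the helper quote_rfc2396_style are hypothetical, and the helper only imitates the older behavior for the tilde. It is not Swarm's internal code path.

```python
from urllib.parse import quote

def quote_rfc2396_style(name: str) -> str:
    """Hypothetical helper: imitate the older quoting, which escaped '~'."""
    return quote(name, safe="/").replace("~", "%7E")

# Hypothetical object name containing a tilde.
name = "mydir/report~v2.html"

# Python 3.7+ follows RFC 3986, where '~' is unreserved and left as-is.
new_style = quote(name, safe="/")

# The older style escaped the tilde as %7E.
old_style = quote_rfc2396_style(name)

print(new_style)
print(old_style)
```

The two strings differ only in how the tilde is written, which is enough for a by-name lookup to miss the stored object.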
These impacted objects are still valid Swarm objects, fully protected by the cluster, but they may not be accessible by name. This note gives instructions on how to make the objects accessible by name again. The instructions assume that you communicate directly with the Swarm cluster, using the IP address of a node in the cluster. This procedure also requires Swarm 15.0 or later.
Generally, you will know the name of the expected object, and that name will appear in a bucket listing query. If you run a listing, find the "hash" parameter for the object. That is the object's etag.
Use the etag=true query argument. Look for the Castor-System-Name header and the Castor-System-CID header. If the object is invisible due to SWAR-9430, the Castor-System-Name header will likely have "%7E" in it in lieu of the expected ~.
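As an illustration of the header check, the following sketch parses the text of an HTTP response header block and looks for the escaped tilde. The sample response, the object name in it, and the helper names parse_headers and looks_impacted are assumptions made for this example; only the two Castor-System header names come from this note.

```python
def parse_headers(raw: str) -> dict:
    """Parse 'Key: value' header lines into a dict with lowercase keys."""
    headers = {}
    for line in raw.splitlines():
        # Skip the status line and blank lines.
        if line.startswith("HTTP/") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        headers[key.strip().lower()] = value.strip()
    return headers

def looks_impacted(headers: dict) -> bool:
    """True if the stored name contains an escaped tilde (%7E)."""
    return "%7e" in headers.get("castor-system-name", "").lower()

# Hypothetical response text, for illustration only.
sample = (
    "HTTP/1.1 200 OK\r\n"
    "Castor-System-CID: d389dee12f943e6488e7c4db988c831c\r\n"
    "Castor-System-Name: mydir/hello%7Eworld.html\r\n"
    "\r\n"
)

headers = parse_headers(sample)
print(headers["castor-system-cid"])
print(looks_impacted(headers))
```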
The NID is a uuid-like value that is computed for a named object. Use the pattern below in python3 to determine the NID for your situation.
python3
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> name = "mydir/hello%20world.html"
>>> cid = "d389dee12f943e6488e7c4db988c831c"
>>> import hashlib
>>> import binascii
>>> binascii.hexlify(hashlib.md5((cid + name).encode()).digest()).decode()
'6504693e2cef672c1c5d316002b98991'
In the example above, Castor-System-CID = d389dee12f943e6488e7c4db988c831c and Castor-System-Name = mydir/hello%20world.html (which does NOT have the tilde issue). The NID is 6504693e2cef672c1c5d316002b98991.
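The same computation can be wrapped in a small function. This is a sketch only: compute_nid is a hypothetical name, and the two tilde names are made-up examples. It uses hashlib's hexdigest(), which returns the same string as hexlifying digest() in the session above. The assumption here is that the name to hash is the Castor-System-Name value exactly as stored, including any %7E.

```python
import binascii
import hashlib

def compute_nid(cid: str, name: str) -> str:
    """MD5 of the CID concatenated with the stored name, as 32 hex digits."""
    return hashlib.md5((cid + name).encode()).hexdigest()

cid = "d389dee12f943e6488e7c4db988c831c"

# Values from the example above.
nid_example = compute_nid(cid, "mydir/hello%20world.html")

# Hypothetical impacted object: the stored name carries %7E, while a
# by-name request would use a literal tilde, giving a different NID.
nid_stored = compute_nid(cid, "mydir/hello%7Eworld.html")
nid_literal = compute_nid(cid, "mydir/hello~world.html")

print(nid_example)
print(nid_stored != nid_literal)
```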
A script in the Support bundle can also help collect the NID if you only know the name of the stream. Update your support tools (./updateBundle.sh) in your Support tools directory, then run the script with the proper SCSP_HOST (any Swarm node), bucket, filename, and domain. Python 3 and requests (pip3 install requests) must be installed on the system where you run it.
python3 nidtool.py -n "$SCSP_HOST/BUCKETNAME/FILENAME?domain=DOMAIN"
curl -I --location-trusted "http://192.168.1.11/6504693e2cef672c1c5d316002b98991?isnid"
Note the isnid query argument. Verify that the response has the expected etag.
curl -i -X COPY --location-trusted "http://192.168.1.11/6504693e2cef672c1c5d316002b98991?isnid&newname=newname-with-~&preserve"
Note the isnid query argument. The newname query argument gives the new name of this object. Do not perform URL encoding here! Finally, the preserve query argument keeps all of the headers of the original object. This request should give a 201 response.
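When many objects need renaming, the COPY URL can be assembled in a script. A minimal sketch follows; build_copy_url and copy_command are hypothetical helper names, and the node address, NID, and new name are the example values from this note. The new name is inserted verbatim, in line with the instruction not to URL-encode it.

```python
def build_copy_url(node: str, nid: str, newname: str) -> str:
    """Build the COPY-by-NID URL; newname is deliberately left unencoded."""
    return f"http://{node}/{nid}?isnid&newname={newname}&preserve"

def copy_command(url: str) -> list:
    """Argument list for the curl COPY request shown above."""
    return ["curl", "-i", "-X", "COPY", "--location-trusted", url]

url = build_copy_url(
    "192.168.1.11",
    "6504693e2cef672c1c5d316002b98991",
    "newname-with-~",
)
print(" ".join(copy_command(url)))
```

Passing the result of copy_command to subprocess.run avoids shell quoting issues with the & characters in the URL.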
curl -I --location-trusted "http://192.168.1.11/mybucket/newname-with-~?domain=mydomain"
If the object does not appear in the listing, perform another COPY (this time by name, with the preserve query argument) on the object. Check the listing again. The object should now exist with the proper name and be listed.