Content Integrity Assurance

Swarm lets applications obtain and validate integrity guarantees on stored data. Integrity is an independently verifiable guarantee that the data returned for a given name or UUID is the same data that was stored under that name or UUID, perhaps many months or years earlier. The guarantee is produced by hashing the data with a cryptographic hash algorithm.

Content metadata is not included in the hash. If the application stores the name or UUID together with the associated hash value, it can later use them to verify that the content has not changed, whether accidentally or maliciously.

Integrity Seals

An integrity seal is a URL containing the object name or UUID, the hash value, and the type of hash algorithm used to compute it.

Direct to Swarm

Integrity Seal upgrades cannot be performed through Content Gateway. Request them directly from the back-end Swarm cluster.

An application can request an integrity seal when it performs a WRITE by including a hashtype query string.

Example of a Hashtype Request
POST http://company.cluster.com/?hashtype=md5 HTTP/1.1

These are the current allowable hash types:

  • md5

  • sha1

  • sha256

  • sha384

  • sha512
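A WRITE URL carrying the hashtype argument could be assembled as in the following minimal Python sketch. The function name write_url_with_hashtype and the constant ALLOWED_HASH_TYPES are hypothetical names introduced for illustration; they are not part of any Swarm SDK. The sketch only builds the URL and does not send the request.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# Hash types listed in this section.
ALLOWED_HASH_TYPES = ("md5", "sha1", "sha256", "sha384", "sha512")


def write_url_with_hashtype(cluster_url: str, hashtype: str) -> str:
    """Return cluster_url with a hashtype query argument added (hypothetical helper)."""
    if hashtype not in ALLOWED_HASH_TYPES:
        raise ValueError(f"unsupported hash type: {hashtype!r}")
    parts = urlsplit(cluster_url)
    # Keep any query arguments already present on the URL.
    query = f"{parts.query}&" if parts.query else ""
    query += urlencode({"hashtype": hashtype})
    return urlunsplit((parts.scheme, parts.netloc, parts.path or "/", query, parts.fragment))


url = write_url_with_hashtype("http://company.cluster.com/", "sha256")
```

Rejecting unknown hash types on the client side is a design choice of this sketch, so that a typo is caught before the object is written.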

After creating the object and assigning a name or UUID, Swarm replies with a 201 (Created) response that includes a Location header with a URL that can later be used to retrieve the data.
In addition to the host and the name or UUID, the URL includes the hash type and the hash value computed from the content object. This URL, which carries the triple of name or UUID, hash type, and hash, is known as the content object integrity seal.

Example of a complete integrity seal embedded in a Location header
Location: http://129.69.251.143/41A140B5271DC8D22FF8D027176A0821?hashtype=md5&hash=7A25E6067904EAC8002498CF1AE33023
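An application that stores seals will usually want the three components separately. The following Python sketch splits a seal URL into its triple using only the standard library. IntegritySeal and parse_seal are hypothetical names chosen for this illustration, and the seal is the one from the example, written as a single unbroken string.

```python
from typing import NamedTuple
from urllib.parse import parse_qs, urlsplit


class IntegritySeal(NamedTuple):
    name: str      # object name or UUID (URL path without the leading slash)
    hashtype: str  # for example "md5"
    digest: str    # hash value as it appears in the seal


def parse_seal(location: str) -> IntegritySeal:
    """Split a seal URL into name or UUID, hash type, and hash (hypothetical helper)."""
    parts = urlsplit(location.strip())
    query = parse_qs(parts.query)
    try:
        hashtype = query["hashtype"][0]
        digest = query["hash"][0]
    except KeyError as missing:
        raise ValueError(f"not an integrity seal, missing {missing}") from None
    return IntegritySeal(parts.path.lstrip("/"), hashtype, digest)


seal = parse_seal(
    "http://129.69.251.143/41A140B5271DC8D22FF8D027176A0821"
    "?hashtype=md5&hash=7A25E6067904EAC8002498CF1AE33023"
)
```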

Validating Reads

An integrity seal can be used in a subsequent READ request to validate the data stored in a storage cluster (any cluster). By supplying the URL returned in the Location header from the WRITE request (perhaps replacing the host address if connected to a different cluster or node), the application can ask Swarm to validate while reading the data.

Example of validation with read
GET http://129.69.251.143/41A140B5271DC8D22FF8D027176A0821?hashtype=md5&hash=7A25E6067904EAC8002498CF1AE33023 HTTP/1.1
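When the validating READ is sent to a different cluster or node, only the host part of the seal URL changes. A minimal Python sketch of that substitution follows; rehost_seal is a hypothetical helper name, and company.cluster.com stands in for whatever host the application is actually connected to.

```python
from urllib.parse import urlsplit, urlunsplit


def rehost_seal(seal_url: str, new_host: str) -> str:
    """Return the seal URL with its host replaced (hypothetical helper).

    The name or UUID, hash type, and hash are left untouched.
    """
    parts = urlsplit(seal_url)
    return urlunsplit(parts._replace(netloc=new_host))


read_url = rehost_seal(
    "http://129.69.251.143/41A140B5271DC8D22FF8D027176A0821"
    "?hashtype=md5&hash=7A25E6067904EAC8002498CF1AE33023",
    "company.cluster.com",
)
```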

When Swarm receives such a READ request, it recomputes the hash of the stored content using the supplied hash type and compares the computed hash with the hash value in the integrity seal.

  • Match - The hashes match if the content was not modified or corrupted. Swarm returns the object with the computed digest as a trailing Location header.

  • No match - If the two values do not match, Swarm drops the connection before sending the object content at the end of the request.

Because the hash algorithms are published and well known, users and third parties can independently validate an object stored by Swarm by reading its contents, computing the hash value, and comparing it with the hash value in the seal. If an integrity seal is published when it is created, anyone can later verify that the stored content has not been modified and that it has always been associated with the same UUID.
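The independent check described above can be sketched in Python with the standard hashlib module. The sketch assumes that the hash value in the seal is the hexadecimal digest of the raw content bytes, compared without regard to letter case; content_matches_seal is a hypothetical name, and the sample content and host are invented for illustration.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlsplit


def content_matches_seal(content: bytes, seal_url: str) -> bool:
    """Recompute the hash of content and compare it with the seal (hypothetical helper).

    Assumption: the seal carries the hex digest of the raw content bytes.
    """
    query = parse_qs(urlsplit(seal_url).query)
    hashtype = query["hashtype"][0]
    expected = query["hash"][0]
    computed = hashlib.new(hashtype, content).hexdigest()
    # Seals in this section use uppercase hex; hexdigest() returns lowercase.
    return hmac.compare_digest(computed.lower(), expected.lower())


# Invented sample data: build a seal for known content, then verify it.
sample = b"example object content"
sample_seal = (
    "http://cluster.example/sample-object?hashtype=sha256&hash="
    + hashlib.sha256(sample).hexdigest().upper()
)
is_intact = content_matches_seal(sample, sample_seal)
```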

Important

Range headers are not compatible with integrity seals. The connection may be closed prematurely if the seal is incorrect.

Application Initiated Hash Upgrading

Occasionally, cryptographers and mathematicians defeat a cryptographic algorithm, making it possible for hackers to generate different content that has exactly the same hash value as previously stored content. This has happened with the md5 and sha1 algorithms, but not with the sha256, sha384, or sha512 algorithms.

Unlike other fixed content storage solutions, Swarm allows a user or application to upgrade the hash algorithm of an individual existing integrity seal. This is performed by issuing a READ request with the name or UUID, the current hash type and hash, and a different, presumably stronger, hash type in the newhashtype query parameter.

Important

Upgrade the hash promptly before any exploit of the old algorithm becomes well known and available.

Example of hash upgrading

This READ request first validates the given integrity seal, then reseals the content using the new, upgraded hash algorithm (sha256 in the example). If the object validates properly, Swarm returns a new integrity seal with the new hash type and hash value. If the requested object fails to validate against the integrity seal, Swarm sends a 200 OK response but drops the connection prior to sending the object content.
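An upgrade request URL might be assembled from an existing seal as in the following Python sketch. It appends the newhashtype parameter described above to the seal from the earlier examples. upgrade_request_url is a hypothetical helper name, and the sketch only builds the URL; it does not contact a cluster.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def upgrade_request_url(seal_url: str, newhashtype: str) -> str:
    """Add newhashtype to an existing seal URL (hypothetical helper).

    The current hashtype and hash are kept so that the existing seal
    can be validated before the content is resealed.
    """
    parts = urlsplit(seal_url)
    # Drop any earlier newhashtype so the parameter appears only once.
    params = [(k, v) for k, v in parse_qsl(parts.query) if k != "newhashtype"]
    params.append(("newhashtype", newhashtype))
    return urlunsplit(parts._replace(query=urlencode(params)))


upgrade_url = upgrade_request_url(
    "http://129.69.251.143/41A140B5271DC8D22FF8D027176A0821"
    "?hashtype=md5&hash=7A25E6067904EAC8002498CF1AE33023",
    "sha256",
)
```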

© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.