Content-MD5 checksums provide an end-to-end integrity check of the content (excluding metadata) as it is sent to and returned from Swarm. A proxy or client can check the Content-MD5 header to detect modification of the entity body in transit. A client can also supply this header to have Swarm compute and verify the checksum as it stores or returns the object data.

See SCSP Headers.

Client-Provided Content-MD5

During a POST or PUT, the client can provide the following Content-MD5 header as specified in section 14.15 of the HTTP/1.1 RFC:

Content-MD5 = "Content-MD5" ":" md5-digest

Where md5-digest is the base64 of the 128-bit MD5 digest (See RFC 1864 for more information).

The md5-digest is computed based on the content of the entity body, including any content coding applied, but not including any transfer-encoding applied to the message body.
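As a minimal sketch of the computation described above, the following Python snippet derives the header value from an entity body using only the standard library. The function name content_md5 is illustrative and is not part of any Swarm client library.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return the Content-MD5 header value for an entity body.

    The value is the base64 encoding of the raw 16-byte MD5 digest
    (RFC 1864), not the more familiar 32-character hex string.
    """
    digest = hashlib.md5(body).digest()  # 16 raw bytes (128 bits)
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Print the header line a client would send with this body.
    print("Content-MD5:", content_md5(b"example entity body"))
```

The digest must be taken over the bytes exactly as they are sent in the entity body, so any content coding (for example gzip) is applied before hashing.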

Swarm-Provided Content-MD5

Another way to associate a Content-MD5 value with an object is to have Swarm compute the Content-MD5 for the body data of the request. To do so, include the gencontentmd5 query argument in the request. Swarm returns the Content-MD5 as a header in the 201 Created response. Once computed, the Content-MD5 value is stored with the object and returned as a response header on any subsequent GET or HEAD request. Note: the gencontentmd5 query argument replaces use of the "Expect: Content-MD5" request header, which is deprecated per RFC 7231. (v9.2)
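The following sketch shows one way a client might add the gencontentmd5 query argument to a write URL. It assumes the argument is sent without a value (as in ?gencontentmd5); the helper name with_gencontentmd5 and the host swarm.example.com are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def with_gencontentmd5(url: str) -> str:
    """Return url with the valueless gencontentmd5 query argument appended.

    Existing query arguments are preserved. The argument is assumed to be
    a bare flag, so it is appended without "=value".
    """
    parts = urlsplit(url)
    query = f"{parts.query}&gencontentmd5" if parts.query else "gencontentmd5"
    return urlunsplit(parts._replace(query=query))


if __name__ == "__main__":
    # Hypothetical host and object name, for illustration only.
    print(with_gencontentmd5("http://swarm.example.com/bucket/object.bin"))
```

The resulting URL would be used for the POST or PUT, and the Content-MD5 that Swarm computed would then be read from the headers of the 201 Created response.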

Tip

The Swarm setting scsp.autoContentMD5Computation automates Content-MD5 hashing, so writes need not include the gencontentmd5 query argument or the deprecated Expect: Content-MD5 header (although you may still want to supply a separate Content-MD5 header for content integrity checking). This setting is ignored wherever it is invalid, such as on a multipart initiate/complete or an EC APPEND. (v9.1)

Ranges - When ?gencontentmd5 is included on a GET request with a Range header, any Content-MD5 header stored with the object is omitted from the response headers. Instead, a Content-MD5 of the selected range is returned as a trailing header on the GET response.

For details about Range headers, see section 14.35 (Range) in the HTTP/1.1 RFC.
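To illustrate what a range checksum covers, the sketch below computes the base64 MD5 of a byte range from a local copy of the object. It assumes the trailing header uses the same base64 format as the stored Content-MD5 and that the range follows the inclusive "bytes=first-last" form; range_content_md5 is an illustrative name.

```python
import base64
import hashlib


def range_content_md5(body: bytes, first: int, last: int) -> str:
    """Base64 MD5 of bytes first..last of body, both positions inclusive.

    Mirrors "Range: bytes=first-last", where the last byte position is
    part of the selected range.
    """
    selected = body[first:last + 1]  # +1 because HTTP ranges are inclusive
    return base64.b64encode(hashlib.md5(selected).digest()).decode("ascii")


if __name__ == "__main__":
    data = b"hello world"
    # Checksum of the first five bytes, as for "Range: bytes=0-4".
    print(range_content_md5(data, 0, 4))
```

A client holding a trusted copy of the data could compare this value with the trailing Content-MD5 to confirm that the range arrived intact.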

Validation Failures

When an SCSP read operation requests Content-MD5 hash validation and the hash does not match, the storage node is temporarily removed from the Gateway's connection pool because of how Swarm reports a hash validation failure.

Storing Content-MD5 Headers

Content-MD5 headers are stored with the object metadata and returned on all subsequent GET or HEAD requests.

Content-MD5 and Replication

When providing the gencontentmd5 query argument in a request on a replicated object, the following applies:

Content-MD5 and Erasure-Coding

When providing the gencontentmd5 query argument in a request on an erasure-coded object, the following applies:

Example Download Verification

You can verify the integrity of a download from Swarm by comparing the Content-MD5 published in an object’s metadata against the base64-encoded MD5 digest of the downloaded object. The example below does this with the ‘openssl’ utility:

$ curl -sI https://support.cloud.datacore.com/tools/swarm-support-tools.tgz
HTTP/1.1 200 OK
Date: Tue, 10 Jan 2023 19:12:40 GMT
Gateway-Request-Id: A0A1788FF937057D
Server: CAStor Cluster/15.0.1
Via: 1.1 support.cloud.datacore.com (Cloud Gateway SCSP/7.10.2)
Gateway-Protocol: scsp
CAStor-application: CaringoTechSupport
Castor-System-CID: 664727e752ca7a48092c73699e909578
Castor-System-Cluster: gem.tx.caringo.com
Castor-System-Created: Mon, 09 Jan 2023 18:25:17 GMT
Castor-System-Name: swarm-support-tools.tgz
Castor-System-Version: 1673288717.693
Content-Type: application/x-www-form-urlencoded
Last-Modified: Mon, 09 Jan 2023 18:25:17 GMT
X-Last-Modified-By-Meta: tools+swarm
X-Owner-Meta: tools+swarm
Manifest: ec
ETag: "b5dea5b4048f21a0f99880873fa64865"
Castor-System-Path: /support.cloud.datacore.com/tools/swarm-support-tools.tgz
Castor-System-Domain: support.cloud.datacore.com
Volume: 1dc47666d09cdb27bd59cbb731d046ca
Content-MD5: EF8xHMmzt3xNjpksfRLo+A==
Content-Length: 28398358

$ curl -O https://support.cloud.datacore.com/tools/swarm-support-tools.tgz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 27.0M  100 27.0M    0     0  2928k      0  0:00:09  0:00:09 --:--:-- 2826k

$ cat swarm-support-tools.tgz | openssl dgst -md5 -binary | openssl enc -base64
EF8xHMmzt3xNjpksfRLo+A==
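The same check can be scripted. The sketch below is one possible Python equivalent of the openssl pipeline above; it reads the file in chunks so that large downloads need not fit in memory. The function names are illustrative, and the expected value is assumed to have been copied from the Content-MD5 response header.

```python
import base64
import hashlib


def file_content_md5(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the base64-encoded MD5 digest of the file at path."""
    md5 = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns b"" at end of file.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
    return base64.b64encode(md5.digest()).decode("ascii")


def matches_content_md5(path: str, header_value: str) -> bool:
    """True if the file's digest equals the given Content-MD5 header value."""
    return file_content_md5(path) == header_value.strip()


if __name__ == "__main__":
    # File name and header value taken from the transcript above.
    ok = matches_content_md5("swarm-support-tools.tgz", "EF8xHMmzt3xNjpksfRLo+A==")
    print("match" if ok else "MISMATCH")
```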