Filtering Headers

If Swarm delivers content directly over the Internet, a business requirement may exist to filter the optional HTTP response headers sent for GET and HEAD requests. (v9.5)

Caution

Indiscriminate filtering of response headers, which is cluster-wide in scope, can break client applications. Do not filter headers if the client applications are object storage aware and are using SCSP or S3 (Content Gateway) to interact with Storage.

Filtering metadata headers on objects can cause problems for other applications that know how to work with object metadata, such as Content UI, SwarmFS, and FileFly.

Because header filtering adds processing to Swarm's responses, best practice is to enable it only for a specific content delivery need:

  • Bandwidth needs to be conserved and as many bytes as possible need to be eliminated when serving content.

  • Enhanced security is needed and as little as possible about content and context needs to be revealed.

  • The target clients are web browsers instead of object storage-aware applications.

Important

Regardless of filtering, do not expose Swarm Storage directly on the Internet. Do not allow arbitrary requests, especially by unauthorized users. Some form of HTTP request restriction should always be in place to prevent abuse by untrusted clients.

Header filtering is a Storage feature that takes effect dynamically, without a cluster restart. Two filtering approaches are available:

  • Whitelist - list which non-required headers to retain, if any

  • Blacklist - list which non-required headers to remove, preserving all others

The lists are case-insensitive, and they can include system headers (such as "Castor-System-Owner").

Essential Headers

The following essential metadata headers are unaffected by Blacklisting and are always included when they are present on an object:

Allow, Authentication-Info, Authorization, Cache-Control, Connection, Content-Length, Content-MD5, Content-Range, Content-Type, Date, Expires, Keep-Alive, Location, Server, Trailer, Transfer-Encoding
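The way the two list types interact with the essential headers can be sketched locally. The shell function below is an illustration only, not Swarm's implementation: filter_headers and ESSENTIAL are hypothetical names, and the sketch assumes that essential headers are also retained in whitelist mode. It does not contact a cluster and does not add any marker header to the result.

```shell
#!/bin/sh
# Illustration only: mimics the documented filtering semantics on local text.
# filter_headers and ESSENTIAL are hypothetical names, not part of Swarm.

# Essential headers from the list above, lowercased for case-insensitive matching.
ESSENTIAL="allow authentication-info authorization cache-control connection content-length content-md5 content-range content-type date expires keep-alive location server trailer transfer-encoding"

# Usage: filter_headers MODE "Header1,Header2" < header-lines
# MODE is none, whitelist, or blacklist.
filter_headers() {
  awk -v mode="$1" -v list="$2" -v essential="$ESSENTIAL" '
    BEGIN {
      # Build a lowercase lookup of the listed header names.
      n = split(tolower(list), a, ",")
      for (i = 1; i <= n; i++) {
        gsub(/^ +| +$/, "", a[i])
        if (a[i] != "") listed[a[i]] = 1
      }
      m = split(essential, e, " ")
      for (i = 1; i <= m; i++) keep[e[i]] = 1
    }
    {
      # Header name is everything before the first colon.
      name = tolower($0)
      sub(/:.*/, "", name)
      if (name in keep) { print; next }          # essential: always kept
      if (mode == "none") print                  # filtering disabled
      else if (mode == "whitelist" && (name in listed)) print
      else if (mode == "blacklist" && !(name in listed)) print
    }'
}
```

For example, piping a few header lines through `filter_headers whitelist "etag"` keeps Content-Length (essential) and Etag (listed) and drops everything else, regardless of the case used in the list.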

Settings for Filtering

Filtering is disabled by default. These SCSP settings control which optional response headers the cluster returns:

scsp.filterResponseHeaders

  Default: none

  Which method to use to filter HTTP response headers. The corresponding whitelist or blacklist setting must be defined before enabling that method. Valid values: none, whitelist, blacklist.

  SNMP: filterResponseHeaders

scsp.filterResponseBlacklist

  Default: []

  Which headers to remove from HTTP GET and HEAD responses. List is comma-separated and case-insensitive.

  SNMP: filterResponseBlacklist

scsp.filterResponseWhitelist

  Default: []

  Which headers to retain in HTTP GET and HEAD responses, removing all others. List is comma-separated and case-insensitive. Leave the brackets empty to have Swarm strip out all non-essential headers.

  SNMP: filterResponseWhitelist

Set these values using the Storage UI, or use SNMP or cURL:

curl -i "http://$SCSP_HOST:91/api/storage/clusters/<cluster-name>/settings/scsp.filterResponseWhitelist" -XPUT -d '{"value": ["key1","key2"]}'

curl -i "http://$SCSP_HOST:91/api/storage/clusters/<cluster-name>/settings/scsp.filterResponseHeaders" -XPUT -d '{"value": "whitelist"}'
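Because the list must be defined before the method is switched on, the two calls can be wrapped so they always run in that order. This is a minimal sketch under stated assumptions: json_list, settings_url, and enable_whitelist are hypothetical helper names, CLUSTER is an assumed variable standing in for the cluster name, and CURL is an assumed override that lets the commands be previewed instead of sent. Header names are assumed to contain no quotes or commas.

```shell
#!/bin/sh
# Hypothetical helpers; only the URLs and payload shapes come from the
# curl commands above.

# Turn "Etag, Last-Modified" into ["Etag","Last-Modified"]; empty input gives [].
json_list() {
  printf '%s\n' "$1" | awk -F, '{
    out = "["
    for (i = 1; i <= NF; i++) {
      gsub(/^ +| +$/, "", $i)
      if ($i == "") continue
      out = out (out == "[" ? "" : ",") "\"" $i "\""
    }
    print out "]"
  }'
}

# Build the settings URL from SCSP_HOST and CLUSTER (assumed variables).
settings_url() {
  printf 'http://%s:91/api/storage/clusters/%s/settings/%s' \
    "$SCSP_HOST" "$CLUSTER" "$1"
}

# Usage: enable_whitelist "Header1,Header2"
enable_whitelist() {
  body=$(printf '{"value": %s}' "$(json_list "$1")")
  # Define the whitelist first, then select the filtering method.
  ${CURL:-curl} -i -XPUT -d "$body" "$(settings_url scsp.filterResponseWhitelist)"
  ${CURL:-curl} -i -XPUT -d '{"value": "whitelist"}' "$(settings_url scsp.filterResponseHeaders)"
}
```

Setting CURL=echo before calling `enable_whitelist "Etag,Last-Modified"` prints the two commands rather than sending them, which is a convenient way to check the payload before changing a cluster-wide setting.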

Sample Output

The following examples show how responses appear with and without filtering applied. Swarm adds the Castor-System-Headers-Filtered: True header to every response that has been filtered by a whitelist or blacklist.
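A client or monitoring script can use that marker header to tell whether a given response was filtered. The following is a minimal sketch: was_filtered is a hypothetical helper name, and it only inspects header text supplied on standard input.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the response headers
# read from stdin contain the Castor-System-Headers-Filtered: True marker.
was_filtered() {
  # Strip carriage returns from HTTP line endings, then match the header
  # name case-insensitively.
  tr -d '\r' |
    grep -qi '^Castor-System-Headers-Filtered:[[:space:]]*True[[:space:]]*$'
}

# Example use against a node (host and object name are placeholders):
#   curl -sI "http://$SCSP_HOST/<uuid>" | was_filtered && echo "filtered"
```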

Target of GET: Missing Object

Headers not Filtered:

$ curl -i "172.16.15.180/11111111111111111111111111111111"
HTTP/1.1 404 Not Found
Castor-System-Error-Token: NotFound3
Castor-System-Error-Text: Existing object not found in cluster.
Castor-System-Error-Code: 404
Castor-System-Cluster: CAStorCluster
Content-Length: 83
Content-Type: text/html
Date: Fri, 30 Nov 2018 16:27:36 GMT
Server: CAStor Cluster/9.6.a
Allow: HEAD, COPY, GET, SEND, PATCH, PUT, RELEASE, POST,
    HOLD, GEN, APPEND, DELETE
Keep-Alive: timeout=14400

<html><body><h2>Swarm Storage Error</h2><br>
     Requested stream was not found</body></html>

Headers Filtered:

$ curl -i "172.16.15.179/11111111111111111111111111111111"
HTTP/1.1 404 Not Found
Castor-System-Headers-Filtered: True
Content-Length: 83
Content-Type: text/html
Date: Fri, 30 Nov 2018 16:29:22 GMT
Server: CAStor Cluster/9.6.singleip
Allow: HEAD, COPY, GET, SEND, PATCH, PUT, RELEASE, POST,
    HOLD, GEN, APPEND, DELETE
Keep-Alive: timeout=14400

<html><body><h2>Swarm Storage Error</h2><br>
     Requested stream was not found</body></html>

Target of GET: Immutable Object

Headers not Filtered:

$ curl -i "172.16.15.178/7b9a25bcd48afac3156a89212859c62c"
HTTP/1.1 200 OK
Castor-System-Cluster: CAStorCluster
Castor-System-Created: Fri, 30 Nov 2018 16:31:04 GMT
Content-Length: 0
Last-Modified: Fri, 30 Nov 2018 16:31:04 GMT
Etag: "7b9a25bcd48afac3156a89212859c62c"
Volume: b9ec90023e27941147b3ce6fb2ed54bd
Date: Fri, 30 Nov 2018 16:32:13 GMT
Server: CAStor Cluster/9.6.a
Keep-Alive: timeout=14400

Headers Filtered:

$ curl -i "172.16.15.179/7b9a25bcd48afac3156a89212859c62c"
HTTP/1.1 200 OK
Content-Length: 0
Castor-System-Headers-Filtered: True
Date: Fri, 30 Nov 2018 16:31:25 GMT
Server: CAStor Cluster/9.6.singleip
Keep-Alive: timeout=14400

Target of GET: Named Object

Headers not Filtered:

$ curl -i "172.16.15.180/bucket/stream?domain=domain" -I
HTTP/1.1 200 OK
Castor-System-CID: 84c1cbf7d33aec1feec4d4dd11225b87
Castor-System-Cluster: CAStorCluster
Castor-System-Created: Fri, 30 Nov 2018 16:33:44 GMT
Castor-System-Name: stream
Castor-System-Version: 1543595624.202
Content-Length: 0
Last-Modified: Fri, 30 Nov 2018 16:33:44 GMT
Etag: "46ce386cdc13828d7d8d68ee20aac58d"
Castor-System-Path: /domain/bucket/stream
Castor-System-Domain: domain
Volume: 0a9a7ed07b5f86520b096fb0ef824846
Date: Fri, 30 Nov 2018 16:34:21 GMT
Server: CAStor Cluster/9.6.a
Keep-Alive: timeout=14400

Headers Filtered:

$ curl -i "172.16.15.179/bucket/s?domain=x"
HTTP/1.1 200 OK
Content-Length: 0
Castor-System-Headers-Filtered: True
Date: Fri, 30 Nov 2018 16:33:48 GMT
Server: CAStor Cluster/9.6.singleip
Keep-Alive: timeout=14400





© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.