This section provides information specific to running Swarm Storage with Gateway. Install and configure the Swarm storage cluster (storage nodes running on dedicated hardware) before proceeding.

Network Placement

When deployed with Gateway, the storage nodes should be placed on a network subnet not directly accessible to client applications. All user communications with the storage cluster must go through the Gateway.

Caution

If users are allowed to communicate directly with the storage cluster nodes, they may bypass the access security, the business rules for content metadata, and the audit logging that the Gateway performs. They may also render content in the cluster unusable to the Gateway.

Only allow direct access to the storage cluster nodes under highly controlled circumstances, such as administrator-only operations or trusted applications.

Domain Management

The Swarm cluster provides for logical separation of content among multiple tenants through the use of storage domain names. Gateway has the following requirements beyond those for a baseline storage deployment and client usage.

An administrative domain must be created in the storage cluster.
Storage domains must adhere to IANA naming standards (valid DNS names).
Client applications should specify a storage domain in every request (if not, the request goes to the default domain, with enforceTenancy=True).
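
The naming requirement can be checked before a domain is created. The following is a minimal sketch, assuming that "valid DNS name" means the usual hostname rules (labels of 1 to 63 letters, digits, or hyphens, with no leading or trailing hyphen, and at most 253 characters overall). The helper name is_valid_storage_domain is hypothetical and is not part of Swarm or Gateway.

```python
import re

# One DNS label: 1-63 letters, digits, or hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def is_valid_storage_domain(name: str) -> bool:
    """Return True if name follows common DNS hostname rules (hypothetical helper)."""
    if not name or len(name) > 253:
        return False
    # Every dot-separated label must match; an empty label (e.g. "a..b") fails.
    return all(_LABEL.match(label) for label in name.split("."))


if __name__ == "__main__":
    for candidate in ("oxygen.cloud.example.com", "bad_name.cloud.example.com"):
        print(candidate, is_valid_storage_domain(candidate))
```

Underscores, empty labels, and labels with a leading or trailing hyphen are rejected by this check.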

The storage domain name for an operation is specified by the client application according to the following precedence from highest to lowest:

SCSP domain=X query argument
HTTP X-Forwarded-Host header
HTTP Host header
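
The precedence above can be illustrated with a short sketch. It only models the documented order (query argument, then X-Forwarded-Host, then Host) and is not the Gateway's actual implementation. The function name resolve_storage_domain and the port-stripping and comma handling are assumptions made for the example.

```python
from typing import Mapping, Optional
from urllib.parse import parse_qs, urlsplit


def resolve_storage_domain(url: str, headers: Mapping[str, str]) -> Optional[str]:
    """Pick the storage domain for a request using the documented precedence."""
    # 1. SCSP domain=X query argument (highest precedence).
    query = parse_qs(urlsplit(url).query)
    if query.get("domain"):
        return query["domain"][0]

    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {key.lower(): value for key, value in headers.items()}

    # 2. X-Forwarded-Host header, then 3. Host header.
    for header in ("x-forwarded-host", "host"):
        value = lowered.get(header)
        if value:
            # Assumption: use the first listed host and drop any ":port" suffix.
            return value.split(",")[0].strip().split(":")[0]

    # Nothing specified: the request would fall through to the default domain.
    return None


if __name__ == "__main__":
    print(resolve_storage_domain(
        "/bucket/object?domain=oxygen.cloud.example.com",
        {"Host": "hydrogen2.cloud.example.com"},
    ))
```

In the usage example, the query argument wins over the Host header, so the request is treated as belonging to oxygen.cloud.example.com.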

Storage domains in Swarm must resolve to at least one IP address ("A" record) so that client applications can use the Host header to identify the storage domain, as most HTTP/1.1 libraries do. The resolved IP address should belong to a Gateway or, if applicable, to another front-end network appliance such as a load balancer. If there are multiple Gateway servers, DNS round-robin across their IP addresses is a valid configuration.

This example BIND 9 zone file uses a wildcard record to point all storage domains within the cloud.example.com parent DNS domain to the IP address 10.100.100.100.

$TTL 600
@ IN SOA cloud.example.com. dnsadmin.example.com. (
    2016070201 ; Serial number
    4H     ; Refresh every 4 hours
    1H     ; Retry every hour
    2W     ; Expire after 2 weeks
    300 )  ; nxdomain negative cache time of 5 minutes
@ IN NS ns1.example.com.
* IN A 10.100.100.100

In the example zone file, 10.100.100.100 is the IP address used by client applications to communicate with the Gateway or a front-end load balancer. The names hydrogen2.cloud.example.com and oxygen.cloud.example.com both resolve to the same IP address.
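
After the zone is published, the wildcard can be verified from a client machine. The following is a minimal sketch that uses the system resolver through Python's standard socket module. The function names are hypothetical, and the resolver argument exists only so that the check can be exercised without a live DNS server.

```python
import socket
from typing import Callable, Dict, Iterable, List


def resolve_ipv4(name: str) -> List[str]:
    """Return the IPv4 addresses the system resolver reports for name."""
    try:
        _hostname, _aliases, addresses = socket.gethostbyname_ex(name)
    except OSError:  # covers socket.gaierror and socket.herror
        return []
    return addresses


def check_domains(
    names: Iterable[str],
    expected_ip: str,
    resolver: Callable[[str], List[str]] = resolve_ipv4,
) -> Dict[str, bool]:
    """Map each storage domain name to whether it resolves to expected_ip."""
    return {name: expected_ip in resolver(name) for name in names}


if __name__ == "__main__":
    # Names and address taken from the example zone file above.
    names = ["hydrogen2.cloud.example.com", "oxygen.cloud.example.com"]
    for name, ok in check_domains(names, "10.100.100.100").items():
        print(f"{name}: {'ok' if ok else 'does not resolve to the expected address'}")
```

With DNS round-robin across several Gateway servers, the check passes as long as the expected address is among those returned.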

Elasticsearch Servers

When using the S3 storage protocol, the metadata search service must be accessible to the Gateway servers.

As with the storage nodes, the metadata search servers in a Gateway deployment are typically placed on a network subnet that is not directly accessible to client applications. End users have no supported API calls that go directly to the metadata search service.

Listing Consistency

Search feeds show eventual consistency as content changes, but enabling the [s3] option enhancedListingConsistency (see https://perifery.atlassian.net/wiki/spaces/public/pages/2443810201) improves the search-after-create response to client applications using the Gateway.

Configuration Requirements

Use these Swarm configuration settings and follow these operational changes when using Swarm Storage with Gateway. The settings belong in the Swarm configuration file, whose location depends on the deployment:

CSN: This is the cluster-wide file: /var/opt/caringo/netboot/content/cluster.cfg
Platform Server: This is the cluster-wide file used to deploy, which is located by default here: /etc/caringo/cluster.cfg
No Platform Server: This is the node-specific configuration file: node.cfg

Caution

Failure to use these settings and operational changes can prevent Gateway from working properly with the storage cluster.

Requirement	Description
Optimize GETs	With Swarm 12.0 and higher, a setting can be added to improve performance through Gateway. Enable `scsp.enableVolumeRedirects` to permit Gateway to redirect GET requests to volume processes. These redirects increase efficiency, especially with reading small objects. `scsp.enableVolumeRedirects = True`
Enable an EC encoding	S3 multipart (large file) writes fail if erasure coding is not configured; define an ecEncoding if using S3. `policy.ecEncoding = {k:p}` See https://perifery.atlassian.net/wiki/spaces/public/pages/2443812150
Clear legacy settings	Unless backwards compatibility requires otherwise (the cluster uses untenanted objects and does not need S3), enable tenancy for unnamed objects, which verifies that every object is written to a domain (see https://perifery.atlassian.net/wiki/spaces/public/pages/2443821769). If the tenancy setting (enforceTenancy) was set to `False`, set it to `True` and reboot the cluster.
Storage Domain Management	Create and manage storage domains only through the Content UI or programmatically through the Gateway's management API. If storage domain management displays in the legacy Admin Console (port 90), which Content Gateway does not support, the cluster configuration contains `security.noauth=False`; set it to `True` and reboot the cluster. Troubleshooting: If the Content UI reports "Page Not Found: The original bucket to which this collection refers cannot be found or has been replaced", the domain was likely created by the legacy Admin Console (port 90) and contains the legacy `Castor-Authorization` header. Contact DataCore Support for help correcting the domain.
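
The settings in the table can be checked against a configuration file before the cluster is rebooted. The following is a minimal sketch, assuming the file uses simple `key = value` lines with `#` comments. The function names are hypothetical, and the script is not a DataCore tool. It checks only the three settings named explicitly in the table.

```python
from typing import Dict, List


def parse_settings(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines, ignoring blank lines and '#' comments (assumed format)."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def find_problems(text: str) -> List[str]:
    """Return a list of deviations from the Gateway requirements in the table."""
    settings = parse_settings(text)
    problems: List[str] = []

    # Optimize GETs: volume redirects should be enabled.
    if settings.get("scsp.enableVolumeRedirects", "").lower() != "true":
        problems.append("scsp.enableVolumeRedirects should be True")

    # Enable an EC encoding: S3 multipart writes need some ecEncoding defined.
    if not settings.get("policy.ecEncoding"):
        problems.append("policy.ecEncoding is not defined")

    # Storage Domain Management: security.noauth must not be False.
    if settings.get("security.noauth", "").lower() == "false":
        problems.append("security.noauth should be True")

    return problems


if __name__ == "__main__":
    # Path for a Platform Server deployment, as listed earlier in this section.
    with open("/etc/caringo/cluster.cfg", encoding="utf-8") as handle:
        for problem in find_problems(handle.read()):
            print(problem)
```

An empty result means only that these three settings look as expected; it does not confirm the rest of the cluster configuration.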
