Swarm Storage 10.1 Release

New Features

  • Swarm 10 Performance — Storage 10.1 improves the performance of both writes and erasure-coded object reads under Swarm 10's density-friendly single-IP architecture. The gains come from optimizations in how Swarm nodes write to volumes under the new design. (SWAR-8357)

  • Memory Handling — Swarm has improved memory handling, especially with bursts and high loads, and 503 Service Unavailable responses are less likely. (SWAR-8335)

  • Hardware Diagnostics — This release includes a preview of the Prometheus Node Exporter for monitoring and diagnostics on the machines in the Swarm cluster. Prometheus is an open-source systems monitoring and alerting toolkit that makes system statistics viewable even under failure conditions. It scrapes metrics from instrumented jobs and runs rules over this data to record aggregated time series or to generate alerts. Grafana and other API consumers can then visualize the collected data. The new setting metrics.enableNodeExporter enables Swarm to run the Prometheus node exporter on port 9100. As a preview, the settings and implementation are subject to change; for more about this preview, contact DataCore Support. (SWAR-8170)

  • Bulk Reformatting — Retiring volumes to implement encryption at rest requires reformatting and remounting the volumes. Contact DataCore Support for a utility to streamline this process. (SWAR-8088)

  • ES Cluster Configuration — The automation script for configuring Elasticsearch for Swarm generates the complete set of unique configuration files for each node in the Elasticsearch cluster. (SWAR-8028)
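
As an illustration of the Hardware Diagnostics preview above, the following minimal Python sketch reads the metrics a node exporter publishes and turns them into a dictionary. It assumes the exporter serves the standard Prometheus text format at the /metrics path on port 9100 and that sample lines carry no trailing timestamp. The function names and the example host are hypothetical, and because the feature is a preview, the details may change.

```python
from urllib.request import urlopen


def parse_metrics(text):
    """Parse Prometheus text-format samples into {series: value}.

    Assumption: each sample line ends with its value (no timestamp field).
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The series (name plus optional {labels}) is everything before
        # the last whitespace-separated field; that last field is the value.
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples


def scrape_node(host, port=9100, timeout=5.0):
    """Fetch and parse metrics from one node (hypothetical helper)."""
    url = f"http://{host}:{port}/metrics"
    with urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # "192.0.2.10" is a placeholder address, not a real Swarm node.
    for series, value in sorted(scrape_node("192.0.2.10").items()):
        print(series, value)
```

In practice a Prometheus server would do the scraping; a script like this is only useful for a quick spot check of a single node.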

Additional Changes

These items are other changes and improvements, including those arising from testing and user feedback.

  • OSS Updates — Storage 10.0 includes updates to third-party components. See Third-Party Components for 10.0 for the complete listing of packages and versions.

  • SCSP Errors — Several new error tokens and error improvements are added for this release. See Error Response Headers.

  • Blocked Feeds

    • Swarm can now detect the disappearance of the Elasticsearch index associated with a feed and mark the feed as Blocked; such feeds may need to be deleted and recreated. (SWAR-6885)

    • Blocked feeds are retried every 20 minutes, but changing the definition for a blocked feed now triggers an immediate attempt with the new definition, which may clear the blockage. (SWAR-8232)

    • The handling and reporting of feeds blocked due to internal software issues is improved. (SWAR-8267)

  • Fixed

    • The correct behavior for the network.ntpControlKey setting was restored. (SWAR-8371)

    • After an upgrade to version 10.0, the Chassis Details in the Swarm UI did not update the node listings correctly for a period of time. (SWAR-8352)

    • When SwarmFS was not in use, clicking on the NFS settings link in the Storage UI resulted in an error. (SWAR-8350)

    • The correct behavior for the disk light toggle in the Swarm UI was restored. (SWAR-8336)

    • The SNMP MIB entries for the largest stream (volLargestStreamMB and volLargestStreamUUID) were not populated. (SWAR-8331)

    • Rapid updates of objects written with replicate=immediate could result in some replicas temporarily not being found. (SWAR-8249)

    • Deletes of unnamed objects did not update the Elasticsearch index, leaving a stale entry. (SWAR-8218)

    • Deletes not processed by a feed within two weeks were not propagated to the feed's destination (Elasticsearch or the replication target). (SWAR-7950)

Upgrade Impacts

These items are changes in product function that require operational or development changes for integrated applications.

Impacts for 10.1

  • Upgrading Elasticsearch — Continue to use Elasticsearch 2.3.3 with Storage 10.1 until able to move to 5.6 (see Migrating from Older Elasticsearch). Support for ES 2.3.3 ends in a future release. Complete the upgrade to Elasticsearch 5.6 before upgrading to Gateway 6.0.

  • Configuration Settings — Run the Storage Settings Checker before any Swarm 10 upgrade to identify configuration issues.

    • metrics.enableNodeExporter=true enables Swarm to run the Prometheus node exporter on port 9100. (SWAR-8170)

  • IP address update delay — When upgrading from Swarm 9 to the new architecture of Swarm 10, "ghosts" of previously used IP addresses may appear in the Storage UI; these resolve within 4 days. (SWAR-8351)

  • Update MIBs on CSN — Before upgrading to Storage 10.x, the MIBs on the CSN must be updated. From the Swarm Support tools bundle, run the platform-update-mibs.sh script. (CSN-1872)

For Swarm 9 impacts, see Swarm Storage 9 Releases.
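
After setting metrics.enableNodeExporter=true and restarting, one simple way to confirm that a node is listening on port 9100 is a plain TCP connection test. The sketch below is only an illustration: the function name is hypothetical, and a successful connection shows that something accepts connections on the port, not that the exporter is healthy.

```python
import socket


def exporter_port_open(host, port=9100, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError on refusal, timeout, or
        # an unreachable host; the socket is closed on leaving the block.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder addresses; substitute the cluster's node IPs.
    for node in ["192.0.2.10", "192.0.2.11"]:
        state = "open" if exporter_port_open(node) else "closed"
        print(f"{node}:9100 {state}")
```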

Watch Items and Known Issues

The following operational limitations and watch items exist in this release.

  • During a rolling reboot of a small cluster, erroneous CRITICAL errors may appear on the console, claiming EC objects have insufficient protection. These errors may be disregarded. (SWAR-8421)

  • The rate at which nodes retire is slower in Swarm 10.x than in 9.6. (SWAR-8386)

  • When restarting a cluster of UEFI-booted (versus legacy BIOS) virtual machines, the chassis shut down but do not come back up. (SWAR-8054)

  • If the Elasticsearch cluster is wiped, the Storage UI shows no NFS config. Contact DataCore Support for help repopulating the SwarmFS config information. (SWAR-8007)

  • If a bucket is deleted, any incomplete multipart upload into the bucket leaves the parts (unnamed streams) in the domain. To find and delete them, use the s3cmd utility (search the Support site for "s3cmd" for guidance). (SWAR-7690)

  • Dell DX hardware has less chassis-level monitoring information available using SNMP. If this is a concern, contact DataCore Support. (SWAR-7606)

  • Logs showed the error "FEEDS WARNING: calcFeedInfo(etag=xxx) couldn't find domain xxx, which is needed for a domains-specific replication feed". The root cause is fixed; if such warnings are received, contact DataCore Support so the issue can be resolved. (SWAR-7556)

  • With multipath-enabled hardware, the Swarm console Disk Volume Menu may erroneously show too many disks, having multiplied the actual disks in use by the number of possible paths to them. (SWAR-7248)

Upgrading from 9.x

Important

Do not begin the upgrade until the following are completed:

  1. Plan upgrade impacts — Review and plan for this release's upgrade impacts (above) and the impacts for each release since the currently running version. For Swarm 9 impacts, see Swarm Storage 9 Releases.

  2. Finish volume retires — Do not start any elective volume retirements during the upgrade. Wait until the upgrade is complete before initiating any retires.

  3. Run checker script — Swarm 10 includes a migration checker script to run before upgrading from Swarm 9; it reports configuration setting issues and deprecations to be addressed. (SWAR-8230) See Storage Settings Checker.

If upgrading from Swarm 8.x or earlier, contact DataCore Support for guidance.

  1. Download the correct bundle for the site. Swarm distributions bundle together the core components needed for implementation and later updates; the latest versions are available in the Downloads section on the DataCore Support Portal. 
    Two bundles are available:

    • Platform CSN 8.3 Full Install or Update (for CSN environments) — Flat structure for scripted install/update on a CSN (see CSN Upgrades).

    • Swarm 10 Software Bundle (Platform 9.x and custom environments) — Contains complete updates of all core components, organized hierarchically by component.

  2. Download the comprehensive PDF of Swarm Documentation matching the bundle distribution date, or use the online HTML version from the Documentation Archive.

  3. Choose the type of upgrade. Swarm supports rolling upgrades (a single cluster running mixed versions during the upgrade process) and requires no data conversion unless noted for a release. Upgrades can be performed without scheduling an outage or bringing down the cluster. Restart the nodes one at a time with the new version and the cluster continues serving applications during the upgrade process.

    • Rolling upgrade: Reboot one node at a time and wait for its status to show as "OK" in the UI before rebooting the next node.

    • Alternative: Reboot the entire cluster at once after the software on all USB flash drives or the centralized configuration location is updated.

  4. Choose whether to upgrade Elasticsearch 2.3.3 at this time. 

    • To upgrade to Elasticsearch 5.6 with an existing cluster, reindex Search data and migrate any Metrics data to be kept. See Migrating from Older Elasticsearch for details. (SWAR-7395) 

  5. Note these installation issues:

    • The elasticsearch-curator package may show an error during an upgrade, which is a known curator issue. (SWAR-7439) Workaround: Reinstall the curator:

      yum reinstall elasticsearch-curator

    • Do not install the Swarm Search RPM before installing Java. If Gateway startup fails with "Caringo script plugin is missing from indexer nodes", uninstall and reinstall the Swarm Search RPM. (SWAR-7688)

    • During a rolling upgrade from 9.0.x–9.2.x, intermittent "WriterMissingRemoteMD5 error token" errors may be seen from a client write operation through the Gateway or on writes with gencontentmd5 (or the equivalent). To prevent this, set autoRepOnWrite=0 during the upgrade and restore autoRepOnWrite=1 after it completes. (SWAR-7756)

  6. Review the Application and Configuration Guidance.
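
The rolling upgrade described in step 3 (reboot one node, wait for "OK", then move on) can be sketched as a small loop. This is a minimal illustration, not a supported tool: the reboot and get_status callables are hypothetical placeholders for however a site triggers a reboot and reads node status, and the polling interval and timeout are arbitrary defaults.

```python
import time


def rolling_upgrade(nodes, reboot, get_status, poll_seconds=30.0,
                    timeout_seconds=1800.0, sleep=time.sleep):
    """Reboot nodes one at a time, waiting for "OK" before the next.

    reboot(node) and get_status(node) are caller-supplied (hypothetical);
    get_status should reflect the node's state after the reboot began.
    """
    for node in nodes:
        reboot(node)
        waited = 0.0
        # Poll until the node reports "OK"; give up after the timeout
        # so a node that never returns halts the upgrade.
        while get_status(node) != "OK":
            if waited >= timeout_seconds:
                raise TimeoutError(f"{node} did not return to OK")
            sleep(poll_seconds)
            waited += poll_seconds
    return list(nodes)
```

Raising on timeout stops the loop before the next node is rebooted, which keeps at most one node down at a time. A real procedure would also need to confirm that the node actually went down and came back on the new version, since a status read taken immediately after the reboot request could still show "OK".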

Note

Contact DataCore Support for new installs of Platform Server and for optional Swarm client components, such as SwarmFS Implementation, which have separate distributions.

© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.