
...

Info

Elasticsearch was previously used to store historical metrics, but that function moved to Prometheus starting with Swarm 14. Gateway Content Metering stores csmeter indices in Elasticsearch, but that does not affect Elasticsearch hardware requirements as much as a Swarm Search Feed does.

...

  • Provision the machines with at least 4 CPU cores and 64 GB of memory. When choosing between faster processors and more cores, choose more cores.

  • Use solid-state drives (SSDs), preferably local; not SAN, and never NFS. This is critical for S3, especially for rapid small-object writes and for listing buckets with millions of objects.

  • Let Support know if you are using hard disks, which do not handle concurrent I/O as well as SSDs:

    • Select high-performance server disks.

    • Use RAID 0 with a writeback cache.

    • Set index.merge.scheduler.max_thread_count to 1 to prevent too many merges from running at once.

      Code Block
      languagebash
      curl -X PUT <ES_SERVER>:9200/<SWARM_SEARCH_INDEX>/_settings \
      	-H 'Content-Type: application/json' \
      	-d '{"index": {"merge.scheduler.max_thread_count": 1}}'
  • As with the storage cluster, choose similar, moderate configurations for balanced resource usage.

...

RAM is key for Elasticsearch performance. Use these guidelines as a basis for capacity planning:

  • 64 GB of RAM per machine is optimal (recommended by Elasticsearch).

  • Dedicate half of the total RAM to the Java Virtual Machine (JVM) running Elasticsearch, but do not exceed 31 GB for best performance. Heap assignment is automatic with Elasticsearch 7.17.

  • Disable swapping of the Elasticsearch memory image.
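The heap and swap guidelines above map to configuration like the following sketch, assuming a packaged Elasticsearch 7.17 install with the default config location (the path and values are illustrative):

```yaml
# elasticsearch.yml -- lock the JVM memory in RAM so it is never swapped out
bootstrap.memory_lock: true
```

On 7.17 the heap is auto-sized, so no jvm.options change is needed; to pin it explicitly instead, set `-Xms` and `-Xmx` to the same value, never above 31 GB, in a file under jvm.options.d. Note that `bootstrap.memory_lock` also requires the `memlock` ulimit to be unlimited for the elasticsearch user.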

...

The storage on the Elasticsearch servers persists the shards of the Swarm Search indices. Follow these guidelines for capacity planning:

  • Baseline metadata to support listing: 150 GB per 200 million objects

  • Full metadata to support ad-hoc searching: 300 GB per 200 million objects

  • Custom metadata: if indexing a large amount of custom metadata, allocate additional storage in proportion

These are unique objects, not replicas; how many Swarm replicas an object has is irrelevant to the Elasticsearch servers. There is only one metadata entry for the object, no matter how many replicas of it exist in the cluster.
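As a worked example of the guidelines above, sizing scales linearly with unique object count. The 500 million object figure here is hypothetical:

```shell
# Sizing sketch using the guideline of 300 GB per 200 million objects
# (full metadata for ad-hoc searching). 500M objects is an example count.
awk 'BEGIN {
  objects     = 500000000    # unique objects, not replicas
  gb_per_unit = 300          # GB per 200 million objects (full metadata)
  printf "%.0f GB\n", objects / 200000000 * gb_per_unit
}'
# prints: 750 GB
```

For baseline (listing-only) metadata, substitute 150 for 300 in the same calculation.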

...

  • Use SSDs: SSDs are critical for performance. With SSDs, verify the OS I/O scheduler is configured correctly.

  • Use RAID 0: Striped RAID increases disk I/O, at the expense of potential failure if a disk dies. Do not use mirrored or parity RAID, because replicas provide this functionality.

  • Do not use remote-mounted storage, such as NFS or SMB/CIFS; the latency negatively impacts performance.

  • Avoid virtualized storage, such as a SAN or EBS (Amazon Elastic Block Store). Even when SSD-backed, it is often slower than local instance storage and it conflicts with the purpose of replicas and sharding.
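One way to check the I/O scheduler mentioned above, assuming a Linux host where the SSD appears as `sda` (the device name is an example; on multi-queue kernels `none` is typically preferred for SSDs, `noop` on older kernels):

```shell
# Show the available schedulers for the device; the bracketed entry is active
cat /sys/block/sda/queue/scheduler

# Switch the SSD to the 'none' scheduler (use 'noop' on older kernels)
echo none | sudo tee /sys/block/sda/queue/scheduler
```

This change does not persist across reboots; make it permanent with a udev rule or kernel boot parameter appropriate to the distribution.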

...

Elasticsearch clustering is designed to mitigate the impact of hardware and power failures, so long delays while the search data is rebuilt are not experienced. How much to invest in and optimize this hardware depends on how important metadata search and querying are to the organization and how long these features can be offline while Elasticsearch rebuilds data.

These are the principles for making a configuration more disaster-proof:

  • Do not rely on a single Elasticsearch server. Doing so introduces vulnerabilities and potential disruption of search capabilities, and it risks having too little capacity to support all search needs.

  • For power failure protection, deploy enough Elasticsearch servers to survive multiple server failures and distribute them across different power sources.

  • If the cluster is divided into subclusters to match the power groups, set up Elasticsearch with multiple nodes spread equally among the subclusters. This strategy improves the survivability of a power group failure.
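One way to implement the subcluster strategy above is Elasticsearch shard allocation awareness, which keeps a primary shard and its replica out of the same group. A sketch, assuming each node is tagged with a custom attribute; the attribute name `power_group` and its values are illustrative:

```yaml
# elasticsearch.yml on each node -- tag the node with its power group
node.attr.power_group: groupA    # e.g. groupA on one power source, groupB on the other

# elasticsearch.yml on all nodes -- spread shard copies across power groups
cluster.routing.allocation.awareness.attributes: power_group
```

With this setting, losing an entire power group leaves at least one copy of each shard on the surviving group.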