Elasticsearch requires configuration and settings file changes made consistently across the Elasticsearch cluster.

On each Elasticsearch node, run the provided configuration script (/usr/share/caringo-elasticsearch-search/bin/configure_elasticsearch_with_swarm_search.py), which automates the configuration changes described below.
Resume the installation, to turn on the service if no customizations below are needed: Installing Elasticsearch
Proceed as follows if any settings need to be customized, such as to change Elasticsearch's path.data (data directory). Edit the configuration file directly and update log files accordingly.

Customizing Elasticsearch

The paths provided are relative to the Elasticsearch installation directory, which is assumed to be the working directory.

Caution

Errors in adding and completing these settings can prevent the Elasticsearch service from working properly.
Adjust all references to the Elasticsearch's path.data location below to reflect the new location if customized from the default.

Elasticsearch Config File

Edit the Elasticsearch config file: /etc/elasticsearch/elasticsearch.yml

cluster.name: <ES·cluster·name>

Provide the cluster a unique name. Do not use periods in the name.

Important

It must differ from the cluster.name of the 1.7.1 cluster if one exists to prevent conflict to prevent conflict.

node.name: <ES·node·name>

Setting node.name is optional. Elasticsearch supplies a node name if not set. Do not use periods in the name.

network.host: <ES·host>

Assign a specific hostname or IP address, which requires clients to access the ES server using that address. Update /etc/hosts if using a hostname. Defaults to the special value, _site_.

Metrics requirement

The Elasticsearch host for Metrics in /etc/caringo-elasticsearch/metrics/metrics.cfg must match if the Elasticsearch host is configured to a specific hostname or IP address. The host in metrics.cfg can be a valid IP address or hostname for the Elasticsearch server if network.host is configured in elasticsearch.yml to be "_site_”.

discovery.zen.ping.unicast.hosts: ["es2", "es3"]

Set to the list of node names/IPs in the cluster, include ES servers. Multicast is disabled by default.

discovery.zen.minimum_master_nodes: 3

Set to (number of master-eligible nodes / 2, rounded down) + 1

Prevents split-brain scenarios by setting the minimum number of ES nodes online before deciding on electing a new master.

gateway.expected_nodes: 4

Add and set to the number of nodes in the ES cluster. Recovery of local shards starts as soon as this number of nodes have joined the cluster. It falls back to the recover_after_nodes value after 5 minutes. This example is for a 4-node cluster.

gateway.recover_after_nodes: 2

Set to the minimum number of ES nodes started before going in to operation status. This example is for a 4-node cluster.

index.max_result_window: 50000

Add to support queries with very large result sets (it limits start/from and size in queries). Elasticsearch accepts values up to 2 billion, but more than 50,000 consumes excessive resources on the ES server.

index.translog.sync_interval: 5s

For best performance, set how often the translog is fsynced to disk and committed, regardless of write operations.

index.translog.durability: async

For best performance, change to async so ES fsyncs and commit the translog in the background every sync_interval. In the event of hardware failure, all acknowledged writes since the last automatic commit are discarded.

bootstrap.mlockall: true

Set to lock the memory on startup to guarantee Elasticsearch does not swap (swapping leads to poor performance). Verify enough system memory resources are available for all processes running on the server.

The RPM installer makes these edits to/etc/security/limits.d/10-caringo-elasticsearch.conf to allow the elasticsearch user to disable swapping and to increase the number of open file descriptors:

# Custom for Caringo Elasticsearch and CloudGateway
elasticsearch soft nofile 65535
elasticsearch hard nofile 65535
# allow user 'elasticsearch' mlockall
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited

threadpool.bulk.queue_size: 1000

Add to increase the indexing bulk queue size to compensate for bursts of high indexing activity that can exceed Elasticsearch’s rate of indexing.

script.inline: true
script.indexed: true

(SwarmNFS users only) Add to support dynamic scripting.

http.cors.enabled: true
http.cors.allow-origin: "*"

Add to support metrics in the Swarm Storage UI.

path.data: <path·to·data·directory>

By default, path.data goes to /var/lib/elasticsearch with the needed ownership. Choose a separate, dedicated partition of ample size to move the log directory, and assign ownership of the directory to the elasticsearch user:

chown -R elasticsearch:elasticsearch <path- to- data- directory>

Systemd (RHEL/CentOS 7)

Create a systemd override file for the Elasticsearch service to set the LimitMEMLOCK property to be unlimited.

Create the override file:

 /etc/systemd/system/elasticsearch.service.d/override.conf

Add this content:
[Service] LimitMEMLOCK=infinity
Load the override file (the setting does not take effect until the next reboot):
sudo systemctl daemon-reload

Environment Settings

Edit the environmental settings: /etc/sysconfig/elasticsearch

`MAX_OPEN_FILES`	Set to `65535`
`MAX_LOCKED_MEMORY`	Set to `unlimited` (prevents swapping)
`ES_HEAP_SIZE`	Set to half the physical memory on the machine, but not more than 31 GB.

Logging

To customize the logging format and behavior, adjust its configuration file: /etc/elasticsearch/logging.yml

Logging has the needed ownership in the default location. Choose a separate, dedicated partition of ample size to move the log directory, and assign ownership of the directory to the elasticsearch user:
chown -R elasticsearch:elasticsearch <path- to- log- directory>

Best practice - For better archiving and compression than the built-in log4j, turn off the rotation of log4j and use logrotate.

Edit the logging.yml to limit the amount of space consumed by Elasticsearch log files in the event of an extremely high rate of error logging.
Locate the file: section and make these changes:

Before

file:
    type: dailyRollingFile
    file: ${path.logs}/${cluster.name}.log
    datePattern: "'.'yyyy-MM-dd"
...

After

file:
    type: rollingFile                        # change from dailyRollingFile
    maxBackupIndex: 0
    maxFileSize: 1000000000                  # 1 GB
    file: ${path.logs}/${cluster.name}.log
    # datePattern: "'.'yyyy-MM-dd"           # remove
...

Repeat for the deprecation and slowlog log files, as appropriate:

  deprecation_log_file:
    type: rollingFile
    file: ${path.logs}/${cluster.name}_deprecation.log
    layout:
      type: pattern
      conversionPattern: "[%d{ISO8601}][%-5p][%-25c] %m%n"
    maxBackupIndex: 0
    maxFileSize: 1000000000 # (1GB)

  index_search_slow_log_file:
    type: rollingFile
    file: ${path.logs}/${cluster.name}_index_search_slowlog.log
    layout:
      type: pattern
      conversionPattern: "[%d{ISO8601}][%-5p][%-25c] %m%n"
    maxBackupIndex: 0
    maxFileSize: 1000000000 # (1GB)

  index_indexing_slow_log_file:
    type: rollingFile
    file: ${path.logs}/${cluster.name}_index_indexing_slowlog.log
    layout:
      type: pattern
      conversionPattern: "[%d{ISO8601}][%-5p][%-25c] %m%n"
    maxBackupIndex: 0
    maxFileSize: 1000000000 # (1GB)

Add a script to manage the log rotation.
Sample contents of a logrotate.d script (default location: /etc/logrotate.d/elasticsearch):

/var/log/elasticsearch/*.log
{
        weekly
        rotate 8
        size 512M
        compress
        missingok
        copytruncate
}

Configuration is complete. Resume the Elasticsearch installation:

Installing Elasticsearch