Running DataCore Swarm using the caringo:demo containers

These are instructions for using docker-compose to deploy a DataCore Swarm environment in containers. The commands below set up a complete environment on a single server or laptop for demonstration purposes or for functional integration testing.

For more information see:
https://www.brighttalk.com/webcast/13173/413805/cloud-seeding-with-object-storage-containers-tech-tuesday-webinar

Prepare the Docker Host

Install Docker on a Linux server (e.g. using the convenience script https://docs.docker.com/engine/install/centos/#install-using-the-convenience-script ) or install Docker for Desktop on Windows or macOS (https://docs.docker.com/engine/install/#desktop ).

a. Verify the docker server has at least 8GB RAM available to containers (check Resources in Docker for Desktop) and 40GB of free disk space.

b. Verify the sysctl value vm.max_map_count = 262144 – it is required for the elasticsearch containers to start.

Verify the sysctl settings, memory and disk space as seen from inside a container:

docker run --rm --privileged centos:7.9.2009 sysctl -a | grep -E 'file-max|max_user_instances|max_user_watches|max_map_count'

fs.file-max = 131072
fs.inotify.max_user_instances = 128
vm.max_map_count = 262144

docker run --rm --privileged centos:7.9.2009 free -h

              total        used        free      shared  buff/cache   available
Mem:           7.7G        263M        6.7G        163M        778M        7.0G
Swap:          1.0G        187M        836M

docker run --rm --privileged centos:7.9.2009 df -h .

Filesystem      Size  Used Avail Use% Mounted on
overlay          59G  7.3G   49G  14% /
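
The checks above can be folded into one small script. The following is a minimal sketch, not part of the DataCore tooling: the check_min helper and the variable names are hypothetical, the thresholds are the ones stated above, the sample values are copied from the example output, and treating a value above the stated minimum as acceptable is an assumption.

```shell
#!/bin/sh
# Thresholds from the prerequisites above (variable names are hypothetical).
MIN_DISK_GB=40
MIN_MAP_COUNT=262144

# check_min NAME ACTUAL MINIMUM
# Prints a verdict and returns 1 when ACTUAL is below MINIMUM.
# Assumption: values above the minimum are acceptable.
check_min() {
  if [ "$2" -ge "$3" ]; then
    echo "OK   $1: $2 (minimum $3)"
  else
    echo "FAIL $1: $2 (minimum $3)"
    return 1
  fi
}

# Sample values copied from the example output above.
check_min "vm.max_map_count" 262144 "$MIN_MAP_COUNT"
check_min "disk GB available" 49 "$MIN_DISK_GB"
```

On a real host the ACTUAL argument could be read from a container, for example with docker run --rm --privileged centos:7.9.2009 sysctl -n vm.max_map_count.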

The default vm.max_map_count on macOS is fine, but Windows and Linux users have to raise it. The change can be made temporarily with sysctl -w, but on Windows it must be repeated every time the machine is rebooted.
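
The original command is not reproduced here, so the following is only a sketch of what a temporary change usually looks like. The temp_map_count_cmd helper and its platform labels are hypothetical, and the Windows form assumes Docker for Desktop on the WSL2 backend, where the setting lives in the docker-desktop WSL distribution.

```shell
#!/bin/sh
# Print the command that raises vm.max_map_count until the next reboot.
# The labels "linux" and "windows" are hypothetical names for this sketch.
temp_map_count_cmd() {
  case "$1" in
    linux)   echo "sudo sysctl -w vm.max_map_count=262144" ;;
    # Assumption: Docker for Desktop using the WSL2 backend.
    windows) echo "wsl -d docker-desktop sysctl -w vm.max_map_count=262144" ;;
    *)       echo "unknown platform: $1" >&2; return 1 ;;
  esac
}

temp_map_count_cmd linux
temp_map_count_cmd windows
```

The printed command is meant to be run on the host itself (in a terminal on Linux, or in PowerShell on Windows) before starting the containers.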

On Linux, make the change permanent by creating a sysctl configuration file as root (e.g. under /etc/sysctl.d/) that sets vm.max_map_count = 262144, then rebooting and installing docker-ce.
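
A permanent setting on Linux is typically a one-line sysctl drop-in file. The sketch below assumes a distribution that reads /etc/sysctl.d at boot; the write_map_count_conf helper and the file name 99-max-map-count.conf are hypothetical, and the demonstration writes to a scratch directory so nothing on the host changes.

```shell
#!/bin/sh
# write_map_count_conf DIR
# Writes a sysctl drop-in that sets vm.max_map_count.
# The file name 99-max-map-count.conf is a hypothetical choice.
write_map_count_conf() {
  printf 'vm.max_map_count = 262144\n' > "$1/99-max-map-count.conf"
}

# Demonstrate against a scratch directory instead of /etc/sysctl.d.
demo_dir=$(mktemp -d)
write_map_count_conf "$demo_dir"
cat "$demo_dir/99-max-map-count.conf"
rm -rf "$demo_dir"
```

As root on the real host the target directory would be /etc/sysctl.d, followed by the reboot described above.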

c. With Swarm 15 and later no cgroup change is needed. When running an older version such as Swarm 11.3.0 (use image caringo:v11 instead of caringo:demo), make sure docker info shows Cgroup Version: 1. Recent Docker for Desktop and recent Linux distributions (e.g. Ubuntu 22) default to Cgroup Version 2, which causes "WMM05 unexpected memory stats" errors in castor.log prior to Swarm 15.
To switch to Cgroup Version 1, set "deprecatedCgroupv1": true in ~/Library/Group Containers/group.com.docker/settings.json and restart Docker for Desktop, or set GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=0" in /etc/default/grub and restart Linux.
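
The rule in step c reduces to a two-input decision. The sketch below encodes it as a hypothetical helper, needs_cgroup_v1_change, and runs it on sample inputs; it assumes the Swarm major version is known and that the cgroup version is available as a plain number.

```shell
#!/bin/sh
# needs_cgroup_v1_change SWARM_MAJOR CGROUP_VERSION
# Returns 0 (true) when the host must be switched to cgroup v1:
# a Swarm release before 15 combined with cgroup v2.
needs_cgroup_v1_change() {
  [ "$1" -lt 15 ] && [ "$2" = "2" ]
}

# Sample inputs: "SWARM_MAJOR CGROUP_VERSION" pairs.
for pair in "11 2" "11 1" "15 2"; do
  set -- $pair
  if needs_cgroup_v1_change "$1" "$2"; then
    echo "Swarm $1 with cgroup v$2: switch to cgroup v1"
  else
    echo "Swarm $1 with cgroup v$2: no change needed"
  fi
done
```

On a host with a reasonably recent Docker, the cgroup version can be read with docker info --format '{{.CgroupVersion}}' and passed as the second argument.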

Download and Initialize the DataCore Swarm Containers

The bash commands below contain the access and secret keys needed to pull from DataCore’s public image repository, now at quay.io/perifery/. Further instructions and files for an offline install are located at:
https://jam.cloud.caringo.com/public/offline-demo/README.md

Note: DOCKER_INTERFACE is the IP or hostname of the docker server; use localhost when running Docker for Mac.

For Linux, macOS, or WSL2 (running with elevated privileges, or as root):

Login Succeeded (ignore the WARNING about using --password)

demo: Pulling from perifery/caringo
Digest: sha256:222449b510c6a9d680fa95bdead0a50ee2a0291416016e4aa8e1d5b6a4713be6
Status: Downloaded newer image for quay.io/perifery/caringo:demo
quay.io/perifery/caringo:demo

The init.sh script prints any errors it encounters; for details, check the container logs (e.g. docker logs caringo42_elasticsearch_1). On success it prints the URLs for accessing the Swarm storage console and Content Portal, e.g.:
Content Portal: http://localhost/_admin/portal
Storage UI: http://localhost:91/_admin/storage
Swarm legacy console: http://localhost:4290/storage/swarm/
Grafana dashboards: http://localhost:4230/

Note that GATEWAY_ADMIN_USER:GATEWAY_ADMIN_PASSWORD now defaults to dcadmin:datacore.

Use caringo:demo-min instead of caringo:demo if the machine has only 8GB RAM. That lowers some memory settings and simplifies elasticsearch.

Swarm on arm64 (EXPERIMENTAL)

Although Swarm is not supported on ARM64 there is an experimental build available and the other containers have been built for arm64. Thus there is experimental support for running the demo containers on a non-Intel Mac. Note the REGISTRY_URL is different from above (multi-arch images are not yet being used). The caringo:alpha image is specified to pull upcoming releases and a more recent elasticsearch version.


Next Steps to Attempt

  • A mini DataCore Swarm environment is now running. Use Content Portal and Storage UI as with a production environment.

  • Exec into this container that has a few S3 clients installed and configured:

  • Configure an external S3 client to use this environment. An /etc/hosts (or \WINDOWS\system32\drivers\etc\hosts) entry is needed on the S3 client machine to map the domain backup42 to the IP of the machine running docker. Use 127.0.0.1 if using Docker for Desktop and the S3 client is on the local machine.

    Create a different domain using Portal, or set docker run ... -e DOMAIN=mylaptop.example.com ... init.sh to change the name of the domain the init script creates.
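
    The hosts entry described above is a single line of the form "IP name". The sketch below adds it idempotently with a hypothetical add_hosts_entry helper; it works on a scratch file, and pointing it at the real /etc/hosts (as root) is left to the reader.

```shell
#!/bin/sh
# add_hosts_entry FILE IP NAME
# Appends "IP NAME" to FILE unless an identical line is already present.
add_hosts_entry() {
  grep -qxF "$2 $3" "$1" 2>/dev/null || printf '%s %s\n' "$2" "$3" >> "$1"
}

# Demonstrate on a scratch file rather than the real /etc/hosts.
hosts_file=$(mktemp)
add_hosts_entry "$hosts_file" 127.0.0.1 backup42
add_hosts_entry "$hosts_file" 127.0.0.1 backup42   # second call is a no-op
cat "$hosts_file"
rm -f "$hosts_file"
```

    Replace 127.0.0.1 with the IP of the docker server when the S3 client runs on a different machine.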

  • All logs are visible in the syslog container, and support tools can be run there, such as swarmctl to view or change Swarm settings or indexer-enumerator.sh to list all objects.

  • To bring up an existing environment after a reboot, or after it was stopped with docker run … stop.sh, use docker run … up.sh. Set docker run -e PROJECT_RESTART=always … init.sh to start the environment automatically on reboot.

  • If the docker server has a service already using ports 80 and 443, resulting in “ERROR: for caringo42_https_1 Cannot start service … 0.0.0.0:443: bind: address already in use”, change those published ports by adding:

  • Put any configuration to reuse into a text file and use --env-file my.env to simplify the docker run commands.

     

    WARNING: changing the Swarm cluster.name loses the “persistent settings UUID”, including the Search Feed. Recreate it by re-running init.sh.
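
    An env file holds one KEY=value pair per line, without quotes or -e switches. The sketch below writes such a file with a hypothetical write_env_file helper; the three variables are ones named elsewhere in this document, the values are examples only, and the file is created in a scratch directory for demonstration.

```shell
#!/bin/sh
# write_env_file PATH
# Writes example settings, one KEY=value per line, in the format
# that docker run --env-file reads. Values are illustrative only.
write_env_file() {
  cat > "$1" <<'EOF'
DOMAIN=mylaptop.example.com
PROJECT_RESTART=always
GATEWAY_SCALE=3
EOF
}

env_dir=$(mktemp -d)
write_env_file "$env_dir/my.env"
cat "$env_dir/my.env"
rm -rf "$env_dir"
```

    The file is then passed as docker run --env-file my.env … init.sh, with the rest of the command unchanged.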

  • The default 2TB license is sufficient. To use a different license file, e.g. devlicense.txt, add SWARM_CFG_1=license.url = file:///license/devlicense.txt to my.env and copy the license to a volume shared with the syslog and swarm containers.

    • Bring up the "syslog" service, which has the new license volume, using "--pull always" to download the latest caringo:demo image.

    • Copy the license file into the volume in the syslog container.

    • Now rerun "up.sh" so swarm comes up with the new "license.url" setting and license volume.

  • Add this to the my.env to allow anonymous read and write access to Gateway, e.g. to test an application that makes requests directly to Swarm. This assumes the docker environment is only accessible by trusted clients.

  • See all variables used to configure this environment and run docker-compose directly in the test container.

    ...shows non-default container config...

  • Use the images without creating Swarm, e.g. to run a support tool:

  • Add -e ADD_COMPOSE_FILE=:docker-compose-systemd.yml (the colon prefix is required) to make the gateway and elasticsearch containers use systemd, to more closely match a regular environment.
    This currently requires "deprecatedCgroupv1": true in ~/Library/Group Containers/group.com.docker/settings.json.

  • Run multiple gateways behind the haproxy load balancer by adding -e GATEWAY_SCALE=3.

  • Add these environment variables to bring up an environment that does not use elasticsearch. This means no Search Feed is created and object listings, Swarm metrics and Gateway metering and quotas are disabled.
    -e ELASTICSEARCH_SCALE=0 -e ESHOST= -e INDEXER_HOSTS= -e SKIP_VERIFY_ELASTICSEARCH=true -e GATEWAY_METERING=false
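
    The same flags can be kept in an env file such as my.env instead of being typed each time. The sketch below converts a list of -e switches into env-file lines with a hypothetical flags_to_env_lines helper; it assumes that an empty value such as ESHOST= carries over to the env file unchanged.

```shell
#!/bin/sh
# flags_to_env_lines -e KEY=value [-e KEY=value ...]
# Prints each KEY=value on its own line, dropping the -e switches,
# so the output can be appended to an env file.
flags_to_env_lines() {
  while [ "$#" -gt 0 ]; do
    if [ "$1" = "-e" ] && [ "$#" -ge 2 ]; then
      printf '%s\n' "$2"
      shift 2
    else
      shift
    fi
  done
}

flags_to_env_lines -e ELASTICSEARCH_SCALE=0 -e ESHOST= -e INDEXER_HOSTS= \
  -e SKIP_VERIFY_ELASTICSEARCH=true -e GATEWAY_METERING=false
```

    Redirecting the output with >> my.env appends the five settings to an existing env file.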

  • Use tcpdump to monitor the http traffic to/from the Gateway S3 port:

     

     

  • Remove the environment with clean.sh, which deletes all containers and volumes and reclaims the space they used:

 

© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.