Swarm cluster stuck in "Initializing" state on partial restart after full cluster shutdown

Problem

After a Swarm cluster has been shut down and started again, the node processes show their volumes as mounted but remain stuck in the “Initializing” state and never reach the “OK” state.

Solution

This usually occurs when a large cluster has been shut down and only partially restarted. The node processes that are running cannot read the persistent settings stream (PSS), most likely because the PSS is stored on cluster servers that have not been started yet. To resolve this, start all of the servers in the cluster so that the PSS is available to every node.

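Note that the PSS has 16 copies by default. In a cluster with many more servers than PSS copies, the servers that are started first may hold none of those copies, which is why this situation is most likely to occur during a partial restart of a large cluster.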
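
To make the effect of cluster size concrete, the following is a rough back-of-the-envelope sketch, not a description of how Swarm actually places PSS copies. It assumes, purely for illustration, that the copies sit on distinct nodes chosen uniformly at random, and it estimates the chance that none of the started nodes holds a copy. The function name prob_pss_unreachable and its parameters are hypothetical.

```python
from math import comb


def prob_pss_unreachable(total_nodes: int, started_nodes: int, pss_copies: int = 16) -> float:
    """Estimate the chance that every PSS copy is on a node that is still down.

    ASSUMPTION (illustrative only): copies are placed on distinct nodes
    chosen uniformly at random. Real placement may differ.
    """
    if total_nodes < 1 or not 0 <= started_nodes <= total_nodes:
        raise ValueError("need total_nodes >= 1 and 0 <= started_nodes <= total_nodes")

    # A cluster cannot hold more distinct copies than it has nodes.
    copies = min(pss_copies, total_nodes)
    stopped = total_nodes - started_nodes

    # Ways to fit all copies on stopped nodes, divided by all possible placements.
    # math.comb returns 0 when there are fewer stopped nodes than copies.
    return comb(stopped, copies) / comb(total_nodes, copies)


if __name__ == "__main__":
    for total, started in [(20, 5), (100, 10), (100, 50)]:
        p = prob_pss_unreachable(total, started)
        print(f"{total} nodes, {started} started: {p:.4f}")
```

Under this model, a 20-node cluster with 5 nodes started always has at least one copy available, because only 15 nodes remain down and 16 copies cannot all fit on them. In a 100-node cluster the estimate is nonzero and shrinks as more nodes are started.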

If the issue persists after all of the servers in the cluster have been started, open a ticket with DataCore Support for further investigation.

© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.