Hardware problem requiring node replacement

If one of the nodes suffers a failure that is difficult to repair (a dead motherboard, for example, or a lost disk with no RAID), it may become necessary to replace the server with a blank one.

From the cluster's point of view, the old node must be removed and the new one added, for both corosync/pacemaker and ceph.

  1. The ansible/playbooks/replace_machine_remove_machine_from_cluster.yaml playbook removes a node from the cluster. Set the machine_to_remove variable to the hostname of the node to remove.
    The command below should be launched from the ansible project directory.

    cqfd run ansible-playbook -i /path/to/inventory.yaml -e machine_to_remove=HOSTNAME playbooks/replace_machine_remove_machine_from_cluster.yaml
  2. Install a new host with the ISO installer, using the same hostname, IP address, etc. as the old node.
  3. Make the "cluster network" connections between hosts.
  4. Re-run the cluster_setup_debian.yml playbook to configure the new host in the cluster (more details here).
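
The two Ansible invocations in the steps above (steps 1 and 4) can be sketched as a small shell helper that only prints the commands, so they can be reviewed before being run by hand. This is a minimal sketch, not part of the project: the function names are hypothetical, the inventory path and HOSTNAME are placeholders, and the playbooks/ location of cluster_setup_debian.yml is an assumption based on where the removal playbook lives.

```shell
#!/bin/sh
# Sketch only: prints the commands for steps 1 and 4, does not execute them.
# remove_node_cmd and setup_cluster_cmd are hypothetical helper names.

# Step 1: command that removes the old node from the cluster.
# $1 = inventory file, $2 = hostname of the node to remove.
remove_node_cmd() {
    printf 'cqfd run ansible-playbook -i %s -e machine_to_remove=%s playbooks/replace_machine_remove_machine_from_cluster.yaml\n' "$1" "$2"
}

# Step 4: command that re-runs the cluster setup for the new host.
# Assumption: cluster_setup_debian.yml sits under playbooks/ as well.
# $1 = inventory file.
setup_cluster_cmd() {
    printf 'cqfd run ansible-playbook -i %s playbooks/cluster_setup_debian.yml\n' "$1"
}

# Placeholder values; replace them with the real inventory and hostname.
remove_node_cmd /path/to/inventory.yaml HOSTNAME
setup_cluster_cmd /path/to/inventory.yaml
```

Steps 2 and 3 (installing the new host and cabling the cluster network) happen between the two commands, so they are deliberately not chained together.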

