To migrate to new machine images in HashiCorp’s Nomad, we can’t easily take a GitOps approach; it requires manual cluster management.
At a high level, we need to bring up a parallel set of nodes, get them configured correctly, and then turn off the original set. This is made more difficult because both Consul and Nomad use Raft consensus, so we have to manage that state, and who the leader is, carefully; otherwise we break the cluster, which means even more manual intervention.
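Before starting, it helps to know which server currently holds leadership in each system, since that’s the node to handle most carefully. A minimal check, using the same commands covered below (the grep is just a quick filter on the State column):

consul operator raft list-peers | grep leader
nomad operator raft list-peers | grep leader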
First, bring up a new set of nodes. In Terraform, there are two instances of the application platform, blue and green. One is active and has nodes (the server_count parameter controls this); the other doesn’t. Add nodes to the one that doesn’t have any, along with the updated machine image. Once that completes, you should have new instances up and running the new machine image.
Validate that things are running correctly and that the new nodes have joined the peer list:
consul operator raft list-peers
consul members # this one also includes the client instances
nomad operator raft list-peers
nomad server members
nomad node status # shows client instances
consul maint -enable -reason=""
will remove a node from consul’s raft peer list and remove it from DNS queries. Do this on a node before you delete it. Note that this is a stateful flag. The server won’t go out of maintenance mode until you disable maintenance mode. That’s fine if you’re deleting the node.
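For example (the reason text here is just an illustration):

# Run on the node being retired
consul maint -enable -reason="replacing with new machine image"
# With no arguments, consul maint shows the current maintenance status
consul maint
# Only needed if you change your mind; maintenance mode is sticky otherwise
consul maint -disable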
From there, we need to delete the old servers.
After that’s done, tell the remaining peers that the old Nomad servers should be removed from the peer list. You can get the list of servers with nomad server members and then remove any that shouldn’t be there with nomad server force-leave green-test-server-0. If that server is still up, it’ll just rejoin.
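As a short sketch of that sequence (green-test-server-0 is the example name used above; use the exact name shown in the members output):

# Find stale entries, then tell the cluster they have left
nomad server members
nomad server force-leave green-test-server-0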
For the servers, we need to ensure the new servers have joined the Raft consensus and then remove the old ones from the peer list.
$ nomad operator raft list-peers
Node                        ID                                    Address            State     Voter  RaftProtocol
test-server-0.global        b3f3dd36-0c73-4a2b-ec22-46f86e07ef62  192.168.87.2:4647  follower  true   3
test-server-1.global        a0471d3e-ee9b-1153-8e9d-46d35bd6ce26  192.168.87.3:4647  follower  true   3
test-server-2.global        779beb08-25f9-1219-635c-6880cf5186e6  192.168.87.4:4647  leader    true   3
green-test-server-1.global  03313bb6-08e7-73ac-26c0-84b635750929  192.168.87.6:4647  follower  true   3
green-test-server-2.global  c8a984a6-346a-3e50-9b20-00a88ffde8a7  192.168.87.7:4647  follower  true   3
$ nomad operator raft remove-peer -peer-id 03313bb6-08e7-73ac-26c0-84b635750929
Removed peer with id "03313bb6-08e7-73ac-26c0-84b635750929"
$ nomad operator raft list-peers
Node                  ID                                    Address            State     Voter  RaftProtocol
test-server-0.global  b3f3dd36-0c73-4a2b-ec22-46f86e07ef62  192.168.87.2:4647  follower  true   3
test-server-1.global  a0471d3e-ee9b-1153-8e9d-46d35bd6ce26  192.168.87.3:4647  follower  true   3
test-server-2.global  779beb08-25f9-1219-635c-6880cf5186e6  192.168.87.4:4647  leader    true   3
If the servers keep coming back, you’ll need to turn off Nomad on those servers with systemctl stop nomad.
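A rough sketch, assuming SSH access to the old machines (the hostname is just the example name used above):

# Stop the Nomad agent on an old server so it can't rejoin, and keep it
# from starting again on reboot before the instance is deleted
ssh green-test-server-1 'sudo systemctl stop nomad && sudo systemctl disable nomad'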
Next, drain the old “client” nodes. This may take a bit.
nomad node drain -enable -yes 46f1
46f1 is the prefix of the node ID.
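If you want more control over the drain, a sketch with a migration deadline and progress monitoring (the flag values here are just examples):

# Give running allocations up to 30 minutes to migrate before being force-stopped
nomad node drain -enable -deadline 30m -yes 46f1
# Follow the drain's progress for that node
nomad node drain -monitor 46f1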
Verify the node is now ineligible for scheduling (check the Eligibility column):
$ nomad node status
ID        Node Pool  DC   Name           Class    Drain  Eligibility  Status
0c0369a7  <none>     dc1  test-client-1  service  false  eligible     ready
1080c9d5  <none>     dc1  test-client-0  ingress  false  eligible     ready
e4567d4b  <none>     dc1  test-client-3  service  false  eligible     ready
6dcc189d  <none>     dc1  test-client-4  service  false  eligible     ready
74144982  <none>     dc1  test-client-2  service  false  eligible     ready
You can mark a node as ineligible for scheduling before draining, so that workloads evicted from other nodes don’t get placed onto a server that is about to be turned off.
nomad node eligibility -disable 46f1
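To confirm the change, or to undo it if the node isn’t being retired after all:

# The node's detail view should now show its eligibility as ineligible
nomad node status 46f1
# Re-enable scheduling if needed
nomad node eligibility -enable 46f1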