NOVA Lab: Rancher Cluster Part 1: Virtual Machine Setup

Today’s project is to install Rancher ahead of getting my new RPi and mini PC cluster online.

To start this project, I’m going to use S04, an ESXi 7.0.3 server with two NVMe drives, four hard drives, and two NAS connections.

Step 1: Create Virtual Machines

I want five Ubuntu 22.04 LTS virtual machines for this setup.
The first three are controller nodes and the last two are worker nodes. Eventually, I’ll install Longhorn here for storage too, so for now I’m going to place all five of these servers on the NVMe datastore.
Each VM will have 4 virtual CPUs, 4 GB of RAM, and 256 GB of thin-provisioned storage.
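
I built these through the vSphere client, but for reference the same VMs could be scripted with VMware’s govc CLI. This is only a rough sketch under my own assumptions; the URL, datastore name (NVMe1), and network name (VM Network) below are placeholders, not my actual ESXi object names.

# Hypothetical govc sketch -- placeholder URL, datastore, and network names
export GOVC_URL='https://s04.local' GOVC_USERNAME='root' GOVC_PASSWORD='***' GOVC_INSECURE=1
for n in 1 2 3 4 5; do
  # 4 vCPU, 4 GB RAM, new 256 GB disk, Ubuntu 64-bit guest type, left powered off
  govc vm.create -c 4 -m 4096 -disk 256GB -g ubuntu64Guest \
    -ds NVMe1 -net "VM Network" -on=false "s04-node${n}"
done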

For each of these servers, I’ll do the default minimal installation of Ubuntu.

After each of the boxes comes online, I’ll set up a DHCP reservation for it, giving the five hosts 10.0.0.31–35 on my /22 network.

Before moving on, I’ve confirmed that I have five virtual machines with predictable IP addresses.

Step 2: Add SSH Keys

Based on my last lab, I want to get these new virtual machines into Ansible. I’ll use that to automate my installation of K3S.

The first step is to copy my SSH key to all of these servers.

console@ansible:~$ ssh-copy-id console@10.0.0.31
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/console/.ssh/id_rsa.pub"
The authenticity of host '10.0.0.31 (10.0.0.31)' can't be established.
ED25519 key fingerprint is SHA256:
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
console@10.0.0.31's password:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'console@10.0.0.31'"
and check to make sure that only the key(s) you wanted were added.

The output was identical for the remaining four hosts, so I ran the same ssh-copy-id command against console@10.0.0.32 through console@10.0.0.35, accepted each host fingerprint, entered the password, and confirmed that each run reported “Number of key(s) added: 1”.
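
Since the command is the same for every host, a short shell loop would also do the job. A minimal sketch, assuming the same console user on all five VMs (each host still prompts for its fingerprint and password):

# Copy the Ansible controller's public key to every new node
for ip in 10.0.0.{31..35}; do
  ssh-copy-id "console@${ip}"
done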

I recognize that I need DNS records for these hosts, so I’ll add those now on the Pi-hole server.
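
In Pi-hole these go under Local DNS -> DNS Records in the web UI, or (on older releases) straight into /etc/pihole/custom.list as IP/FQDN pairs. A sketch of the entries, assuming node1 through node5 map to .31 through .35:

# /etc/pihole/custom.list -- one "IP FQDN" pair per line (IP-to-node mapping assumed)
10.0.0.31 s04-node1.ttgb.us
10.0.0.32 s04-node2.ttgb.us
10.0.0.33 s04-node3.ttgb.us
10.0.0.34 s04-node4.ttgb.us
10.0.0.35 s04-node5.ttgb.us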

Step 3: Add to Ansible

With that complete, I need to add the new servers to my Ansible Inventory.

nano ./inventory/hosts

And here is the new inventory file.

[nameservers]
ns1.ttgb.us
ns2.ttgb.us

[rpi]
rpi-node1.ttgb.us
rpi-node2.ttgb.us
rpi-node3.ttgb.us
rpi-node4.ttgb.us

[s04_cluster]
s04-node1.ttgb.us
s04-node2.ttgb.us
s04-node3.ttgb.us
s04-node4.ttgb.us
s04-node5.ttgb.us
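
As a side note, a local ansible.cfg pointed at this inventory would avoid passing -i on every run. A minimal sketch:

# ansible.cfg in the project directory
[defaults]
inventory = ./inventory/hosts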

A quick reminder: when I initially connected to these hosts it was by IP address, and Ansible is now using their DNS FQDNs. Since known_hosts only had entries keyed by the IPs, the host key check prompted me again when I ran the Ansible ping.

This is expected, and I just answered ‘yes’ at each prompt to accept the host keys.
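
To avoid those prompts entirely, the FQDN host keys can be pre-collected into known_hosts ahead of time. A small sketch (it appends blindly, so review what lands in the file):

# Scan each node's host key under its FQDN and hash the entries
for n in 1 2 3 4 5; do
  ssh-keyscan -H "s04-node${n}.ttgb.us" >> ~/.ssh/known_hosts
done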

Here was the output when I pinged the servers in Ansible.

console@ansible:~$ ansible -i ./inventory/hosts s04_cluster -m ping
s04-node2.ttgb.us | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "ping": "pong"
}
s04-node3.ttgb.us | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "ping": "pong"
}
s04-node1.ttgb.us | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "ping": "pong"
}
s04-node4.ttgb.us | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "ping": "pong"
}
s04-node5.ttgb.us | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "ping": "pong"
}

Step 4: Run Some Playbooks

Before I get too far in, I want to run a few of my standard playbooks via Ansible to make sure everything is up to date.

console@ansible:~$ ansible-playbook ./playbooks/ubuntu_upgrade.yml -i ./inventory/hosts --ask-become-pass
BECOME password:

PLAY [*] *********************************************************************************

TASK [Gathering Facts] *********************************************************************************
ok: [ns2.ttgb.us]
ok: [ns1.ttgb.us]
ok: [s04-node1.ttgb.us]
ok: [s04-node2.ttgb.us]
ok: [rpi-node2.ttgb.us]
ok: [rpi-node3.ttgb.us]
ok: [rpi-node1.ttgb.us]
ok: [s04-node3.ttgb.us]
ok: [s04-node4.ttgb.us]
ok: [s04-node5.ttgb.us]
ok: [rpi-node4.ttgb.us]

TASK [apt] *********************************************************************************
ok: [ns2.ttgb.us]
ok: [ns1.ttgb.us]
changed: [s04-node1.ttgb.us]
changed: [rpi-node4.ttgb.us]
changed: [s04-node2.ttgb.us]
changed: [rpi-node1.ttgb.us]
changed: [rpi-node2.ttgb.us]
changed: [rpi-node3.ttgb.us]
changed: [s04-node3.ttgb.us]
changed: [s04-node4.ttgb.us]
changed: [s04-node5.ttgb.us]

PLAY RECAP *********************************************************************************
ns1.ttgb.us                : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
ns2.ttgb.us                : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
rpi-node1.ttgb.us          : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
rpi-node2.ttgb.us          : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
rpi-node3.ttgb.us          : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
rpi-node4.ttgb.us          : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
s04-node1.ttgb.us          : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
s04-node2.ttgb.us          : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
s04-node3.ttgb.us          : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
s04-node4.ttgb.us          : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
s04-node5.ttgb.us          : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

From this we can see that the RPi and S04 nodes all applied changes, while the two nameservers were already up to date.
This playbook is a simple apt update / apt upgrade, so there are no surprises in the output here.
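
For reference, a playbook like that can be as small as a single apt task. This is only a sketch of what my ubuntu_upgrade.yml does, not the exact file:

---
- hosts: "*"
  become: true
  tasks:
    # Left unnamed, so it shows up as "TASK [apt]" in the run output
    - apt:
        update_cache: yes
        upgrade: yes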

Based on my last lab, I also want to run the ubuntu_basic_tools.yml and ubuntu_timezone.yml playbooks.
Check out that lab if you want more details; they’re optional and not strictly necessary for this lab.
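
They run the same way as the upgrade playbook above:

ansible-playbook ./playbooks/ubuntu_basic_tools.yml -i ./inventory/hosts --ask-become-pass
ansible-playbook ./playbooks/ubuntu_timezone.yml -i ./inventory/hosts --ask-become-pass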

OK. So this part is done. I’m going to snapshot these servers before I move on, just in case the automation doesn’t work. 🙂
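
The snapshots are quick to take from the vSphere client, but they could also be scripted with govc. A hedged sketch, assuming the VM names match my inventory naming:

# Take a pre-automation snapshot of each node (VM names are assumptions)
for n in 1 2 3 4 5; do
  govc snapshot.create -vm "s04-node${n}" pre-k3s-baseline
done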

And all done!

So what have we accomplished so far?

  1. Set up five virtual machines on VMware ESXi.
  2. Added each of the servers to DNS.
  3. Added my Ansible SSH key to each of the servers and tested logon.
  4. Connected to each server via Ansible and ran some test playbooks.
