Installing Kubernetes on Baremetal via CoreOS Tectonic with Grub Booting
At work, we use kube-aws to deploy our Kubernetes clusters running on top of Container Linux inside of AWS. I wanted to be able to try new things with Kubernetes in my personal lab without racking up huge AWS bills, so that means figuring out the least painful way to deploy Kubernetes to baremetal. I wanted to try Tectonic because it offers a simplified graphical installer, and that means I also need to install Matchbox to support the baremetal provisioning aspects.
After reading through the entire Tectonic Installation Guide, I realized that it doesn’t cover some of the underlying OS provisioning components as they vary per environment. Since I don’t have that infrastructure as I’m starting from scratch, I had to get that going before continuing the installation. Here’s my summary of steps taken (including a custom dnsmasq container) to round out a full working guide.
Architecture Overview
My personal lab is a mixture of PCs, 1U servers, and Mac hardware. So, for this lab, I’m going to pick one of each to ensure the entire process would work on all my hardware.
As per the installation guide, Tectonic needs three systems at a minimum. Here are my systems per role:
- Deployment/Provisioning system: `deploy.lab` (172.22.10.2/24)
- Kubernetes Controller: `mp.lab` (172.22.10.50/24)
- Kubernetes Worker: `smpc.lab` (172.22.10.54/24)
Network Diagram
The simplistic network environment is as follows:
Detailed PXE/Netboot to CoreOS Installation Flow
The following describes the complete baremetal provisioning process from DHCP request to CoreOS installation. Configuring the provisioning system to support this workflow is described in the subsequent section.
- Using IPMI or a keyboard during booting, tell the system to “boot from network”. This will cause the NIC to perform a DHCP request and look for TFTP-related settings to boot from. In my case, it was `F12` for my server and holding `N` during the boot chime for the Mac.
- The DHCP server responds with an IP and subnet along with information pointing to the TFTP server and a filename of what to download/run from that TFTP server.
- The system attempts to connect to the TFTP server and download/run that initial file. In the case of `grub2`, that initial boot file runs and then also tries to contact the same TFTP server looking for a grub boot configuration.
- If a grub boot configuration file is found, it follows that configuration. In the case of `matchbox`, it should be pointing to its web port and passing the NIC’s MAC address:
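The TFTP-served `grub.cfg` can be as small as the following sketch (the Matchbox address `172.22.10.2:8080` and the `$net_default_mac` variable are assumptions for my lab layout; adapt them to yours):

```cfg
# Load grub's HTTP support, then hand control to matchbox for the
# real, machine-specific boot configuration.
insmod http
set timeout=1

menuentry "Matchbox" {
  configfile "(http,172.22.10.2:8080)/grub?mac=$net_default_mac"
}
```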
- Since the initial grub configuration does nothing but load an HTTP module and defer to a web address for the rest of the `grub` configuration, it provides a convenient way to grab a system-specific boot configuration without having to change your TFTP-provided configuration file. In this case, Matchbox answers web requests at the `/grub?mac=XX:XX:XX:XX:XX:XX` URL with a tailored boot configuration, like so:
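The rendered configuration that `/grub?mac=...` returns is an ordinary grub menu entry along these lines (a sketch; the CoreOS version `1353.7.0` and asset paths are example values matching the assets downloaded later):

```cfg
default=0
timeout=1

menuentry "CoreOS (Install)" {
  echo "Loading kernel..."
  linux "(http,172.22.10.2:8080)/assets/coreos/1353.7.0/coreos_production_pxe.vmlinuz" coreos.config.url=http://172.22.10.2:8080/ignition?mac=$net_default_mac coreos.first_boot=yes console=tty0
  echo "Loading initrd..."
  initrd "(http,172.22.10.2:8080)/assets/coreos/1353.7.0/coreos_production_pxe_image.cpio.gz"
}
```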
- Now that grub knows what to boot, where to get it, and the extra kernel parameters pointing at the ignition configuration for what to install inside the OS and how, the system can begin and complete the installation. Here is the ignition configuration that the CoreOS kernel pulls from the `coreos.config.url` URL, which basically says to install CoreOS Container Linux and reboot:
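A sketch of that install-and-reboot ignition configuration as served by matchbox (the ignition version and the truncated `data:` URL are placeholders):

```json
{
  "ignition": { "version": "2.0.0" },
  "systemd": {
    "units": [
      {
        "name": "installer.service",
        "enable": true,
        "contents": "[Unit]\nRequires=network-online.target\nAfter=network-online.target\n\n[Service]\nType=simple\nExecStart=/opt/installer\n\n[Install]\nWantedBy=multi-user.target\n"
      }
    ]
  },
  "storage": {
    "files": [
      {
        "filesystem": "root",
        "path": "/opt/installer",
        "mode": 320,
        "contents": { "source": "data:,%23%21%2Fbin%2Fbash%20-ex%0A..." }
      }
    ]
  },
  "passwd": {
    "users": [
      { "name": "debug", "sshAuthorizedKeys": ["ssh-rsa AAAA...your-key..."] }
    ]
  }
}
```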
- Here are the URL-decoded contents of `/opt/installer` from inside the ignition configuration above. Notice how it uses the `os=installed` parameter to pull a “normal” boot configuration specific to this machine for future booting after being installed to disk `/dev/sda`:
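Sketched out, that installer script looks like the following (the matchbox address and MAC placeholder are assumptions; `coreos-install` ships in the CoreOS PXE image):

```shell
#!/bin/bash -ex

# Fetch the ignition config this machine should boot with permanently
# (note os=installed), then install Container Linux to disk using it.
curl "http://172.22.10.2:8080/ignition?mac=XX:XX:XX:XX:XX:XX&os=installed" -o ignition.json
coreos-install -d /dev/sda -C stable -i ignition.json -b http://172.22.10.2:8080/assets/coreos
udevadm settle
systemctl reboot
```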
- Also notice that the user available via SSH during that first boot/installation is named `debug` and uses the same SSH key as the permanent `core` user that will be available after the installation completes and the system reboots. This is really handy if you are troubleshooting why an installation is failing or want to watch the process as it goes. Note that it’s only available for a few minutes on quick systems since the installation completes so quickly.
- At this point, CoreOS/Container Linux has been installed to `/dev/sda` with a user named `core`, an SSH key set, and an ignition configuration that has set up its systemd units. This is where Matchbox/Ignition stop and normal SSH-based administration can take over.
Provisioning Infrastructure Configuration
There are several components that run on the `deploy.lab` system that all need to work in concert for the above process to be successful:
- Matchbox - The CoreOS-provided container that handles the web serving of grub, CoreOS, and Ignition templates provided by the Tectonic installer.
- DHCP, DNS, TFTP, Grub Network Boot Images and Configuration - Handled by a custom container built using `dnsmasq`.
Setting up deploy.lab
I hand-installed CentOS 7.3 with the minimal install and ensured that it had a recent version of Docker, with SSH key authentication as the `admin` user in the `docker` group:
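The rough set of commands (package names are for CentOS 7.3; `admin` is the user created at install time):

```shell
# Docker from the CentOS repos is recent enough here.
sudo yum install -y docker
sudo systemctl enable docker
sudo systemctl start docker

# Let the admin user talk to the docker daemon without sudo.
sudo usermod -aG docker admin
```

SSH key authentication is then just a matter of `ssh-copy-id admin@deploy.lab` from your workstation.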
Installing and Configuring the Matchbox Container on deploy.lab
The matchbox documentation for running via `docker` is a bit misleading, as several things actually need to be completed before running the container.
First, create the `matchbox` user and create/own the `/var/lib/matchbox` directory where it will keep all the assets and profiles.
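For example:

```shell
sudo useradd -U matchbox
sudo mkdir -p /var/lib/matchbox/assets
sudo chown -R matchbox:matchbox /var/lib/matchbox
```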
Download the `matchbox` package, verify its signature, and untar it:
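A sketch using v0.6.1 as an example version (grab whatever is current from the GitHub releases page, and check the CoreOS App Signing Key fingerprint against coreos.com/security/app-signing-key before trusting it):

```shell
wget https://github.com/coreos/matchbox/releases/download/v0.6.1/matchbox-v0.6.1-linux-amd64.tar.gz
wget https://github.com/coreos/matchbox/releases/download/v0.6.1/matchbox-v0.6.1-linux-amd64.tar.gz.asc

# Import the CoreOS App Signing Key and verify the tarball.
gpg --keyserver pgp.mit.edu --recv-key 18AD5014C99EF7E3BA5F6CE950BDD3E0FC8A365E
gpg --verify matchbox-v0.6.1-linux-amd64.tar.gz.asc matchbox-v0.6.1-linux-amd64.tar.gz

tar xzvf matchbox-v0.6.1-linux-amd64.tar.gz
cd matchbox-v0.6.1-linux-amd64
```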
Run a script to grab the version(s) of CoreOS/Container Linux into the current directory and then copy them to the assets directory so they are available via `matchbox` on port `8080`:
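The matchbox repo ships a helper script for this; a sketch using the stable channel (the version number is an example):

```shell
# Download and verify the CoreOS release into ./coreos/<version>/
./scripts/get-coreos stable 1353.7.0 .

# Make it available to matchbox under /assets
sudo cp -r coreos /var/lib/matchbox/assets/
sudo chown -R matchbox:matchbox /var/lib/matchbox/assets
```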
Drop out of the `matchbox` installation directory and run it via `docker`:
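A sketch of the `docker run` invocation (the image tag is an example; match it to the release you downloaded):

```shell
docker run -d --name matchbox \
  --net=host \
  -v /var/lib/matchbox:/var/lib/matchbox:Z \
  quay.io/coreos/matchbox:v0.6.1 \
  -address=0.0.0.0:8080 -log-level=debug
```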
Finally, verify that `matchbox` is running and able to serve up your downloaded CoreOS image(s). If it responds as expected, you should be good to go:
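For example (assuming your DNS resolves `deploy.lab`):

```shell
# The root endpoint should answer with "matchbox".
curl http://deploy.lab:8080

# The assets listing should include the CoreOS version you downloaded.
curl http://deploy.lab:8080/assets/coreos/
```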
Installing and Configuring the DNSMasq (DNS, DHCP, TFTP, Grub) Container on deploy.lab
It’s easiest to grab a copy of the repo and build your own docker image locally.
Edit the files in the `files` directory as needed. Most changes are IP addresses, MAC addresses, and hostnames for your environment:
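As a sketch of the kind of `dnsmasq` configuration involved (all addresses, MACs, and hostnames here are examples):

```cfg
# files/dnsmasq.conf
domain=lab
dhcp-range=172.22.10.100,172.22.10.200,255.255.255.0,12h

# Pin the known machines to their addresses.
dhcp-host=aa:bb:cc:dd:ee:01,mp.lab,172.22.10.50
dhcp-host=aa:bb:cc:dd:ee:02,smpc.lab,172.22.10.54

# Serve the grub network boot images over TFTP, picking the right
# image for BIOS vs. 64-bit UEFI clients.
enable-tftp
tftp-root=/var/lib/tftpboot
dhcp-match=set:bios,option:client-arch,0
dhcp-boot=tag:bios,boot/grub/i386-pc/core.0
dhcp-match=set:efi64,option:client-arch,7
dhcp-boot=tag:efi64,boot/grub/x86_64-efi/core.efi

# Resolve the provisioning host by name.
address=/deploy.lab/172.22.10.2
```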
Finally, build and run the image:
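Roughly (the image name is an example; host networking lets the container see DHCP broadcasts):

```shell
docker build -t lab/dnsmasq .
docker run -d --name dnsmasq --cap-add=NET_ADMIN --net=host lab/dnsmasq
```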
Running the Tectonic Installer
With the above in place and a free license from CoreOS for Tectonic, you can now follow the Tectonic Baremetal with Graphical Installer guide, having satisfied the prerequisites, with one exception.
Grub-specific Gotcha
The Tectonic installer will run you through several steps of supplying configuration and will arrive at a point where it instructs you to “power on your systems” that are to be baremetal booted from the network, but this won’t work out-of-the-box. The details of what prevents `grub` from working by default are in this GitHub issue. The good news is that there is a simple workaround. On the `deploy.lab` system, this is the default profile that the Tectonic GUI installer places into your `/var/lib/matchbox/profiles` folder:
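It looks roughly like this (a sketch; the version, paths, and hostname are examples, but the `${uuid}`/`${mac:hexhyp}` variables in `args` are the important part):

```json
{
  "id": "coreos-install",
  "name": "CoreOS Install",
  "boot": {
    "kernel": "/assets/coreos/1353.7.0/coreos_production_pxe.vmlinuz",
    "initrd": ["/assets/coreos/1353.7.0/coreos_production_pxe_image.cpio.gz"],
    "args": [
      "initrd=coreos_production_pxe_image.cpio.gz",
      "coreos.config.url=http://deploy.lab:8080/ignition?uuid=${uuid}&mac=${mac:hexhyp}",
      "coreos.first_boot=yes",
      "console=tty0"
    ]
  },
  "ignition_id": "install-reboot.yaml"
}
```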
Notice the `args` section. When the system is booting from the network and pulls the `/grub?mac=XX:XX:XX:XX:XX:XX` configuration, this `args` list is dropped directly into the kernel line. However, the `uuid` and `mac:hexhyp` variables are specific to the `ipxe` boot environment; for `grub2`, it varies slightly. To fix this, we need to make some customizations to the `matchbox` `groups` configuration files. I chose to make one for each system, based on the `mac` address selector. Notice that I now reference the `coreos-install-mp` or `coreos-install-smpc` profiles:
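Each group is a small JSON file in `/var/lib/matchbox/groups`; for example (the MAC address is a placeholder for your hardware):

```json
{
  "id": "mp-install",
  "name": "CoreOS install for mp.lab",
  "profile": "coreos-install-mp",
  "selector": {
    "mac": "aa:bb:cc:dd:ee:01"
  }
}
```

A second file does the same for `smpc.lab`, pointing its `profile` at `coreos-install-smpc` and its `selector` at that machine’s MAC.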
We also need to make those renamed profiles. Notice the `$net_efinet0_dhcp_mac` variable for UEFI/Mac hardware and the `$net_default_mac` variable for BIOS-booting hardware. Also notice that I made another unique ignition template to install to `/dev/sdb` instead of `/dev/sda` for the `smpc.lab` system:
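A sketch of the UEFI-side profile; note the grub variable is left unexpanded in `args` so that grub itself substitutes the MAC at boot time (the version and hostname are examples):

```json
{
  "id": "coreos-install-mp",
  "name": "CoreOS Install (mp.lab, UEFI)",
  "boot": {
    "kernel": "/assets/coreos/1353.7.0/coreos_production_pxe.vmlinuz",
    "initrd": ["/assets/coreos/1353.7.0/coreos_production_pxe_image.cpio.gz"],
    "args": [
      "initrd=coreos_production_pxe_image.cpio.gz",
      "coreos.config.url=http://deploy.lab:8080/ignition?mac=$net_efinet0_dhcp_mac",
      "coreos.first_boot=yes",
      "console=tty0"
    ]
  },
  "ignition_id": "install-reboot.yaml"
}
```

The `coreos-install-smpc` profile is the same shape, except it uses `$net_default_mac` and an `ignition_id` whose template installs to `/dev/sdb`.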
Once the above changes have been made, you should be able to successfully PXE/Netboot the systems and continue on with the final portion of the Tectonic installer. If you run into issues, double-check the formatting of your profiles and groups, and try hitting the `/grub` and `/ignition` endpoints with the proper parameters to see what configurations are being served to your systems.
Congratulations! You should now be able to hit the web UI of the Tectonic Console, use `kubectl`, and `ssh` into the systems as the `core` user with the SSH key you supplied. I hope this helps you understand what’s going on behind a fairly sophisticated and easily customizable baremetal Container Linux and Kubernetes installation system.