Most of the bare metal hardware that I manage now supports or defaults
to UEFI. Many have the option to use “Legacy BIOS” mode, but the main
feature I find that I require from UEFI is support for the boot volume
to be 2TB+. I prefer one single RAID0 volume for all the builders for
operational simplicity.
Foreman
My preferred solution for installing the base OS on the hardware
is Foreman. It makes automated installs very simple and
reproducible but has only recently supported PXE booting with UEFI. I will
describe my previous attempts to get this working and how I was able
to get it working with Foreman 1.14.2.
Pxelinux and UEFI
Pxelinux is part of the syslinux project and provides many
different types of bootloaders. Pxelinux depends on a custom ROM
inside the network card to run DHCP and download kernel+initrd using
TFTP. It also has support for displaying interactive menus to the
user.
UEFI provides all of this functionality itself, but unfortunately its
designers did not extend the existing mechanism or preserve backwards
compatibility with it. All boot time programs like grub2 and pxelinux
required significant rework. I was
able to use the syslinux git tree and compile a working EFI version of
pxelinux that was able to boot the 14.04 Ubuntu installer. But there
were limitations:
- Foreman only supported non-efi Pxelinux and I had to manually swap
the binaries on the TFTP server
- The menu system didn’t work so I could not use the Foreman feature
of leaving the system to boot PXE by default and booting the local
hard drive if rebuild was not enabled for that host in Foreman.
- I could not get this pxelinux to work with the 16.04 installer. The
initrd would be downloaded, then the system would hang and trigger a reset.
UEFI and GRUB2
Foreman 1.13 added support for GRUB2 and UEFI, but my initial attempts
failed. When I changed the boot template from PXELinux to PXEGRUB2 the
update of the DHCP server would fail. The DHCP entry was added
properly to the DHCP server using the Foreman Proxy, but it would
cause a traceback on the server and prevent the Host change from being
saved. This bug was fixed in 1.14 and I was finally able to get this
working. There was one more bug in the PXEGRUB2 boot template
involving an assumption about Profiles. I opened an issue and have
submitted a PR to the community templates for this.
Foreman 1.14.2 was also missing the Preseed default PXEGrub2 template,
but one had already been submitted to the community templates repo, so
I had to manually add this template to my provisioning templates.
TFTP preparation
Foreman adds a DHCP record which contains the following:
server.filename = "grub2/grubx64.efi";
The first step was to find the proper grub2 binary. Fortunately the Ubuntu
wiki had a helpful post covering UEFI PXE netboot.
I was able to find the xenial grubnetx64.efi here. But it turns
out the Debian/Ubuntu grub2 is missing a few useful features that have
been added to the Fedora grub2. The Ubuntu/vanilla grub2 only looks
for grub/grub.conf whereas the Fedora grub2 has patches to search the
grub2 directory and search for grub.cfg-[mac address] which is a
convention that Foreman expects. Since Foreman is a project mostly run
by RedHat employees it makes sense. The Fedora prebuilt grub2
bootloader is here.
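As a rough sketch, placing the downloaded binary where the DHCP record
expects it could look like the function below. The function name, the
source file location and the TFTP root are all assumptions for
illustration; only the grub2/grubx64.efi target path comes from the DHCP
record above.

```shell
#!/usr/bin/env bash
# Sketch: copy a GRUB2 EFI binary to the path the DHCP record points at.
# Both arguments are assumptions: the source is whichever EFI binary was
# downloaded, and the TFTP root depends on how the TFTP server is set up.
install_grub_efi() {
    local src="$1" tftp_root="$2"
    local dest="$tftp_root/grub2/grubx64.efi"

    [ -f "$src" ] || { echo "missing source file: $src" >&2; return 1; }
    mkdir -p "$tftp_root/grub2" || return 1
    # TFTP daemons usually run unprivileged, so keep the file world-readable.
    install -m 0644 "$src" "$dest" || return 1
    echo "$dest"
}
```

For example, install_grub_efi ./grubx64.efi /var/lib/tftpboot would
print the final path, where /var/lib/tftpboot is only a guess at the
TFTP root. The copy also renames the file, which matters if the download
is called grubnetx64.efi.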
There is a PR which adds a default grub/grub.cfg and uses the grub2
regexp feature to search for $prefix/grub.cfg-[mac address]. This
means that Foreman will support vanilla GRUB2 soon.
Foreman will also place the correct kernel and initrd into the boot
directory. It will not replace an older kernel, so sometimes a newer
kernel and initrd need to be downloaded from here and manually
added to the boot directory.
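A minimal sketch of that manual replacement, assuming the new file has
already been downloaded. The function name and all three arguments are
hypothetical; the target name should be whatever file name already sits
in the boot directory for that OS.

```shell
#!/usr/bin/env bash
# Sketch: swap a stale kernel or initrd for a newer one, keeping a backup.
# new_file, boot_dir and target_name are placeholders chosen by the caller.
replace_boot_file() {
    local new_file="$1" boot_dir="$2" target_name="$3"
    local target="$boot_dir/$target_name"

    # Refuse to install a missing or empty download.
    [ -s "$new_file" ] || { echo "missing or empty: $new_file" >&2; return 1; }

    # Keep the previous file so the change is easy to revert.
    if [ -e "$target" ]; then
        mv "$target" "$target.old" || return 1
    fi
    install -m 0644 "$new_file" "$target"
}
```

Keeping the .old copy is a design choice for easy rollback, not
something Foreman requires.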
How does it work?
Here is how this works:
- Put the host in build mode. This sets up the grub2/grub.cfg- file with the automated build setup. It also adds a DHCP
entry specifying to download the "grub2/grubx64.efi" file.
- Start PXE boot and UEFI retrieves IP, filename and next-server/TFTP
from DHCP server
- UEFI downloads grub2/grubx64.efi from TFTP
- GRUB2 looks for grub2/grub.cfg-[mac address]
- Grub2 template contains the automated install configuration
generated by Foreman
- GRUB2 downloads kernel and initrd and boots the kernel and starts
the installer
- After install is complete, PXELinux template is changed back to
chainload local disk
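When a host does not pick up its build configuration, the step where
GRUB2 looks for the per-host file is the first thing I would check on
the TFTP server. The helper below is only a sketch: its name is made up,
and it deliberately assumes nothing about the file name convention
beyond the grub2/grub.cfg- prefix, comparing bare hex digits instead.

```shell
#!/usr/bin/env bash
# Sketch: find the per-host GRUB2 config for a MAC address in a local
# TFTP root. Separators and any prefix in the file name are ignored, so
# the exact naming convention does not need to be known.
find_host_grub_cfg() {
    local tftp_root="$1" mac="$2"
    local want file suffix

    # Reduce the MAC to 12 lowercase hex digits.
    want=$(printf '%s' "$mac" | tr 'A-F' 'a-f' | tr -cd '0-9a-f')
    [ "${#want}" -eq 12 ] || { echo "not a MAC address: $mac" >&2; return 2; }

    for file in "$tftp_root"/grub2/grub.cfg-*; do
        [ -e "$file" ] || continue
        suffix=${file##*/grub.cfg-}
        suffix=$(printf '%s' "$suffix" | tr 'A-F' 'a-f' | tr -cd '0-9a-f')
        case "$suffix" in
            *"$want") echo "$file"; return 0 ;;
        esac
    done
    return 1
}
```

If the function prints nothing for a host that is in build mode, the
config was probably never written to the TFTP root.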
Conclusion
The deficiencies of the previous process have been addressed. GRUB2
can boot the 16.04 kernels and even the hwe kernels and installer if I
want to. The menus and boot to local disk are working.
This book is a fictional/autobiographical account of one man's journey
on the Camino pilgrimage trail in Spain. I really enjoyed it. The
characters are quirky and very human with baggage and beautiful
experiences. The dialogue is a little too perfect, but it made me
consider doing a long walk like this.
Some of my favorite quotes from the book:
Questions that help guide the way, like the yellow arrows on the
Camino
Too often I hear about guiding values and statements, but I really
like the idea of guiding questions. Tim Ferriss and his podcast guests
often talk about questions that guided their decisions.
"If I loved myself, what would I do?"
I find this a tough question because it feels selfish. Finding a
balance between selfishness and selflessness never ends. I wish there
was a single answer, but I know that isn’t possible.
"Don't ask why, ask 'Now what'? People have made it through horrific
times not by focusing on why but moving on and asking 'Now What?'"
Trying to understand is important, but sometimes the energy is better
spent on getting ready for the future.
"It is not the wound that makes you special, it is the light that
shines through it"
A great reminder that the hardships of life define you as much as the
successes. I have always marvelled at artists that were able to
transform immense pain into incredible music and art.
"Perfect is no unnecessary pain. I wish you a perfect Camino."
Unfortunately sometimes pain is necessary. Pain is such a
multifaceted concept and hard to talk about. Maybe I will find someone who
can do it more eloquently than I can.
Rating: Recommended
I have been planning to upgrade my infrastructure to Puppet 4 but
other priorities have delayed it. I was finally able to find a way to
start the upgrade work. There are many new pieces of technology
available which I hope will make things work even better than before.
Puppet 4
Since Puppet 3 is End Of Life at the end of 2016, this upgrade is
probably the most urgent. I am looking forward to being able to use
the improved Puppet language and r10k. The Puppet Server is supposed
to be much faster and the AIO packages should be easier to install and
support.
MCollective Choria
R.I.Pienaar has been busy and built a new mcollective deployment package
called Choria. It has puppet modules which automatically enable
SSL everywhere, has an audit plugin, a packager for plugins and uses
NATS instead of ActiveMQ. My federated cluster with three ActiveMQ
servers has been stable, but it was a pain to set up and
upgrade. It is also managed using a custom puppet module which I do
not want to maintain. I am also hoping to be able to use NATS as a
message bus for some application orchestration.
Gitolite
I maintain a large internal network of git servers. The base
configuration is very open and anyone with a valid ssh login using NIS
can create or push to repositories. Every repository is available for
unauthenticated read-only access. We have a few post-receive hooks to
limit who can push to what repositories, but our developers respect
our gatekeeper model and do not push to repositories they aren’t
supposed to. The open access model has allowed people to do emergency
fixes when necessary. But there have occasionally been requests for
some sort of access control, and I have also considered locking down
the repository with the Puppet modules because it is so critical to
the business. So I decided to experiment with Gitolite.
R10K
I have been using librarian-puppet with a custom git
synchronization program which relies on the ActiveMQ network. I have 3
puppet masters and the post-receive hook uses STOMP to broadcast
changes. The git-stomp-hook receives the broadcast and calls
librarian-puppet as appropriate. This has worked well except for when
the ActiveMQ network was having problems. So I was happy to notice
that the puppet-r10k module contains a webhook program that can
be used to trigger r10k deploy on the puppet masters. Since r10k was
integrated into Puppet Enterprise, I decided to move away from
librarian-puppet. R10k actually works very similarly to the solution I
had cobbled together; it just ignores module dependencies. This is
both a blessing and a curse, but because Puppet does not support
conditional dependencies it may be better long term to manage
dependencies manually.
Bootstrapping a Puppet Server
I manage my Puppet 3 server using Puppet and the bootstrap process is
tricky. Given a machine with just the puppet agent, how do I get the
Puppet Server + Hiera + R10K and my control repo installed in a
reproducible way? I started with the Puppetlabs control-repo
skeleton which gave me the basics, but no bootstrap. I looked through
a lot of repos and finally found puppetinabox control-repo
by rnelson0. This repo uses a script to install the bootstrap
modules locally and puppet apply with some simple puppet manifests to
do the bootstrap. I decided to use this approach as well.
A Puppet module to manage Puppet
Next step was to choose a module to manage the Puppet server. I
reviewed many but many had crazy dependencies or didn’t support the
way I wanted to configure my systems. I ended up using
the puppet-puppet module maintained by the Foreman
team. It is a big module, but it supports:
- Puppet agent run using cron
- Puppet server setup on Ubuntu 16.04
- Compatible with Puppetdb and r10k
- Foreman integration
I did add Foreman integration to my Puppet3 module, so having that was
interesting to me.
Scripting the bootstrap
The bootstrap script does the following:
- Make sure git is installed
- Clone all the required modules into a bootstrap directory. I make
internal git mirrors of all the puppet modules I use.
- Run puppet apply using 3 manifests to install puppet server, hiera
and r10k.
- Run r10k deploy to generate the local production environment.
I now had a server set up and could start creating roles and profiles
to manage the server.
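The bootstrap steps listed above can be sketched roughly as follows.
This is not the actual script: the mirror URL, directory, module names
and manifest names are placeholders I made up for illustration, and the
DRY_RUN switch is an addition so the sequence can be inspected without
touching the machine.

```shell
#!/usr/bin/env bash
# Sketch of the bootstrap steps. Every path, URL, module name and manifest
# name below is a placeholder (assumption), not the real layout.
BOOTSTRAP_DIR="${BOOTSTRAP_DIR:-/tmp/bootstrap/modules}"
GIT_MIRROR="${GIT_MIRROR:-git://git.example.com/puppet}"
MODULES="${MODULES:-puppet hiera r10k}"
MANIFESTS="${MANIFESTS:-puppetserver.pp hiera.pp r10k.pp}"

# With DRY_RUN=1 commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

bootstrap() {
    local module manifest

    # 1. Make sure git is installed (Ubuntu assumed).
    command -v git >/dev/null 2>&1 || run apt-get install -y git || return 1

    # 2. Clone the required modules from the internal mirrors.
    run mkdir -p "$BOOTSTRAP_DIR" || return 1
    for module in $MODULES; do
        run git clone "$GIT_MIRROR/$module.git" "$BOOTSTRAP_DIR/$module" || return 1
    done

    # 3. Apply the manifests using only the bootstrap modules.
    for manifest in $MANIFESTS; do
        run puppet apply --modulepath="$BOOTSTRAP_DIR" "manifests/$manifest" || return 1
    done

    # 4. Generate the local production environment.
    run r10k deploy environment production
}
```

Running DRY_RUN=1 bootstrap prints the commands in order, which is a
cheap way to review the sequence before running it for real as root.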
R10K Webhook
Redundancy in infrastructure is good and having two ways to synchronize
the environments on the masters is also a good idea. I could use
mcollective, but that hasn’t been set up yet. The puppet-r10k
module comes with a webhook. This webhook is a small ruby sinatra
application that listens for http connections and triggers r10k
commands as appropriate. It supports GitHub, GitLab, Bitbucket,
etc. but I don’t need those. Since I am using a local gitolite server
I created a git post-receive hook that calls curl with the updated
branch:
curl -d "{ \"ref\": \"$REFNAME\" }" -H "Accept: application/json" \
"https://puppet:puppet@$HOST:8088/payload" -k -q
By default the webhook and r10k run as root which is something I try
to avoid. I was able to change the user for the webhook to puppet,
chown all the r10k cache and environment dirs to puppet user and
everything works. It also uses the SSL certs as signed by the Puppet
CA to encrypt the communication.
Bash post receive hook and subshells
The only problem with this approach is that the user must wait when
running git push for the script to complete. I was able to run the
curl command in a subshell and have the post-receive script exit
quickly.
( trigger_webhook "$refname" <hostname> ) &
This code will run to completion even once the parent shell has
exited. The logs of the synchronization are stored on the puppet
master. They could be stored on the git server as well but that isn’t
necessary.
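Putting the pieces together, a post-receive hook along these lines is
one way to do it. This is a sketch rather than my exact hook:
WEBHOOK_HOST, HOOK_LOG and the notify_all function are names invented
for the example, while the curl call is the one shown earlier. The
output redirection is a precaution I am assuming is useful, since a
background process that keeps the hook's stdout/stderr open may keep
the push waiting.

```shell
#!/usr/bin/env bash
# Sketch of a post-receive hook. WEBHOOK_HOST and HOOK_LOG are placeholders.
WEBHOOK_HOST="${WEBHOOK_HOST:-puppet.example.com}"
HOOK_LOG="${HOOK_LOG:-/dev/null}"

# Send the updated ref to the r10k webhook on the given host.
trigger_webhook() {
    local refname="$1" host="$2"
    curl -d "{ \"ref\": \"$refname\" }" -H "Accept: application/json" \
        "https://puppet:puppet@$host:8088/payload" -k -q
}

# git feeds the hook one "<old-sha> <new-sha> <refname>" line per updated ref.
notify_all() {
    local oldrev newrev refname
    while read -r oldrev newrev refname; do
        # Background subshell: the hook returns without waiting for r10k.
        # Output goes to HOOK_LOG instead of the pusher's terminal.
        ( trigger_webhook "$refname" "$WEBHOOK_HOST" ) >>"$HOOK_LOG" 2>&1 &
    done
}

# In the real hook file the last line would simply be:  notify_all
```

With several puppet masters, the loop body could call trigger_webhook
once per host instead of using a single WEBHOOK_HOST.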
Next steps
Install MCollective Choria and start porting the base configuration
with ntp, ssh keys, package management, etc. to Puppet 4.