Konrad Scherer
THURSDAY, 7 NOVEMBER 2019

Using AWS Session Manager to connect to machines in a private subnet

Introduction

We are experimenting with AWS, as many people are. One of the first hurdles is connecting over SSH to the EC2 instances that have been created. The “standard” mechanism is to set up a Bastion host with a restrictive “Security Group” (essentially a firewall). This Bastion host is accessible from the Internet, and once a user has logged into it they can access the other instances in the VPC.

The Bastion host has a few limitations:

  • It is exposed to the Internet: A Security Group can restrict access to specific IPs and only open port 22. This is reasonably secure, but an exploit in the SSH server is always a possibility.
  • SSH key management: The AWS console allows for the creation of SSH keypairs that can be automatically installed on the instance, which is great. If multiple people access the Bastion instance, however, then either everyone has to use the same keypair (which is bad) or there needs to be some other mechanism for managing the authorized_keys file on the Bastion instance. Ideally this is automated using a tool like Puppet Bolt or Ansible.

One of my weekly newsletters pointed me to aws-gate, which mentioned the possibility of logging into an instance over SSH without the need for a Bastion host. This post documents my experience getting it working.

Local Requirements

On the local machine the AWS CLI must be installed. I use a python virtualenv to keep the python environment separate and avoid requiring root access.

> python3 -m venv awscli
> cd awscli
> bin/pip3 install awscli

Unfortunately, it turns out the Session Manager functionality requires a special plugin, which is only distributed as a deb package.

> curl "https://s3.amazonaws.com/session-manager-downloads/plugin/latest/ubuntu_64bit/session-manager-plugin.deb" -o "session-manager-plugin.deb"
> sudo apt install ./session-manager-plugin.deb
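
To confirm the plugin is installed and on the PATH, run it with no arguments; it should print something like:

> session-manager-plugin
The Session Manager plugin was installed successfully. Use the AWS CLI to start a session.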

The AWS CLI requires an access key. Go to the AWS console -> “My Security Credentials” and create a new Access key (or use existing credentials).

> ~/awscli/bin/aws configure
AWS Access Key ID [None]: accesskey
AWS Secret Access Key [None]: secretkey
Default region name [None]: us-west-2
Default output format [None]:

Also in the AWS EC2 console, create a new KeyPair and download the .pem file locally. I put the file in ~/.ssh and gave it 0600 permissions. Now add the following to your .ssh/config file:

# SSH over Session Manager
host i-* mi-*
ProxyCommand sh -c "~/awscli/bin/aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'"
IdentityFile ~/.ssh/<keypair name>.pem

AWS IAM Setup

By default an EC2 instance will not be manageable by Systems Manager. Go to AWS Console -> IAM -> Roles to update the roles.

I already had a default EC2 instance role, so I just had to attach the AmazonSSMManagedInstanceCore policy to it.
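
The same change can be made from the CLI. A minimal sketch, assuming the instance role already exists (substitute its name):

> ~/awscli/bin/aws iam attach-role-policy --role-name <instance role name> --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore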

Launching the Instance

According to the docs, the official Ubuntu 18.04 server AMI has the SSM agent integrated, and I relied on this. Finding the right AMI is really frustrating because there aren’t proper organization names attached to AMIs. The simplest approach is to go to the Ubuntu AMI finder, search for ‘18.04 us-west-2 ebs’ and select the most recent AMI.

In the launch options:

  • Choose the correct VPC with a private subnet
  • Choose the ‘IAM Role’ with the correct permissions
  • Choose a “Security Group” with port 22 open to you
  • Select the keypair that was downloaded earlier and set up in your .ssh/config file.

Launch the instance and wait a while. Go to the AWS Console -> Systems Manager -> Inventory to see that the instance is running and the SSM agent is working properly.
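
The same check can be done from the CLI. A minimal sketch; the instance should appear with a PingStatus of Online once the agent has registered, which can take a few minutes:

> ~/awscli/bin/aws ssm describe-instance-information --query 'InstanceInformationList[*].[InstanceId,PingStatus]' --output table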

Connecting over SSH

If everything is set up correctly, grab the instance ID and log in:

> ssh ubuntu@i-014633b619400dfff
Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-1052-aws x86_64)
<snip>
ubuntu@ip-10-0-1-193:~$
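
Since the ProxyCommand lives in .ssh/config, other SSH-based tools should work through the same tunnel. For example, copying a file with scp (somefile.txt is just a placeholder):

> scp somefile.txt ubuntu@i-014633b619400dfff: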

SSH access without a Bastion host is possible!




FRIDAY, 1 NOVEMBER 2019

ZFS Disk replacement on Dell R730

Introduction

I manage a bunch of Dell servers and I use OpenManage and check_openmanage to monitor for hardware failures. Recently one machine started showing the following error:

Logical Drive '/dev/sdh' [RAID-0, 3,725.50 GB] is Ready

Unfortunately “Drive is Ready” isn’t a helpful error message. So I log into the machine and check the disk:

> omreport storage vdisk controller=0 vdisk=7
Virtual Disk 7 on Controller PERC H730P Mini (Embedded)

Controller PERC H730P Mini (Embedded)
ID                                : 7
Status                            : Critical
Name                              : Virtual Disk 7
State                             : Ready

The RAID controller log shows a more helpful message:

Bad block medium error is detected at block 0x190018718 on Virtual Disk 7 on Integrated RAID Controller 1.

From experience I know that I could just clear the bad blocks, but the drive is dying and more errors will follow. Luckily, Dell will replace drives with uncorrectable errors and I received a replacement drive quickly.

Cleanly removing the drive

I know the drive is /dev/sdh, but I created the ZFS pool using device paths, so searching /dev/disk/by-path/ gave me the correct entry.
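
Something like the following should find the matching entry (output trimmed; the PCI path will differ per system):

> ls -l /dev/disk/by-path/ | grep 'sdh$'
<snip> pci-0000:03:00.0-scsi-0:2:7:0 -> ../../sdh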

The first step is to mark the drive as offline:

> zpool offline pool 'pci-0000:03:00.0-scsi-0:2:7:0'

To make sure I replaced the correct drive, I also forced it to blink:

> omconfig storage vdisk controller=0 vdisk=7 action=blink

Next came the manual step of actually replacing the drive.

Activating the new drive

After inserting the new disk I was able to determine the physical disk number and recreate the RAID-0 virtual disk.

> omconfig storage controller action=discardpreservedcache controller=0 force=enabled
> omconfig storage controller controller=0 action=createvdisk raid=r0 size=max pdisk=0:1:6

I use single-drive RAID-0 virtual disks because I prefer that ZFS use the disks in a raidz2 layout rather than relying on RAID-6 on the controller.

Then a quick check that the new virtual disk is using the same PCI device and drive letter, and it can be added back into the ZFS pool:

> omreport storage vdisk controller=0 vdisk=7
Virtual Disk 7 on Controller PERC H730P Mini (Embedded)

Controller PERC H730P Mini (Embedded)
ID                                : 7
Status                            : Ok
Name                              : Virtual Disk7
State                             : Ready
Device Name                       : /dev/sdh
> parted -s /dev/sdh mklabel gpt
> zpool replace pool 'pci-0000:03:00.0-scsi-0:2:7:0'

ZFS will add the new drive and resilver the data.

> zpool status
pool: pool
state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.

It would be slightly easier if the rebuild were handled by the RAID controller, but it would take much longer. So far ZFS on Linux has worked very well for me and I will continue to rely on it.




FRIDAY, 18 OCTOBER 2019

Building a multi node build cluster

Introduction

I have now built, deployed and managed three internal build systems that handle thousands (yes, thousands) of Yocto builds daily. Each build system has its tradeoffs and requirements. The latest one, which I call Wrigel, was specifically designed to be usable outside of WindRiver and is available on our WindRiver-OpenSourceLabs GitHub repo and on Docker Hub. Recently there has been a lot of internal discussion about build systems and the current state of various open source projects, and I will use this post to clarify my thinking.

Wrigel Design Constraints

The primary use case of Wrigel was to make it easy for a team inside or outside WindRiver to join 3-5 “spare” computers into a build cluster. For this I used a combination of Docker, Docker Swarm and Jenkins.

Docker makes it really easy to distribute preconfigured Jenkins and build container images. Thanks to the generous support of Docker Cloud all the container images required for Wrigel are built and distributed on Docker Hub.

Docker Swarm makes it really easy to join 3-5 (Docker claims up to thousands of) systems together into a cluster. The best part is that Docker Compose supports using the same yaml file to run services on a single machine or distributed over a swarm. This has been ideal for developing and testing the setup.
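
As a rough sketch (the compose file and stack name here are placeholders, not the actual Wrigel configuration, and a version 3 compose file is assumed), the same file can be run on a single machine:

> docker-compose -f docker-compose.yml up -d

or deployed across the cluster from a swarm manager node:

> docker stack deploy -c docker-compose.yml wrigel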

Jenkins is an incredible piece of software with an amazing community; it is used everywhere and has plugins for almost any functionality. I rely heavily on the Pipeline plugin, which provides a sandboxed scripted pipeline DSL. This DSL supports both single and multi-node workflows. I have abused the Groovy language support to do some very complicated workflows.

I have a system that works and looks to be scalable. Of course the system has limitations. It is these limitations and the current landscape of alternatives that I have been investigating.

Wrigel Limitations

Jenkins is a great tool, but the Pipeline plugin is very specific to Jenkins: there isn’t a single other tool that can run the Jenkins Pipeline DSL. To be fair, every build tool from CircleCI to Azure Pipelines and Tekton also has its own syntax and lock-in. There are many kinds of lock-in and not all are bad. One of the perennial challenges with all build systems has been reproducing the build environment outside of the build system. Failures due to some special build system state tend to make developers really unhappy, so I wanted to explore what running a pipeline outside of a build system would look like. I acknowledge the paradox of building a system to run pipelines that also supports running pipelines outside of the system.

The other limitation is security. The constant stream of CVE reports and fixes for Jenkins and its plugins is surprising. I am very impressed with the way Cloudbees and the community are taking these problems seriously, and Cloudbees has made significant progress improving the default Jenkins security settings. This is no small feat considering Jenkins has a very old codebase. On the downside, my own attempts to secure the default setup have been broken by Jenkins upgrades three times in the last year. While I understand the churn, I am reluctant to ship Jenkins as part of a potential commercial product because each CVE would impose additional non-business-value work on our team.

Docker and the root access problem

Docker is an amazing tool and has completely transformed the way I work. One major problem is that giving a build script access to run Docker is equivalent to giving it root on the machine. Since most build clusters are internal systems running mostly trusted code, this isn’t a huge problem, but I have always been interested in alternatives. Recently, Podman and rootless Docker have both added support for running containers inside user namespaces. I was able to do a Yocto build using Podman and user namespaces with the 4.18 kernel, so huge progress has been made. I would prefer that the build system require as little root access as possible, so I will continue to investigate rootless Podman and/or Docker.
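
As a small illustration (the image and bind mount are placeholders, not the actual Wrigel containers), a rootless Podman invocation looks just like the Docker equivalent but runs entirely as the invoking user, with UIDs mapped through a user namespace:

> podman run --rm -it -v $HOME/yocto:/work ubuntu:18.04 /bin/bash
> podman unshare cat /proc/self/uid_map

The second command prints the UID mapping that Podman sets up for the rootless user namespace.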

Breaking down the problem

At its core, Jenkins is a cluster manager and a batch job scheduler. It is also a plugin manager, but that isn’t directly relevant to this discussion. For a long time Jenkins was probably the most common open source cluster manager. It is only recently, with the rise of datacenter-scale computers, that more sophisticated cluster managers have become available. In 2019 the major open source cluster managers are Kubernetes, Nomad, Mesos + Marathon and Docker Swarm. Where Jenkins is designed around batch jobs with an expected end time, the newer cluster managers are designed around the needs of long-lived services. They have support for batch jobs, but it isn’t their primary abstraction. They also have many features that Jenkins does not:

  • Each job specifies its resource requirements, whereas Jenkins only supports label selectors for choosing hosts.
  • Jobs are packed to maximize utilization of the systems, whereas Jenkins by default keeps a job on a single machine and prefers to reuse existing workareas.
  • Each manager supports high availability configurations in the open source version, whereas HA for Jenkins is an Enterprise-only feature.
  • Jobs can specify complex affinities and constraints on where they can run.
  • Each manager integrates with various container runtimes, storage and network plugins. Jenkins has Docker integration but generally doesn’t manage storage or network settings.

So by comparison Jenkins looks like a very limited scheduler, but it does have pipeline support, which none of the other projects has. So I started exploring projects that add pipeline support to these schedulers and found several very new ones, like Argo and Tekton for Kubernetes. There are plugins for Jenkins that allow it to use Kubernetes, Nomad or Mesos, but they can’t really take advantage of all the features.

Cluster manager comparison

Now I will compare the features of the cluster managers that I feel are most relevant to a build cluster setup:

  • How easy is the setup and maintenance?
  • How complicated is the HA setup?
  • Can it be run across multiple datacenters, i.e. Federated?
  • Community and Industry support?

Docker Swarm:

  • Very easy setup
  • Automatic cert creation and rotation
  • Transparent overlay network setup
  • HA is easy to set up
  • No WAN support
  • Docker Inc. is focused on Kubernetes and the future of Swarm is uncertain

Nomad:

  • Install is a simple binary
  • Integration with Consul for HA
  • Encrypted communications
  • No network setup
  • Plugins for job executors, including Docker
  • WAN setup supported by Consul
  • Support for Service, Batch and System jobs
  • Runs at large scale
  • Well supported by HashiCorp and the community
  • Job configuration in JSON or HCL

Mesos + Marathon:

  • Support for Docker and a custom containerizer
  • No network setup by default
  • Runs at large scale at Twitter
  • Commercial support available
  • Complicated installation and setup
  • HA requires a ZooKeeper setup
  • No federation or WAN support
  • Small community

Kubernetes:

  • Very popular with lots of managed options
  • Runs at large scale at many companies
  • Supports build extensions like Tekton and Argo
  • Federation support
  • Lots of support options and great community
  • Complicated setup and configuration
  • Requires setup and management of etcd
  • Requires setup and rotation of certs
  • Requires network overlay setup using one of 10+ network plugins like Flannel

In my experience with Wrigel, Docker Swarm has worked well. It is only its uncertain future that has encouraged me to look at Nomad.

Running Pipelines outside Jenkins

Many years ago I saw a reference to a small tool on GitHub called Walter. The idea is to have a small Go tool that can execute a sequence of tasks as specified in a yaml file. It can execute steps serially or in parallel, and each stage can have an unlimited number of tasks plus some cleanup tasks. Initially it supported only two stages, so I modified it to support unlimited stages. This tool can only handle a single-node pipeline, but that covers a lot of use cases. Now the logic for building the pipeline lives in the code that generates the yaml file rather than inside a Jenkinsfile. Ideally a developer could download the yaml file and the walter binary and recreate the entire build sequence on a local development machine. The temptation is to have the yaml file call shell scripts, but by placing the full commands in the yaml file with proper escaping, each command can be cut and pasted out of the yaml and run in a terminal.

Workflow Support

It turns out that Jenkins Pipelines are an implementation of a much larger concept called Workflow. Scientific computing has been building multi-node cluster workflow engines for a long time, and there is a list of awesome workflow engines on GitHub. I find the concept of directed acyclic graphs of workflow steps, as used by Apache Airflow, very interesting because it matches my mental model of some of our larger build jobs.

With a package like Luigi, the workflow can be encoded as a graph of tasks and executed on a scheduler using “contribs”, which are interfaces to services outside of Luigi. There are contribs for Kubernetes, AWS, ElasticSearch and more.

Conclusion

With a single-node pipeline written in yaml and executed by walter, and a multi-node workflow built in Luigi, the build logic would be independent of the cluster manager and scheduler. A developer could run the workflows on a machine not managed by a cluster manager. The build steps could fairly easily be executed on a cluster managed by Jenkins, Nomad or Kubernetes. Combined with rootless containers, the final solution would be much more secure than current solutions.
