Introduction
I manage the git infrastructure for the Linux group at Wind River: the
main git server and 5 regional mirrors that are kept in sync with
grokmirror. I plan to do a post about our grokmirror setup. The main
git server holds over 500GB of bare git repos, and over 600 of them
are mirrored; many are not. Some repos are internal, some are mirrors
of external upstream repos, and some are mirrors of upstream repos
with internal branches. The git server runs CentOS 5.10 and git 1.8.2
from EPEL.
One of the largest repos contains the source for the toolchain and
all the binaries. Since the toolchain takes a long time to build, it
was decided that Wind River Linux should ship pre-compiled binaries
for the toolchain. There is also an option which allows our customers
to rebuild the toolchain if they have a reason to.
The bare toolchain repo varies between 1 and 3GB in size, depending on
the supported architectures. Many of the files in the repo were
tarballs around 250MB in size.
Why is the git server down again?
When a new toolchain is ready for integration, it is uploaded to the
main git server and mirrored. Then the main tree is switched to enable
the new version of the toolchain and all the coverage builders start
to download the new version. Suddenly the git servers would become
unresponsive and thrash under memory pressure until they inevitably
had to be rebooted. Sometimes I would have to disable the coverage
builders and stage their reactivation to prevent a thundering herd
from knocking the git server over again.
Why does cloning a repo require so much memory?
I finally decided to investigate this and quickly found a reproducer:
cloning a 2.9GB bare repo would consume over 7GB of RAM before the
clone was complete. The graph of used memory was spectacular. I
started reading the git config man page and asking Google various
questions.
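For anyone curious, a rough reproducer sketch (the host and repo names
are placeholders) is to clone over the git protocol and watch the
server-side pack-objects process, which is where the memory goes:
# On a client, clone over the git protocol so the server has to repack:
git clone git://gitserver.example.com/toolchain.git /tmp/toolchain-test
# Meanwhile on the server, watch the memory (RSS in KB) of pack-objects:
watch -n 2 'ps -eo pid,rss,cmd | grep [p]ack-objects'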
I tried setting the binary attribute on various file types, but
nothing changed. See man gitattributes for more information; the
default set seems to be fine.
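For reference, setting the binary attribute looked roughly like this
(the patterns are illustrative for our tarballs). Note that binary
only covers diff, merge and text conversion, not delta compression,
which is consistent with it not changing the memory behaviour.
# Mark the tarballs as binary (equivalent to -diff -merge -text):
echo '*.tar.gz  binary' >> .gitattributes
echo '*.tar.bz2 binary' >> .gitattributes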
I tried various git config options like core.packedGitWindowSize,
core.packedGitLimit and core.compression, as recommended in many blog
posts, but the memory spike stayed the same.
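For reference, those knobs are set the same way; the values below are
purely illustrative, not a recommendation:
# Tuning options suggested in various blog posts; none helped here.
git config --global core.packedGitWindowSize 32m
git config --global core.packedGitLimit 256m
git config --global core.compression 1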
core.bigFileThreshold
From the git config man page:
Files larger than this size are stored deflated, without attempting delta compression.
Storing large files without delta compression avoids excessive memory usage, at the slight
expense of increased disk usage.
Default is 512 MiB on all platforms. This should be reasonable for most projects as source
code and other text files can still be delta compressed, but larger binary media files
won’t be.
The 512MB number is key. The reason the git server was using so much
memory is that it was doing delta compression on the binary
tarballs. This didn't make the files any smaller, because they were
already compressed, but it did consume a lot of memory. I tried one
command:
git config --global --add core.bigFileThreshold 1
And suddenly (no git daemon restart necessary) the clone took a
fraction of the time and the memory spike was gone. The only downside
was that the repo required more disk space: about 4.5GB. I then tried:
git config --global --add core.bigFileThreshold 100k
This resulted in approximately 10% more disk space (3.3GB) and no
memory spike when cloning.
This setting seems very reasonable to me. The chance of having a text
file larger than 100KB is very low, and the only downside is slightly
higher disk usage; git is already very efficient at storing source
text.
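A quick sanity check in a working tree shows how few files a 100k
threshold would actually catch (plain find, nothing git specific):
# Count files larger than 100KB, ignoring the .git directory:
find . -path ./.git -prune -o -type f -size +100k -print | wc -l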
UPDATE: This setting can cause disk space issues on Linux kernel
repos. See the update here.
Hardware setup
I am managing an R710 Dell server with 6 2TB disks. The RAID
controller does not support JBOD mode, so I had to create 6 RAID0
virtual disks with one disk per group. The disks are then passed
through to Linux as /dev/sda to /dev/sdf. I am running 6 Xen vms and
each vm gets a dedicated disk. The vms are coverage builders and not
mission critical, so there is no point in adding redundancy. I have a
nice Cobbler/Foreman setup that makes provisioning very quick.
OpenManage and check_openmanage
I am running the Dell OpenManage software on this system. In fact, I
am running it on all my hardware, using the puppet/dell module
graciously shared on GitHub. The OpenManage package does many things,
including CLI query access to all the hardware.
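For example, querying the hardware from the shell looks like this
(output omitted):
# System overview, then all physical disks on the first RAID controller:
omreport system summary
omreport storage pdisk controller=0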
Then I stumbled across check_openmanage, a Nagios check which queries
all the hardware and notifies Nagios if there are any problems. I had
already used the Puppet integration with Nagios to set up a bunch of
checks for ntp, disk and some other services. To make things even
easier, check_openmanage is packaged in EPEL and Debian, so it did
not take much time to add it to the existing checks.
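Running the plugin by hand is a good smoke test before wiring it into
Nagios; the path below is where the EPEL package installs it on a
64-bit CentOS box (adjust as needed):
# Query all monitored components and print a Nagios-style status line:
/usr/lib64/nagios/plugins/check_openmanage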
Predicted Failure
Once everything was set up, I started getting warned about many
things I was not aware of, like out-of-date firmware and hard drives
with predicted failures. The output of check_openmanage looks like
this:
WARNING: Physical Disk 1:0:4 [Seagate ST32000444SS, 2.0TB] on ctrl 0 is Online, Failure Predicted
A reasonably painless call to Dell and a replacement disk is shipped.
Disk replacement
When a disk fails it has a really nice blinking yellow light. To keep
things clean, I wanted to shut down and delete the correct vm before
changing the disk. But how do I figure out which vm that is?
> omreport storage pdisk controller=0 pdisk=1:0:4
Physical Disk 1:0:4 on Controller PERC 6/i Integrated (Embedded)
Controller PERC 6/i Integrated (Embedded)
ID : 1:0:4
Status : Non-Critical
Name : Physical Disk 1:0:4
State : Online
Failure Predicted : Yes
> omreport storage pdisk controller=0 vdisk=5
List of Physical Disks belonging to Virtual Disk 5
Controller PERC 6/i Integrated (Embedded)
ID : 1:0:4
Status : Non-Critical
Name : Physical Disk 1:0:4
Okay, that identifies the correct physical disk and the associated
virtual disk.
> omreport storage vdisk controller=0 vdisk=5
Virtual Disk 5 on Controller PERC 6/i Integrated (Embedded)
ID : 5
Status : Ok
Name : Virtual Disk 5
State : Ready
Device Name : /dev/sdf
Now I know that this physical disk maps to the device /dev/sdf, so I
initiated a shutdown of the vm that uses that disk.
The disk with predicted failure has a flashing amber light which makes
it easy to figure out which one to swap.
Once the swap is complete, run the following command to recreate the
vdisk:
omconfig storage controller controller=0 action=createvdisk raid=r0 size=max pdisk=1:0:4
And /dev/sdf is available once again.
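To double check that everything came back, I can ask the controller
and the kernel (nothing here is specific to this setup):
# The controller should list the recreated virtual disk...
omreport storage vdisk controller=0
# ...and Linux should see the block device again:
ls -l /dev/sdf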
OpenStack Grizzly 3-node cluster installation
There is a lot of infrastructure that I leveraged to do this
installation:
- Local Ubuntu mirror
- Debian Preseed files to automate installation
- Dell iDRAC and faking netboot using virtual CDROM
- Puppet master with git branch to environment mapping
- Git subtrees to integrate OpenStack puppet modules
- An example hiera data file to handle configuration
Local Ubuntu mirror
Having a local mirror makes installations much simpler because
packages download very quickly. The ideal setup uses netboot, because
the mirror already contains the kernel, initrd and packages needed to
do the installation. I used:
ubuntu/dists/precise/main/installer-amd64/current/images/netboot/ubuntu-installer/amd64/linux
ubuntu/dists/precise/main/installer-amd64/current/images/netboot/ubuntu-installer/amd64/initrd.gz
To create the mirror I used the ubumirror scripts provided by
Canonical.
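Pointing the installer at the local mirror is then just a couple of
preseed lines (the hostname and path are placeholders for my internal
mirror):
d-i mirror/country string manual
d-i mirror/http/hostname string mirror.example.com
d-i mirror/http/directory string /ubuntu
d-i mirror/http/proxy string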
Debian Preseed
I already have some experience using Debian preseed files to automate
installation of Ubuntu and Debian; the documentation is spread out
all over the Internet. Most of the preseed just sets the local mirror
and the network setup. The OpenStack-related options were the disk
layout and adding the Ubuntu Cloud Archive.
OpenStack compute node disk layout
The machines I am using were purchased before I even knew OpenStack
existed. They were used for Wind River Linux coverage builds, and the
simplest configuration uses 2 900GB SAS drives in RAID0. The builds
require a lot of disk space, and building on SSD or in memory
provided only a small speedup relative to the increase in cost.
My idea was to use LVM and allow cinder to use the remaining space to
create volumes for the vms. Here are the relevant preseed options to
handle the disk layout:
d-i partman-auto/method string lvm
d-i partman-auto/purge_lvm_from_device boolean true
d-i partman-auto-lvm/new_vg_name string cinder-volumes
d-i partman-auto-lvm/guided_size string 500GB
d-i partman-auto/choose_recipe select atomic
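After the install finishes, a quick check with the standard LVM tools
confirms the volume group exists and has free space left for cinder:
# The guided_size above leaves the rest of the VG unallocated:
vgs cinder-volumes
lvs cinder-volumes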
There are 3 kinds of storage in OpenStack: instance/ephemeral, block
and object.
- Object storage is handled by swift and is not part of this
  installation.
- Block storage is done by default using iscsi and LVM logical
  volumes. Cinder looks for an LVM volume group called cinder-volumes
  and creates logical volumes there.
- Instance/ephemeral storage by default goes into /var on the root
  filesystem. This is why I made the root filesystem 500GB. But this
  does not allow live migration because the root filesystem is not
  shared. If the vm was booted using block storage then the iscsi
  driver can handle the migration of vms. Another option is to mount
  /var on a shared nfs drive.
Ubuntu Cloud Archive
I added the Ubuntu Cloud Archive and Puppetlabs apt repos in the
preseed to prevent older versions of packages from being installed.
d-i apt-setup/local0/repository string \
http://apt.puppetlabs.com/ precise main dependencies
d-i apt-setup/local0/comment string Puppetlabs
d-i apt-setup/local0/key string http://apt.puppetlabs.com/pubkey.gpg
d-i apt-setup/local1/repository string \
http://ubuntu-cloud.archive.canonical.com/ubuntu precise-updates/grizzly main
d-i apt-setup/local1/comment string Ubuntu Cloud Archive
d-i apt-setup/local1/key string \
http://ubuntu-cloud.archive.canonical.com/ubuntu/dists/precise-updates/grizzly/Release.gpg
tasksel tasksel/first multiselect ubuntu-server
d-i pkgsel/include string openssh-server ntp ruby libopenssl-ruby \
vim-nox mcollective rubygems git puppet mcollective facter \
ruby-stomp puppetlabs-release ubuntu-cloud-keyring
Dell iDRAC and faking netboot using virtual CDROM
Unfortunately I do not have DHCP, PXE and TFTP in this subnet to do
netboot provisioning. I am working on this with our IT department. So
for now I have to fake it.
I grab the mini.iso from the Ubuntu mirror:
ubuntu/dists/precise/main/installer-amd64/current/images/netboot/mini.iso
This contains the netboot kernel and initrd. I can then log into the
Dell iDRAC and start the remote console for the server. Using Virtual
Media redirection, I connect the mini.iso and boot the server. Press
F11 to get the boot menu and select Virtual CDROM.
But using this directly means I have to type everything into a tiny
console window, so I modified the isolinux.cfg to change the kernel
params and load the preseed automatically.
Mount mini.iso locally and copy the contents to the hard drive:
sudo mkdir -p /mnt/ubuntu
sudo mount -o loop mini.iso /mnt/ubuntu/
cp -r /mnt/ubuntu/ .
chmod -R +w ubuntu
Here are the contents of the isolinux.cfg after editing:
default preseed
prompt 0
timeout 0
label preseed
kernel linux
append vga=788 initrd=initrd.gz locale=en_US auto \
url=<server>/my.preseed priority=critical interface=eth0 \
console-setup/ask_detect=false console-setup/layout=us --
Then make a new iso:
mkisofs -o ubuntu-precise.iso -b isolinux.bin -c boot.cat \
-no-emul-boot -boot-load-size 4 -boot-info-table -R -J -v -T ubuntu/
Then the process is almost completely automated, except that the
server cannot download the preseed until the networking is
configured. This info can be added to the kernel params (as shown
below), but then I would have to edit a separate iso for each
server. With Red Hat kickstarts I was able to add a script that
mapped MAC addresses to IPs and completely automate this, but with
preseeds I need to manually enter the network info. The proper
solution is a provisioner like Cobbler or Foreman.
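For reference, the static network settings mentioned above can be
passed on the kernel command line with the debian-installer netcfg
keys, roughly like this (addresses and hostname are placeholders, and
this is exactly what forces a separate iso per server):
append vga=788 initrd=initrd.gz locale=en_US auto \
url=<server>/my.preseed priority=critical interface=eth0 \
netcfg/disable_dhcp=true netcfg/get_ipaddress=10.0.0.21 \
netcfg/get_netmask=255.255.255.0 netcfg/get_gateway=10.0.0.1 \
netcfg/get_nameservers=10.0.0.1 netcfg/get_hostname=compute01 \
netcfg/get_domain=example.com \
console-setup/ask_detect=false console-setup/layout=us --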
Puppet master with git branch to environment mapping
I have set up my puppet masters based on the post by Puppetlabs.
I like this setup a lot: all development happens on my desktop and I
have a consistent, version controlled collection of all modules
available to my systems. I am also using it to give some colleagues
who are learning puppet a nice environment that won’t mess up my
systems.
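The core of that setup, as I understand the Puppetlabs post, is a
puppet.conf on the master that maps $environment (which comes from
the git branch name) to a per-environment module path, roughly:
[master]
    # Each git branch becomes a directory under /etc/puppet/environments
    # and $environment selects which one a node sees.
    modulepath = /etc/puppet/environments/$environment/modules
    manifest   = /etc/puppet/environments/$environment/manifests/site.pp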
But I have some custom in-house modules and I want to put the
OpenStack puppet modules in the same git branch beside them. The
existing tools like puppet module and puppet librarian do not work
for this use case. I want to be able to use git for these external
repos and to easily share any patches I make with upstream. Enter git
subtree.
Git subtrees to integrate OpenStack puppet modules
Git subtree is part of the git package contrib files. Enabling it on
my system was simple:
cd ~/bin
cp /usr/share/doc/git/contrib/subtree/git-subtree.sh .
chmod +x git-subtree.sh
mv git-subtree.sh git-subtree
Now I can go to my modules directory and add the OpenStack puppet
modules:
for arg in cinder glance horizon keystone nova; do \
git subtree add --prefix=modules/$arg \
--squash https://github.com/stackforge/puppet-$arg stable/grizzly;\
done
There are some more supporting modules like inifile, rabbitmq, apt,
vcs, etc. Look in openstack/Puppetfile for the full list.
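Keeping the modules up to date and sharing patches back upstream is
just more subtree operations; a sketch, with the fork URL and branch
name as placeholders:
# Pull upstream stable/grizzly changes for one module into the subtree:
git subtree pull --prefix=modules/nova --squash \
https://github.com/stackforge/puppet-nova stable/grizzly
# Push the commits that touch modules/nova to a fork to send upstream:
git subtree push --prefix=modules/nova \
git@github.com:myfork/puppet-nova.git nova-fixes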
Next I needed to enable the modules on my machines. First, the hiera
data needs to be added for the network config. I was inspired by
Chris Hodge’s video and hiera data. The gist has some minor issues,
so I posted a revised version.
The last piece is to enable the modules on the nodes:
node 'controller' {
  include openstack::repo
  include openstack::controller
  include openstack::auth_file
  class { 'rabbitmq::repo::apt':
    before => Class['rabbitmq::server']
  }
}
node 'compute' {
  include openstack::repo
  include openstack::compute
}
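With the node definitions in place, a couple of puppet runs on each
box pull everything in (standard agent invocation; the environment
depends on which git branch the node is pinned to):
puppet agent --test --environment production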
Conclusion
Most of this infrastructure already existed or I had already built it
in the past. I was able to reimage 3 machines and have a working
Grizzly installation in about 3 hours.
Many thanks to all the people who have contributed to Debian, Ubuntu,
Puppet and the OpenStack puppet modules.