Choosing XFS
I manage a cluster of builder machines, and all of the builders use the
ext4 filesystem. To load the machines effectively, the builds are
heavily parallelized, and a RAID0 striped setup keeps IO from becoming
a bottleneck. When RedHat 7 was released with xfs as the default
filesystem, I realized xfs could be a real alternative to ext4:
RedHat would not have made that change if xfs weren't a fast and solid
filesystem. I recently got some new hardware and started an experiment.
Default RAID settings
The system has six 4TB disks, from which I created two RAID0 volumes of
three disks each, for a total of 12TB per drive. The machine has a
battery-backed RAID controller, and each drive had the controller
defaults: a stripe size of 64KB, write back, adaptive read ahead, disk
cache enabled, and a few more.
Creating the xfs drives
Once the machine was provisioned, I started reading about xfs
filesystem creation options and mount options. There were several
points of confusion:
- Some web pages referred to a crc option which validates metadata. This sounds like a good idea, but it is not available with the xfsprogs version on Ubuntu 14.04.
- I didn’t realize at first that the inode64 option is a mount option and not a filesystem creation option.
Since the disks are using hardware RAID, which is not generally
detectable by the mkfs program, the geometry needs to be specified when
creating the filesystem.
parted -s /dev/sdb mklabel gpt
parted -s /dev/sdb mkpart build1 xfs 1M 100%
mkfs.xfs -d su=64k,sw=3 /dev/sdb1
These commands create the partition and tell xfs the stripe size and
number of stripes.
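To double-check that mkfs.xfs recorded the geometry, xfs_info can be run against the new filesystem; with the default 4KB block size, su=64k,sw=3 should show up as a stripe unit of 16 blocks and a stripe width of 48 blocks:
xfs_info /dev/sdb1
# the data section should report sunit=16 blks, swidth=48 blks
# (stripe unit and width expressed in 4KB filesystem blocks)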
XFS mount options
It was clear that inode64 would be useful because the disks are large
and the metadata gets spread out over the drive. The more interesting
option was barrier. There is an entry in the XFS Wiki FAQ about this
situation: if the storage is battery backed, then the barrier is not
necessary. Ideally the disk write cache is also disabled to prevent
data loss if the machine loses power. So I went back to the RAID
controller settings, disabled the disk cache on all the drives, and
then added nobarrier,inode64,defaults to the mount options for the
drives.
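The resulting fstab entry looks roughly like this (the /build1 mount point is a placeholder for the real one, and the second volume gets a matching line):
/dev/sdb1  /build1  xfs  defaults,nobarrier,inode64  0  0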
Conclusion
The experiment has started. The first build on the machine was very
fast, but the contribution of the filesystem is hard to determine. If
there are any interesting developments I will post updates.
In git 2.0, a new feature called bitmaps was added. The git changelog
describes it as follows:
The bitmap-index feature from JGit has been ported, which should
significantly improve performance when serving objects from a
repository that uses it.
One of my colleagues told me that he had experimented with it and
noticed some impressive speedups, which I was able to reproduce. On the
local GigE network, a Linux kernel clone went from approximately 3
minutes to 1.5 minutes, cutting the time almost in half!
The instructions seemed very simple: just log into the git server and
repack every bare repo with bitmaps enabled. The first hurdle was
upgrading to a newer version
of git. Our git servers are running CentOS 5, CentOS 6 and Ubuntu
14.04. The EPEL version of git is 1.8 and 14.04 ships with 1.9.1.
For Ubuntu 14.04 the solution was to use the Launchpad Git Stable PPA.
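For reference, adding the PPA and upgrading looks like this (I believe the archive in question is ppa:git-core/ppa):
sudo add-apt-repository ppa:git-core/ppa
sudo apt-get update
sudo apt-get install git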
For CentOS it was a little trickier. Since I hate distributing binaries
directly, I decided to backport the latest Fedora git srpm. Getting it
to build required a few hacks with bash completion and installing a few
dependencies, but it took less than 30 minutes to get rpms for both
CentOS 5 and 6.
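The rebuild itself was roughly the following; the srpm file name is illustrative, not the exact version I used:
# install build tooling and the srpm's build dependencies
yum install rpm-build yum-utils
yum-builddep git-2.3.5-1.fc21.src.rpm
# rebuild; the resulting rpms land under ~/rpmbuild/RPMS/
rpmbuild --rebuild git-2.3.5-1.fc21.src.rpm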
The upgrade of git on the servers went very smoothly because they use
xinetd to run git-daemon, so the very next connection to the server
after the upgrade was served by the newly installed git 2.3.5 binary.
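For context, the xinetd service definition looks roughly like this (the /srv/git base path is a placeholder for the real repo location). Because xinetd spawns a fresh git process for every connection, an upgraded binary takes effect immediately without restarting a long-running daemon:
# /etc/xinetd.d/git -- sketch of a git-daemon service definition
service git
{
        disable         = no
        socket_type     = stream
        wait            = no
        user            = nobody
        server          = /usr/bin/git
        server_args     = daemon --inetd --export-all --base-path=/srv/git
        log_on_failure  += USERID
}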
There were of course a few hiccups. An internal tool that used git
request-pull was relying on one of the working “heuristics” (see
changelog) that were removed.
The next step was to repack all the bare repos on the server, so I
wrote a script to run git repack -A -b and left it running overnight.
Recovering from this over the next few days would require me to become
very familiar with the git man pages.
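The script was essentially a loop like the following (the /srv/git repo root is a placeholder); note that this is the flawed first attempt, missing -d, as described next:
# first attempt: add bitmaps to every bare repo (missing -d, see below)
for repo in /srv/git/*.git; do
    git -C "$repo" repack -A -b
done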
The first problem was that the git server ran out of disk space. It
turns out I needed to add the -d flag in order to delete the previous
pack files; I had effectively doubled the disk space requirements of
every repo!
It also turns out that -A keeps dangling objects around instead of
discarding them. So I reran my script with:
git gc --aggressive
git repack -a -d -b
This helped a lot, but repos that were using alternates were still
taking a lot more space than before, because repack was making one big
packfile of all the objects and effectively ignoring the alternates
file. This is documented in the git clone man page.
So I went to all the repos with alternates and repacked them with the
-l flag, which only repacks objects that are not available via the
alternates. With some extra cleanup, this resulted in even less disk
space usage than before. Unfortunately it also means that a repo with
alternates cannot have a bitmap.
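For those repos the invocation looked roughly like this; the exact flags are reconstructed from the description above, so treat them as an assumption:
cd /srv/git/child.git            # hypothetical repo that uses alternates
cat objects/info/alternates      # points at the shared object store
git repack -a -d -l              # -l (--local) skips objects reachable via alternates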
On one server many repos still did not contain the bitmap file. After
much experimentation I finally figured out that the pack.packSizeLimit
option had been set to 500M on that server only. This meant that repos
larger than 500M ended up with multiple pack files, and since the
bitmap requires a single pack file, no bitmap was created. The lack of
a warning extended the debugging time considerably.
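Checking for and removing the setting is straightforward. This sketch assumes the limit lives in the system-wide config; it could just as well be set per repo:
git config --system pack.packSizeLimit          # prints the limit if it is set
git config --system --unset pack.packSizeLimit
git repack -a -d -b                             # repack into a single pack with a bitmap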
Finally, one of my servers had an old mirror of the upstream Linux
kernel repo, and even after git gc --aggressive the repo was 1.5GB,
which is over 500MB larger than a new clone. So I started experimenting
with the other repack flags, including -F. The result was that the repo
ballooned to over 4GB and I couldn’t find a way to reduce the size.
Even cloning the repo to another machine resulted in a 1.5GB transfer.
In the end, I did a fresh clone and swapped the objects/pack
directories.
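The swap itself went roughly like this (paths are placeholders, and I kept the old pack directory around until git fsck came back clean):
# fresh clone into a scratch area, then swap its pack directory in
git clone --mirror git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git /tmp/fresh.git
cd /srv/git/linux-stable.git/objects
mv pack pack.old
cp -a /tmp/fresh.git/objects/pack .
git -C /srv/git/linux-stable.git fsck    # verify before deleting pack.old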
I was able to reproduce the behavior with a fresh clone as well:
git clone --bare git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
cd linux-stable
git repack -a -d -F
In summary:
- To create bitmaps without increasing disk space usage, use the gc and repack sequence described above (see the consolidated sketch after this list).
- I was not able to use git repack -F in a way that did not quadruple the size of the Linux kernel repo. It even caused clones of the repo to be larger as well.
- Git should have a warning if bitmaps are requested but cannot be created due to the packSizeLimit restriction. I plan to file a bug or make a patch.
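As a consolidated sketch of the sequence that ended up working for me (pulled together from the steps above, so treat it as a summary rather than a verified recipe):
git gc --aggressive
git repack -a -d -b     # single pack with a bitmap; -d removes the old packs
# for repos that use alternates, repack locally instead (no bitmap is possible):
git repack -a -d -l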
A long time ago I filed Docker issue 2891 regarding the performance of
the aufs backend versus devicemapper. The quick summary is that the
aufs backend was approximately 30% slower even though the build was
being done in a bind mount outside of the container.
I finally got around to checking again, using Docker 1.5 on Ubuntu
14.04 with the 3.16 Utopic LTS enablement kernel.
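The comparison was the same build run three ways: directly on the host, and inside a container on each storage backend with the build area bind-mounted from the host. The container setup was roughly the following (image name and host path are illustrative):
# confirm which storage driver the daemon is using
docker info | grep 'Storage Driver'
# run the build container with the build area bind-mounted from the host
docker run -it -v /builds:/builds ubuntu:14.04 /bin/bash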
The current stable poky release is dizzy:
cd <buildarea>
mkdir downloads
chmod 777 downloads
git clone --branch dizzy git://git.yoctoproject.org/poky
source poky/oe-init-build-env mybuild
ln -s ../downloads .
bitbake -c fetchall core-image-minimal
time bitbake core-image-minimal
There is no need to set the parallel package and job counts in
local.conf any more, because bitbake now chooses reasonable defaults.
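For reference, these are the local.conf variables that older releases needed set by hand (the values are just an example for a 16-core machine):
# conf/local.conf -- no longer required on dizzy; defaults come from the CPU count
BB_NUMBER_THREADS = "16"
PARALLEL_MAKE = "-j 16"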
Bare Metal:
real 29m59.190s
user 278m0.988s
sys 59m47.379s
Devicemapper:
real 32m21.074s
user 281m53.994s
sys 68m45.554s
AUFS:
real 37m14.612s
user 259m19.226s
sys 85m50.269s
I only ran each build once, so this is not an authoritative benchmark,
but it does show a performance overhead of roughly 20-25% for the aufs
backend relative to bare metal, even when the IO is done on a bind
mount.