btrfs and 2.6.31

I’d read about quite a few new features in the linux 2.6.31 kernel, so I thought I’d download the official source for 2.6.31 from kernel.org and build a custom kernel on my Debian Lenny 64-bit Core 2 Duo system. That’s the usual make-kpkg malarkey, which takes an eternity.

It took me a while to get 2.6.31 to boot properly. I have an older nvidia card in this system, and the default setup had the kernel loading the nvidiafb module, which does not seem to work with my system (I just get a black or gray screen). Fortunately I could still ssh in from another box. It took me a while to work out how to properly disable nvidiafb, which ended up being something like;

cd /etc
mv modules.conf modules.conf.orig
cd modprobe.d
# Edit the blacklist file and add 'blacklist nvidiafb' to the end.
vi blacklist
# now update the initrd
update-initramfs -u
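
If you'd rather not open an editor, the blacklist step can be scripted. This is only a sketch: add_blacklist is a helper name I made up, and the file is passed in as an argument so you can point it at whichever blacklist file your system uses. It only appends the line if it isn't already there, so running it twice is harmless.

```shell
# Hypothetical helper: append "blacklist <module>" to a file unless it is already present.
add_blacklist() {
    module="$1"
    file="$2"
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    if ! grep -qxF "blacklist ${module}" "${file}" 2>/dev/null; then
        echo "blacklist ${module}" >> "${file}"
    fi
}

# Example (as root), followed by the same initrd rebuild as before:
#   add_blacklist nvidiafb /etc/modprobe.d/blacklist
#   update-initramfs -u
```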

Also, in order to get X windows working I ended up downloading the driver from nvidia.com (I used NVIDIA-Linux-x86_64-185.18.36-pkg2.run).

Once I had it booted (properly) off 2.6.31, I thought I’d have a look at btrfs. It’s a new filesystem for linux which is meant to be a lot like ZFS, but it’s not quite there yet in the stability department. However, I thought I read somewhere that the version included in 2.6.31 was ‘pretty good’, so it was worth a look. What I wanted to try was to have a btrfs root partition.

The first thing was to recompile my kernel again. Btrfs seemed to be turned off by default if you’re doing the make-kpkg stuff. So my setup process was something like;

cd /usr/src
tar xvjf /tmp/linux-2.6.31.tar.bz2
mv linux linux.orig
ln -s linux-2.6.31 linux
cd linux
make-kpkg clean
make menuconfig  # make sure you go to filesystems and enable btrfs as a module
fakeroot make-kpkg --initrd --append-to-version=-custom2 kernel_image kernel_headers
# Wait 10 years
cd ..
dpkg -i linux-headers-2.6.31-custom2_2.6.31-custom2-10.00.Custom_amd64.deb
dpkg -i linux-image-2.6.31-custom2_2.6.31-custom2-10.00.Custom_amd64.deb
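
Since a rebuild takes so long, it's worth checking that menuconfig really saved the btrfs option before kicking off make-kpkg. A rough sketch follows; btrfs_config_state is a made-up helper name, and it assumes the option is called CONFIG_BTRFS_FS in the .config file.

```shell
# Hypothetical helper: report how btrfs is set in a kernel .config file.
# Prints "module", "builtin" or "off".
btrfs_config_state() {
    config="$1"
    if grep -q '^CONFIG_BTRFS_FS=m$' "${config}"; then
        echo "module"
    elif grep -q '^CONFIG_BTRFS_FS=y$' "${config}"; then
        echo "builtin"
    else
        # Covers "# CONFIG_BTRFS_FS is not set" and a missing entry
        echo "off"
    fi
}

# Example:
#   btrfs_config_state /usr/src/linux/.config
```

If it prints "off", go back into make menuconfig before you start the build.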

Now my plan was to make a copy of my root fs into a btrfs partition and boot off that. I have /boot as an ext2 partition, and my normal lenny root fs is on a logical volume as an ext3 fs. I thought I'd take a snapshot of the ext3 rootfs and copy it into a btrfs logical volume and boot off that. But the first thing to do is to get the btrfs user space tools;

git clone git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs-unstable.git
cd btrfs-progs-unstable
make 
make install

Then set up the logical volumes and copy the data (NB: You might have to do a 'modprobe libcrc32c ; modprobe btrfs' first). For reference, I use a volume group called vgz which will undoubtedly be different to whatever system you might have.

mkdir /tmp/src
mkdir /tmp/dest
lvcreate -L 1G -s -n snap1 /dev/vgz/lvlenny64root
mount /dev/vgz/snap1 /tmp/src
lvcreate -L 12G -n lvlenny64btrfsroot /dev/vgz
mkfs.btrfs /dev/vgz/lvlenny64btrfsroot
mount -t btrfs /dev/vgz/lvlenny64btrfsroot /tmp/dest
cd /tmp/src
find . -print | cpio -dumpv /tmp/dest
cd /
# You'll need to edit the fstab to change the root device and filesystem type. For example:
#    /dev/mapper/vgz-lvlenny64btrfsroot /               btrfs    errors=remount-ro 0       1
vi /tmp/dest/etc/fstab
umount /tmp/dest
# Unmount the snapshot before removing it
umount /tmp/src
lvremove /dev/vgz/snap1
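
The fstab edit in the middle of that can also be done without vi. Here is a sketch using awk; rewrite_root_entry is a name I made up, and it only prints the modified fstab so you can look it over before copying it into place (while /tmp/dest is still mounted). It changes the device and filesystem type on the entry whose mount point is / and leaves everything else alone, including the mount options.

```shell
# Hypothetical helper: print an fstab with the root ("/") entry pointed at a new device and fstype.
rewrite_root_entry() {
    fstab="$1"
    newdev="$2"
    newtype="$3"
    awk -v dev="${newdev}" -v fstype="${newtype}" '
        # Pass comments and blank lines through untouched
        /^[[:space:]]*#/ || NF == 0 { print; next }
        # Field 2 is the mount point; only the root entry is changed
        $2 == "/" { $1 = dev; $3 = fstype }
        { print }
    ' "${fstab}"
}

# Example:
#   rewrite_root_entry /tmp/dest/etc/fstab /dev/mapper/vgz-lvlenny64btrfsroot btrfs > /tmp/fstab.new
#   # check /tmp/fstab.new, then: cp /tmp/fstab.new /tmp/dest/etc/fstab
```

Note that awk rebuilds the changed line with single spaces between fields, so the column alignment of that one entry is lost.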

Now we need to convince your initrd to load the btrfs modules (this is debian/ubuntu/whatever specific). I edited /etc/initramfs-tools/modules and added these lines in;

libcrc32c
zlib_deflate
crc32c
btrfs

And then you need to run update-initramfs -u.
At this point I had to do some fiddling with /boot/grub/menu.lst. I just made an extra stanza in the file that looked like;

title           BTRFS Debian GNU/Linux, kernel 2.6.31-custom2
root            (hd1,1)
kernel          /vmlinuz-2.6.31-custom2 root=/dev/mapper/vgz-lvlenny64btrfsroot rootfstype=btrfs ro
initrd          /initrd.img-2.6.31-custom2

Obviously don't just copy this for your own setup, as the root line will be different and possibly some other stuff. Admittedly, I am probably missing something with this setup, as when I go to boot off my btrfs root partition I get kernel boot messages up to the point where it's mounting the root filesystem, then it just sits there for 5 minutes before eventually mounting it OK and continuing on. It's all a bit odd (UPDATE: This long pause is to do with the initramfs fstype and/or vol_id tools not being able to recognise btrfs filesystems. Eventually someone will update those tools, but until then you might want to edit /usr/share/initramfs-tools/scripts/local, hack the get_fstype function and then run update-initramfs -u. I'll put some notes at the bottom of the post).

Anyway, so I got a bootable system with btrfs. The first thing I wanted to try was snapshots, so I duly did something like;

btrfsctl -s snap1 /

When I first did that I thought it would create a snap1 directory under the / (ie. root) of my root fs. It didn't. It actually created a snap1 directory in the directory I was currently in. So within that directory was an intact snapshot of my root filesystem. Of course the first thing you want to do is delete the snapshot ... which in 2.6.31 is a 'future feature'. You can try to rm -rf the snapshot directory ... but I always got some circular file reference errors and it took ages as well. In fact I took a few snapshots as I played around, then left it running overnight. In the morning it told me my filesystem was full and the hard drive light was permanently on. Odd.

Fortunately, the ability to delete snapshots has actually been added to btrfs. You just need an even more recent kernel ... or what I did was to just check out the btrfs kernel tree (I went to here and clicked 'tree' up the top, then 'fs', then 'btrfs' then the 'snapshot' button near the top) and shove it into my 2.6.31 kernel and go through the whole make-kpkg thing again and try booting again (actually I did a few more steps and just deleted my whole btrfs example logical volume and created a new one with no snapshots).

Now with my '2.6.31 and a bit' kernel I can make a snapshot (I have now learned that putting a leading / in front of the snapshot name is probably sensible);

btrfsctl -s /snap1 /

And so if I cd /snap1 I can access my snapshot. But more importantly I can delete it;

btrfsctl -D snap1 /

The arguments to the 'btrfsctl -D' seem to be the snapshot name followed by the directory in which it was made.
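
To keep snapshot names unique, and to make sure I don't forget the leading /, something like the following could wrap the naming. It's only a sketch: snapshot_name is a made-up helper, and the btrfsctl lines in the comments just reuse the -s and -D syntax shown above.

```shell
# Hypothetical helper: build a dated, absolute snapshot name, e.g. /snap-YYYYMMDD-HHMMSS
snapshot_name() {
    prefix="${1:-snap}"
    echo "/${prefix}-$(date +%Y%m%d-%H%M%S)"
}

# Usage, assuming the btrfsctl syntax from this post:
#   name=$(snapshot_name root)
#   btrfsctl -s "${name}" /        # create the snapshot
#   btrfsctl -D "${name#/}" /      # delete it: the name without the leading slash, then the directory
```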

I also found this post on root'ing the net that shows how you can mount the snapshot using the mount command (ie. in addition to the snapshot appearing as a subdirectory);

mount -o subvol=snap1 /dev/vgz/lvlenny64btrfsroot /mnt

Most stuff on btrfs indicates that subvols and snapshots are treated the same. If you create a new snapshot you get a copy of what you snapshotted. Creating a new subvol gives you an empty directory which I assume can be controlled separately, but uses the same pool of disk resources of my main btrfs filesystem. I'm not entirely sure. There is an option to the btrfsctl command which can resize a 'something' but I'm not too sure exactly what it does. I tried resizing down and df doesn't reflect any difference, but doing a btrfs-show did show some change. At the moment there aren't really any commands to list snapshots or show usage. There is a btrfs-debug-tree command that spits out reams of info ... and does mention my snapshots deep within it.

btrfs can apparently do RAID1, RAID0 and RAID10. I like that redundancy is a feature of the filesystem. I've never really understood why LVM2 for linux has no redundancy features (and the mirroring capability it does have is not very useful). I haven't tried any of the btrfs RAID features yet.

So that's all I've looked at so far. One of the key features of ZFS that I'd really like to see in btrfs is 'rollback to snapshot'. This is such an incredibly useful feature in a fast paced IT environment. Given that btrfs is a copy-on-write filesystem, I am hoping they put rollback in at some point.

Honestly I haven't tested it enough to determine how stable it is. It seems fine so far, but this is not a server type machine.

UPDATE: re getting rid of the long pause on boot.
I edited /usr/share/initramfs-tools/scripts/local, and added the extra lines shown below. Be warned, this is an awful 'hack' that assumes that any filesystem type that is not recognised is 'btrfs'. Once you've updated the file you'll need to run update-initramfs -u

get_fstype ()
{
        local FS FSTYPE FSSIZE RET
        FS="${1}"

        # vol_id has a more complete list of file systems,
        # but fstype is more robust
        eval $(fstype "${FS}" 2> /dev/null)
        if [ "$FSTYPE" = "unknown" ] && [ -x /lib/udev/vol_id ]; then
                FSTYPE=$(/lib/udev/vol_id -t "${FS}" 2> /dev/null)
        fi
        RET=$?

        if [ -z "${FSTYPE}" ]; then
                FSTYPE="unknown"
        fi

        # Hack for my btrfs root. Terrible hack
        if [ "${FSTYPE}" = "unknown" ]; then
                FSTYPE=btrfs
                RET=0
        fi
        # End of my hack

        echo "${FSTYPE}"
        return ${RET}
}
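
A slightly less blunt version of the hack would only claim btrfs when the device actually carries the btrfs magic. As far as I know, the primary btrfs superblock sits 64KiB into the device, and the 8-byte magic string "_BHRfS_M" is 64 bytes into that superblock, so it can be read with dd. Treat this as a sketch: is_btrfs is a name I made up, and it is untested inside a real initramfs, where dd and tr come from busybox.

```shell
# Hypothetical helper: succeed only if the device (or file) carries the btrfs magic.
# Assumes the primary superblock at 65536 bytes with the magic 64 bytes into it.
is_btrfs() {
    # tr strips NUL bytes so non-btrfs devices don't upset the shell
    magic=$(dd if="$1" bs=1 skip=65600 count=8 2>/dev/null | tr -d '\000')
    [ "${magic}" = "_BHRfS_M" ]
}

# In get_fstype the hack could then become:
#   if [ "${FSTYPE}" = "unknown" ] && is_btrfs "${FS}"; then
#           FSTYPE=btrfs
#           RET=0
#   fi
```

That way an unrecognised filesystem that isn't btrfs still comes back as 'unknown' rather than being mislabelled.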