LVM2 – KernelCrash

May 10, 2007

When I first started out working with UNIX, it was with HP-UX systems almost 15 years ago. HP-UX 9 came standard with a logical volume manager (LVM). Some years later I started looking at Linux, and its standard partitioning scheme seemed archaic. That was a long, long time ago. I had always hoped that Linux would adopt an LVM as the standard disk management system. LVM2 is the main Linux logical volume manager. It’s been around for a few years now, but it’s only quite recently that it has become a standard installation option in some distros. I’ve looked at Red Hat Enterprise Linux 5 (CentOS 5) and Debian Etch lately, and both offer LVM as a standard option.

So what’s so good about it?

Like lots of computing ideas, LVM is about ‘creating a layer of abstraction to help solve a problem’. I like to think of it like this: “When the OS tries to read or write a sector on disk, instead of just working out the offset of that sector from the start of a standard partition, it takes an extra step, looks up a new location in a lookup table, and uses that instead”. Sure, there’s an extra step, but the lookup table is most likely in memory (so it’s quick), and the table doesn’t have an entry for every sector … it actually works in bigger chunks of data called ‘extents’. It also works across disks. So instead of limiting your /home filesystem to one disk, the data can be spread across multiple disks yet appear to be one big filesystem.
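To make the lookup-table idea a bit more concrete, here is a toy sketch in Python. It is only an illustration of the concept, not how the kernel actually implements it; the disk names, the mapping and the 4 MiB extent size are all made up for the example.

```python
# Toy model of extent-based address translation (illustrative only).
EXTENT_SIZE = 4 * 1024 * 1024  # bytes per extent; 4 MiB is an assumed value

# Hypothetical mapping for one logical volume:
# logical extent number -> (disk name, physical extent number on that disk)
EXTENT_MAP = {
    0: ("sda", 10),
    1: ("sda", 11),
    2: ("sdb", 0),  # the volume carries on onto a second disk
    3: ("sdb", 1),
}


def resolve(logical_offset, extent_map=EXTENT_MAP, extent_size=EXTENT_SIZE):
    """Translate a byte offset in the logical volume to (disk, byte offset on disk)."""
    # One table lookup per extent, not per sector.
    logical_extent, offset_in_extent = divmod(logical_offset, extent_size)
    disk, physical_extent = extent_map[logical_extent]
    return disk, physical_extent * extent_size + offset_in_extent


print(resolve(0))                      # first byte of the volume lives on sda
print(resolve(2 * EXTENT_SIZE + 512))  # 512 bytes into the third extent lives on sdb
```

The table only needs one entry per extent, which is why it stays small enough to keep in memory even for big volumes.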

You also get the ability to take snapshots of filesystems. They start out as a ‘point-in-time’ image of a filesystem. This is great for backups. A typical scenario is a cold backup of a database: it needs to be shut down, but you want the outage to be short. With snapshots, you 1) shut down the database, 2) take a snapshot, 3) start up the database, 4) mount the snapshot and take a few hours to back it up. Steps 1 to 3 can happen within a few minutes, but you can take as long as you like backing the data up.
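The four steps above can be sketched as a list of commands. The Python below only builds and prints the commands (it does not run anything). The volume group, logical volume, mount point, backup path and the database stop/start commands are all hypothetical placeholders; lvcreate, mount, tar, umount and lvremove are the standard tools, but check the sizes and options against your own setup.

```python
import shlex


def snapshot_backup_plan(vg, lv, snap_size="1G", mount_point="/mnt/snap",
                         stop_db="service mydb stop",
                         start_db="service mydb start"):
    """Return the commands for a short-outage cold backup, in order.

    stop_db/start_db and the /backup path are placeholders for illustration.
    """
    snap = f"{lv}_snap"
    return [
        shlex.split(stop_db),                                   # 1) shut down the database
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap, f"/dev/{vg}/{lv}"],                    # 2) take the snapshot
        shlex.split(start_db),                                  # 3) start the database again
        ["mount", "-o", "ro", f"/dev/{vg}/{snap}", mount_point],  # 4) mount the snapshot...
        ["tar", "-czf", f"/backup/{lv}.tar.gz", "-C", mount_point, "."],  # ...and back it up
        ["umount", mount_point],                                # clean up afterwards
        ["lvremove", "-f", f"/dev/{vg}/{snap}"],
    ]


for cmd in snapshot_backup_plan("vg0", "dbdata"):
    print(shlex.join(cmd))
```

The snapshot size only has to be big enough to hold whatever changes on the original volume while the backup is running, not the whole filesystem.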

You can move underlying data around. Why is that good? Let’s say you group together 3 disks and you think that one of those disks is about to die. You buy another disk and add it to the group. Now you can tell LVM2 to move data off the disk you think is bad and shove it onto the new disk. You can do this with all your filesystems still mounted and everything running. It does it all in the background, taking its time and actively slowing down when disk IO from OS processes needs priority. It’s very clever. The way it moves data is to first create a mirror of the disk you want to remove. This means you can lose power during the process, and the system should just revert to using the old disk (i.e. you shouldn’t end up with a stuffed system).
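As a rough sketch of the disk-swap scenario, the Python below prints the usual sequence of LVM2 commands (pvcreate, vgextend, pvmove, vgreduce) without running them. The volume group name and device paths are made-up examples.

```python
import shlex


def replace_disk_plan(vg, failing_pv, new_pv):
    """Return the commands to migrate data off failing_pv onto new_pv, in order."""
    return [
        ["pvcreate", new_pv],             # prepare the new disk for LVM
        ["vgextend", vg, new_pv],         # add it to the volume group
        ["pvmove", failing_pv, new_pv],   # move extents off the suspect disk (online)
        ["vgreduce", vg, failing_pv],     # drop the now-empty disk from the group
    ]


# Hypothetical names: volume group "vg0", suspect disk sdb1, new disk sdd1.
for cmd in replace_disk_plan("vg0", "/dev/sdb1", "/dev/sdd1"):
    print(shlex.join(cmd))
```

Once vgreduce has finished, the old disk is no longer part of the volume group and can be pulled at your leisure.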

You can now mirror data (this is relatively new for LVM2). Commercial LVMs like Veritas Volume Manager have always had the ability to create redundancy (i.e. mirrors etc.), but LVM2 seemed to hesitate going down this path, with some argument about redundancy not being the role of an LVM. You’ll find lots of howtos on the net about setting up LVM2 on top of md devices (the Linux md driver gives you RAID0, RAID1 and RAID5 at a disk partition level). It’s a bit complicated but it does work. Anyone from a Veritas background will probably think it’s silly. But now you can at least mirror using LVM2 … which makes mirroring the root disk a very simple option.

So what’s bad about it?

For many Linux users it’s a new and difficult-to-understand concept (oh no, not another bloody abstraction layer). Many people think it’ll be much slower … but it’s not. And some might be worried about recovery. I’ve been running an “LVM2 on top of md RAID5” server setup for almost a year now with no issues. As a precaution, I back up my LVM2 config to a regular partition, but I’ve never had to use it. The CentOS and Debian install CDs I’ve tried all seem to autodetect the existing LVM structure on my disks, apparently by scanning at boot.

For most Linux distros you’ll still need /boot mounted on a regular partition. This is mainly because boot loaders like GRUB and LILO are unable to read the kernel off LVM.

Terminology

Volume group – a group of disks (or partitions) pooled together; each member is called a physical volume

Logical volume – like a partition on a disk, except it’s carved out of a volume group, so it can span multiple disks

Extents – LVM manages data in units of extents. The LVM2 default is 4MB, though with large modern disks you may prefer larger extents (e.g. 2GB).

Snapshots – artificial logical volumes created as ‘point-in-time’ snapshots of existing logical volumes. In LVM2 you can read and write to them. They’re very clever since they don’t use that much disk space. In effect, LVM2 keeps track of changes to both the original logical volume and the snapshot one, only using disk space for the ‘changes’.
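Here is a toy Python model of why a snapshot only costs space for the ‘changes’. It is a simplified copy-on-write sketch with invented class names, not LVM2’s actual on-disk format: the snapshot stores a block only when the original is about to be overwritten, or when the snapshot itself is written to.

```python
class Volume:
    """The original logical volume: a plain list of blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Copy-on-write: before overwriting, save the old block into every
        # snapshot that does not already hold its own copy of this block.
        for snap in self.snapshots:
            snap.saved.setdefault(index, self.blocks[index])
        self.blocks[index] = data


class Snapshot:
    """A writable point-in-time view that stores only blocks that differ."""

    def __init__(self, origin):
        self.origin = origin
        self.saved = {}  # block index -> data, only for changed blocks

    def read(self, index):
        # Unchanged blocks are read straight from the original volume.
        return self.saved.get(index, self.origin.blocks[index])

    def write(self, index, data):
        self.saved[index] = data


vol = Volume(["a", "b", "c", "d"])
snap = vol.snapshot()
vol.write(1, "B")    # the original changes; the snapshot keeps the old "b"
snap.write(3, "D2")  # the snapshot changes; the original is untouched
print(vol.blocks)
print([snap.read(i) for i in range(4)])
print(len(snap.saved), "blocks stored by the snapshot")
```

In this example the snapshot ends up holding two blocks out of four; the other two are still shared with the original volume.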
