Introduction

Glossary

LVM terminology can be a little confusing but it's actually very straightforward.

VG Volume Group: A collection of PVs used to allocate space for one or more LVs.
PV Physical Volume: A storage device (usually a physical hard disk but may be a software raid device) that provides PEs for a VG.
LV Logical Volume: A block device that can contain a file system. Each LV is associated with a single VG, and each VG contains one or more PVs.
PE Physical Extent: Chunks of data on PVs, usually 4MB. These are the same size as the LEs for the VG.
LE Logical Extent: Chunks of data on LVs. The size of each LE is the same for all LVs of a VG.

If you want to mount a volume, you need to allocate an LV to hold the data. But first you need a PV to provide the collection of PEs that will contain the LV. And once you create the PV, you need to create a VG to contain it.

An Example

The PEs on the VG named Vig are 4 MB in size. Vig resides on 2 partitions, /dev/hda1 and /dev/hdb1. One PV resides on hda and contains 99 PEs, the other PV resides on hdb and contains 275 PEs. That gives us 374 available PEs, or around 1.5 GB of space.

In this case, the system administrator decides to create a striped (raid0) LV using the 99 PEs from PV1 and 99 PEs from PV2, producing a fast 792 MB volume. She calls it "Stripey." She also creates a 704 MB LV from Vig's remaining 176 PEs and calls it "Remainder." She creates filesystems on /dev/mapper/Vig-Stripey and /dev/mapper/Vig-Remainder and mounts them just like she would any other block device.
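
Something like the following would produce that layout (just a sketch based on the numbers above; the stripe size is left at the default):

 lvcreate -i2 -l198 -nStripey Vig      (198 LEs striped across both PVs: 99 PEs from each)
 lvcreate -l176 -nRemainder Vig        (the 176 PEs that are left over)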

todo: link to examples of resizing and retargeting arrays.

LE Layout, Striping

You can choose how LEs are mapped onto PEs.

  • Linear -- first PV is filled, then second PV, and so on. This is nice when it comes time to shrink the volume.
  • Striped -- interleave all LEs over all the PVs. This might offer better performance because I/O can be spread over multiple spindles. Adding more PVs can muck this up, of course: you might end up with the first set of LEs striped across 3 disks, then add another disk whose LEs are laid out linearly, then add two more disks and stripe across those. LVM handles this beautifully, but it can get confusing when you're trying to figure out exactly where an LV resides (see the sketch after this list).
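
One way to untangle it (a sketch using the Vig example above; the -o field names may vary a bit between LVM versions):

 lvs --segments -o lv_name,segtype,stripes,devices Vig
 lvdisplay -m /dev/Vig/Stripey    (lists each LE range and the PV extents behind it)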

LV nodes are created in the /dev/mapper directory and named "VGname-LVname".

Commands

lvs: brief summary of known LVs.
pvs: brief summary of known PVs.
vgs: brief summary of known VGs.
vgchange: modifies whether a VG is available or resizable.
vgcreate: creates a new VG.
vgdisplay: displays information about all known VGs (-s for short output).
vgexport: makes inactive VGs unknown to the system. You can then move the PVs associated with the VGs to a different system.
vgextend: adds PVs to a VG.
vgimport: makes a previously exported VG known to the system again.
vgmerge: allows you to merge inactive VGs into another VG.
vgreduce: removes PVs from a VG. Specify -a to remove all unused PVs.
vgremove: deletes a VG. First remove all LVs using lvremove.
vgrename: renames an existing VG.
vgsplit: creates a VG and moves LVs from an existing VG into the new one.
lvchange: changes some attributes of a particular LV (available, contiguous, read-only).
lvcreate: creates a new LV in a VG.
lvdisplay: prints information about an LV.
lvextend: grows an LV by adding LEs. You can optionally specify the PV that should supply the space.
lvreduce: shrinks an LV. LEs are always removed from the end of the LV.
lvremove: deletes an LV.
lvrename: renames an LV.
lvresize: resizes an LV (same as lvextend/lvreduce).
pvchange: can prevent PVs from allowing allocation of PEs (in preparation for taking them offline, perhaps).
pvcreate: initializes PVs. You then add the PVs to VGs using vgextend.
pvdisplay: displays attributes of PVs. Add -m to show mapping of PEs to LEs.
pvmove: allows you to move PEs from one PV to another. You can restrict the move to the PEs belonging to a specific LV. If no destination is supplied, the PEs are moved to free space on other PVs in the VG (see the sketch after this list).
pvremove: wipes the label on a device so it will no longer be recognized as a PV.
pvresize: resizes a physical volume.
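
For example, retiring a disk might go roughly like this (a sketch; MyVG and /dev/sdc1 stand in for your own VG and the outgoing PV):

 pvchange -x n /dev/sdc1    (disallow new PE allocations on the outgoing PV)
 pvmove /dev/sdc1           (migrate its PEs onto free space elsewhere in the VG)
 vgreduce MyVG /dev/sdc1    (drop the now-empty PV from the VG)
 pvremove /dev/sdc1         (wipe the LVM label from the device)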

Some handy lvdisplay invocations:

  • Show information on all known LVs (mounted or not): lvdisplay
  • Show resources consumed by all LVs in a VG: lvdisplay VGname
  • Show resources consumed by a single LV: lvdisplay node (where node is e.g. /dev/Hydra/HydraSwap)
  • Show which PEs a particular LV occupies: lvdisplay -m

lvscan: scans all VGs and attached disks for LVs.
vgscan: scans all attached disks for VGs. Runs automatically at boot. You might want to run it after hot-plugging more storage devices.
pvscan: scans all attached disks for PVs.

?? Do these scans also "mount" the items, or do they just print their results?

vgconvert: converts an LVM1 VG to LVM2 format.
vgmknodes: maintains the special files in /dev. Normally udev will cause a node to appear in /dev/mapper the instant you create an LV so there should be no need to ever call this.
vgck: checks VG metadata for consistency.
vgcfgbackup: backs up the metadata for all VGs into /etc/lvm/backup.

Examples

Starting LVM from Scratch
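
A minimal sketch that builds the Hydra VG used in the next section; /dev/sda5 and /dev/sdb5 are placeholders for whatever partitions you actually have:

 pvcreate /dev/sda5 /dev/sdb5        (label the partitions as PVs)
 vgcreate Hydra /dev/sda5 /dev/sdb5  (collect them into a VG named Hydra)
 vgdisplay -s Hydra                  (verify the VG and see how much space is free)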

Creating, Shrinking, and Growing LVs

This assumes that you've already set up the VG and the PVs. You just need to allocate an LV and mount it on your machine. We will create an 8GB LV named Rinspin in the Hydra VG.

 lvcreate --size 8G --name Rinspin Hydra
 mkfs.ext3 /dev/mapper/Hydra-Rinspin
 mount /dev/mapper/Hydra-Rinspin /vservers/rinspin

To shrink the volume:

 umount /vservers/rinspin
 e2fsck -f /dev/mapper/Hydra-Rinspin    (resize2fs insists on a clean check before shrinking)
 resize2fs -p /dev/mapper/Hydra-Rinspin 4G
 lvreduce --size 4G Hydra/Rinspin
 mount /dev/mapper/Hydra-Rinspin /vservers/rinspin

To grow the volume: in this case, all our PEs are on a single PV, so we don't care where LVM allocates them from. If we did care, we would name one or more PVs after the LV name on the lvextend command line. To obtain more unused PEs, you can use vgextend to add more PVs to the VG, or you can use lvreduce to shrink another LV and then allocate its newly-freed PEs to this one. We don't specify a size for resize2fs because it defaults to growing the filesystem to fill the (now larger) LV.

 umount /vservers/rinspin
 lvextend --size +4G Hydra/Rinspin
 resize2fs -p /dev/mapper/Hydra-Rinspin
 mount /dev/mapper/Hydra-Rinspin /vservers/rinspin

Other Features

Snapshots

You can snapshot an LV to freeze it in time. Once you're done with the snapshot, you delete it, and the resources it occupied are returned to the VG.

Snapshots are especially useful when taking backups. You simply run a script to quiesce your databases, take a snapshot, and start everything running again. Downtime should be on the order of a second or two. After that, you can take as long as you want to copy the snapshot to your backup medium.

Snapshots are read-write by default, which opens up some pretty amazing rollback and branching possibilities.

Create snapshots using lvcreate -s. You can then mount the snapshot as you would any other volume.

?? how exactly
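
Roughly like this, I believe (a sketch; Hydra/Rinspin comes from the example above, and the 1G copy-on-write size and mount point are arbitrary):

 lvcreate -s -L1G -nRinspinSnap /dev/Hydra/Rinspin   (1G of copy-on-write space for changes)
 mkdir /mnt/snap
 mount /dev/mapper/Hydra-RinspinSnap /mnt/snap
 (copy /mnt/snap to your backup medium at your leisure)
 umount /mnt/snap
 lvremove Hydra/RinspinSnap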

Redundancy, Using Software Raid

Even though you can stripe and mirror LVs directly, I don't recommend it. The Linux MD tools have been tested thoroughly and include strong error reporting and recovery; LVM has not been tested as thoroughly, has weak error reporting, and I have no idea how you'd go about recovering a failed LVM array. If you want redundancy, I recommend first assembling an MD array or arrays, then layering LVM on top. It's easy: tell mdadm what physical devices to use, and then tell pvcreate to use the resulting /dev/mdXX devices.
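
A sketch of that layering (md2, the partitions, and the VG name "safe" are all made up for the example):

 mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
 pvcreate /dev/md2
 vgcreate safe /dev/md2    (from here on it's ordinary LVM: lvcreate, mkfs, mount)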

LVM-on-MD Corner Case

Well, I did run into one obscure corner case when running LVM-on-MD.

Let's say I have a 106 GB ext3 filesystem on a 106 GB LV (vg="hmd", lv="vservers") that is itself running on an MD raid1 array, /dev/md1, over two 106 GB physical partitions, hda2 and hdc2. I want to shrink the raid1 md array to 100 GB (I want to use the freed-up space for striping).

Warning! This didn't work. Make sure to back your data up before you attempt this.

  • umount /vservers -- unmount the filesystem
  • resize2fs -p /dev/mapper/hmd-vservers 100G
  • pvresize --setphysicalvolumesize 100G /dev/md1 (use "pvs" to verify sizes)
  • mdadm --grow --size=104857600 /dev/md1

Here I wish I could just use parted to resize the two partitions. Alas, parted for some reason needs to recognize the filesystem before resizing. ext3 over LVM over MD? Good luck, parted! And, strangely, parted doesn't have a force option. So we must delete the old partition and then create a new, smaller partition using fdisk or cfdisk. This only changes the partition table, not the data in the partition, so as long as the start position doesn't change, the on-disk data should be fine.

The new partitions should start exactly where the old ones did, be 100 GB in size, and be of type 0xFD (Linux raid autodetect).

  • cfdisk /dev/hda -- edit partitions
  • cfdisk /dev/hdc -- edit partitions
  • shutdown -r now -- ensure the new partition table takes hold

No! It failed! MD didn't reassemble md1, so LVM scanned the disks and found its data directly on hda2 and hdc2. I had to wipe the PVs and start over. I'm not sure how to solve this... somehow prevent LVM from coming back up until you've rebooted and set up the new MD partition?

Lovely, now I get to rebuild my array...

 mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/hd[ac]2
 pvcreate /dev/md1
 vgcreate mdvg /dev/md1
 lvcreate -L2G -nswap mdvg
 mkswap /dev/mapper/mdvg-swap
 swapon /dev/mapper/mdvg-swap
 lvcreate -L40G -nvservers mdvg    (leaves ~58G free for future use)
 mke2fs -j /dev/mapper/mdvg-vservers
 mount /dev/mapper/mdvg-vservers /vservers

Well, I'm back to where I was. Where did I go wrong?

Make sure everything looks good in "pvs" and "mdadm --query /dev/md1". Also handy: cat /proc/mdstat and mdadm --query --detail /dev/md1.

And here is how to set up the LVM raid0 array:

 (why did I choose raid via lvm instead of raid via mdadm?  no good reason, other than
  lvm is much easier to extend if need be)
 pvcreate /dev/hda4
 pvcreate /dev/hdc3
 pvs -- (shows your new pvs and the devices by which lvm knows them)
 vgcreate vg2 /dev/evms/hda4 /dev/evms/hdc3
 pvs -o name,pe_count
 lvcreate -i2 -I4 -l954 -nspeedy vg2 -- for -l, pass the sum of the extents listed by pvs.  -I4 specifies 4K stripes, to match the ext2 block size.
 mke2fs -j /dev/mapper/vg2-speedy
 mkdir /home/bronson/speedy
 mount /dev/mapper/vg2-speedy /home/bronson/speedy