
role::labs::lvm::mnt ends up with make-instance-vg: failed to create new partition
Closed, Resolved, Public

Description

On the beta cluster we use role::labs::lvm::mnt to allocate the instance's disk space to /mnt.

I created a new Trusty instance, deployment-cxserver02.eqiad.wmflabs, whose puppet run fails:

Notice: /Stage[main]/Labs_lvm/Package[lvm2]/ensure: ensure changed 'purged' to 'present'
Notice: /Stage[main]/Labs_lvm/File[/usr/local/sbin/make-instance-vg]/ensure: defined content as '{md5}d427b6b327e1e4f7f5c8c436eb5667a6'
Notice: /Stage[main]/Labs_lvm/Exec[create-volume-group]/returns: Error: Can't create any more partitions.
/usr/local/sbin/make-instance-vg: failed to create new partition
Error: /usr/local/sbin/make-instance-vg '/dev/vda' returned 1 instead of one of [0]
Error: /Stage[main]/Labs_lvm/Exec[create-volume-group]/returns: change from notrun to 0 failed: /usr/local/sbin/make-instance-vg '/dev/vda' returned 1 instead of one of [0]
Notice: /Stage[main]/Labs_lvm/File[/usr/local/sbin/make-instance-vol]/ensure: defined content as '{md5}24f46d3c5b16cbf0cca687b84a81b3cd'
Notice: /Stage[main]/Role::Labs::Lvm::Mnt/Labs_lvm::Volume[second-local-disk]/Exec[create-vd-second-local-disk]: Dependency Exec[create-volume-group] has failures: true
Warning: /Stage[main]/Role::Labs::Lvm::Mnt/Labs_lvm::Volume[second-local-disk]/Exec[create-vd-second-local-disk]: Skipping because of failed dependencies

According to parted, /dev/vda (42.9GB) still has 28.6GB of free space:

/sbin/parted /dev/vda print free
Model: Virtio Block Device (virtblk)
Disk /dev/vda: 42.9GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number  Start   End     Size    Type     File system     Flags
 1      32.3kB  8191MB  8191MB  primary  ext4
        8191MB  8191MB  476kB            Free Space
 2      8191MB  10.2GB  2048MB  primary  ext4
 3      10.2GB  12.3GB  2048MB  primary  ext4
 4      12.3GB  14.3GB  2048MB  primary  linux-swap(v1)
        14.3GB  42.9GB  28.6GB           Free Space

Could it be an issue in make-instance-vg? It invokes:

/sbin/parted /dev/vda print free | fgrep 'Free Space' | tail -n 1 | sed -e 's/  */ /g' | cut -d ' ' -f 2,3

Which returns:

14.3GB 42.9GB

And thus invokes:

/sbin/parted -s /dev/vda mkpart primary 14.3GB 42.9GB
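For reference, the extraction step quoted above can be replayed offline against the captured partition table, without touching a real disk. This is only a sketch under a few assumptions: `last_free_span` and `sample_parted_output` are hypothetical names, not part of make-instance-vg; the column spacing of the sample rows is reconstructed; and the sed expression is assumed to have two spaces before the `*`, so that it collapses runs of spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of make-instance-vg): reads the output of
# `parted <disk> print free` on stdin and prints the start and end of the
# last "Free Space" row, mirroring the pipeline quoted in this report.
last_free_span() {
    grep -F 'Free Space' | tail -n 1 | sed -e 's/  */ /g' | cut -d ' ' -f 2,3
}

# Sample rows taken from the parted output in this report.
# Assumption: the exact column spacing is reconstructed.
sample_parted_output() {
    cat <<'EOF'
Number  Start   End     Size    Type     File system     Flags
 1      32.3kB  8191MB  8191MB  primary  ext4
        8191MB  8191MB  476kB            Free Space
 2      8191MB  10.2GB  2048MB  primary  ext4
 3      10.2GB  12.3GB  2048MB  primary  ext4
 4      12.3GB  14.3GB  2048MB  primary  linux-swap(v1)
        14.3GB  42.9GB  28.6GB           Free Space
EOF
}

# Prints the span that would be handed to `parted mkpart`.
sample_parted_output | last_free_span
```

The span itself is computed correctly (14.3GB to 42.9GB), so the free-space detection does not look like the culprit. Note that the `cut` fields only line up because free-space rows start with whitespace, which leaves field 1 empty.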

The bug prevents us from finalizing the creation of deployment-cxserver02.eqiad.wmflabs, which is the content translation server on the beta cluster (bug 71783).


Version: unspecified
Severity: critical

Details

Reference
bz71873

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 3:59 AM
bzimport added a project: Cloud-VPS.
bzimport set Reference to bz71873.
*** Bug 71874 has been marked as a duplicate of this bug. ***

The same thing has been happening since last week on integration-slave1009. I never got to fully set up that new instance because its puppet run failed from the very beginning.

The new partitioning recipe uses all four possible primary partitions, leaving none for LVM to take. I'll fix the image generation today, but instances created with that recipe cannot use LVM without manual tweaking.
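The limit described here (an msdos partition table holds at most four primary partitions) can be checked before attempting mkpart. A rough sketch, assuming the `parted print` column layout shown in this report; `count_primary`, `can_add_primary` and `sample_partition_table` are hypothetical names and are not part of make-instance-vg:

```shell
#!/bin/sh
# Hypothetical helper: counts rows whose Type column is "primary" in
# `parted <disk> print` output read from stdin. Free-space rows are skipped
# because their first field is a size, not a partition number.
count_primary() {
    awk '$1 ~ /^[0-9]+$/ && $5 == "primary" { n++ } END { print n + 0 }'
}

# Succeeds only if another primary partition would still fit
# (msdos labels allow at most four).
can_add_primary() {
    [ "$(count_primary)" -lt 4 ]
}

# Sample rows taken from the parted output in this report.
# Assumption: the exact column spacing is reconstructed.
sample_partition_table() {
    cat <<'EOF'
Partition Table: msdos

Number  Start   End     Size    Type     File system     Flags
 1      32.3kB  8191MB  8191MB  primary  ext4
        8191MB  8191MB  476kB            Free Space
 2      8191MB  10.2GB  2048MB  primary  ext4
 3      10.2GB  12.3GB  2048MB  primary  ext4
 4      12.3GB  14.3GB  2048MB  primary  linux-swap(v1)
        14.3GB  42.9GB  28.6GB           Free Space
EOF
}

if sample_partition_table | can_add_primary; then
    echo "room for another primary partition"
else
    echo "all four primary slots are in use"
fi
```

On the table from this report all four slots are taken, which matches parted's "Can't create any more partitions" error despite the 28.6GB of free space.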

Dammit, that new image is giving me no end of trouble. Marc, you're going to set up the new image to use lvm for all its partitions?

For the record, I initially thought this was only affecting Trusty instances, but it's affecting Precise images as well. integration-slave1004 (newly created, Precise) is also unable to complete its puppet run and fails with the same error.

gerritadmin wrote:

Change 166139 had a related patch set uploaded by coren:
Labs: Make images create LVM at firstboot.sh

https://gerrit.wikimedia.org/r/166139

gerritadmin wrote:

Change 166139 merged by Andrew Bogott:
Labs: Make images create LVM at firstboot.sh

https://gerrit.wikimedia.org/r/166139

This should now be resolved for new instances. Note that the new partition scheme will make the /mnt partition slightly smaller, as /var/log now has its own 2GB partition.