Instances cannot boot / reboot anymore
Closed, Resolved
Public

Description

This appeared this afternoon (Europe time) on an instance I rebooted (and then deleted) and on the fresh instance i-0000034b:

cloud-init running: Wed, 18 Jul 2012 13:47:07 +0000. up 5.13 seconds
waiting for metadata service at http://169.254.169.254/2009-04-04/meta-data/instance-id

13:47:10 [ 1/100]: url error [timed out]
13:47:13 [ 2/100]: url error [timed out]
13:47:16 [ 3/100]: url error [timed out]
13:47:19 [ 4/100]: url error [timed out]

So we cannot reboot existing instances or access any new instance.
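
For anyone who wants to check the metadata service by hand, here is a rough sketch of the kind of wait loop the log above suggests. This is not cloud-init's actual code. The function names (`fetch_url`, `wait_for_metadata`) and the `timeout` and `pause` values are assumptions for illustration; only the URL and the limit of 100 tries come from the log.

```python
import time
import urllib.request

# URL taken from the cloud-init log in the description.
METADATA_URL = "http://169.254.169.254/2009-04-04/meta-data/instance-id"


def fetch_url(url, timeout):
    """Fetch url and return the body as text; raises OSError on failure."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def wait_for_metadata(url=METADATA_URL, max_tries=100, timeout=2.0,
                      pause=1.0, fetch=fetch_url, sleep=time.sleep):
    """Poll url until it answers.

    Returns the response body, or None if every try fails.
    timeout and pause are assumed values, not cloud-init's settings.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return fetch(url, timeout)
        except OSError as exc:
            # urllib.error.URLError and socket timeouts are OSError subclasses.
            print("[%3d/%d]: url error [%s]" % (attempt, max_tries, exc))
            if attempt < max_tries:
                sleep(pause)
    return None
```

Calling `wait_for_metadata()` from inside an affected instance should either return the instance ID or print one error line per failed try, similar to the log above.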


Version: unspecified
Severity: critical

Details

Reference
bz38473

Event Timeline

bzimport raised the priority of this task to Medium.
Nov 22 2014, 12:47 AM
bzimport set Reference to bz38473.

Reverting the subject back. This happens on instance boot, so it affects both existing and newly created instances.

Mail by Ryan Lane on labs-l:

OpenStack Nova does some inefficient queries for searching for
instances, especially when pulling metadata information. When
instances are created, we inject what's called "userdata". That
userdata is used by a service in Ubuntu called cloud-init. cloud-init
pulls this userdata, along with the instance's metadata, and does the
initial bootstrapping of the system.

For us, cloud-init, using the userdata, installs puppet, points it at
our puppet server and does a full puppet run. This process is
currently failing, as cloud-init thinks the metadata server is timing
out. Apparently other deployments of OpenStack are having performance
issues with metadata that are due to the same inefficient query.

There are two things we'll be doing about this:

  1. We're working on a fix with the other OpenStack devs
  2. We'll purge deleted instances from the database. Nova keeps them
     for auditing purposes, but they are unneeded. This will bring the
     query time down to a level that is below the cloud-init threshold.
We're also waiting for another organization to push their solution for
this upstream, rather than mucking around in the database ourselves.

Sorry for the inconvenience,

- Ryan

Per above, Ryan is working on it with OpenStack devs.
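
To illustrate the purge mentioned in the mail, here is a minimal sketch against an in-memory SQLite database. The `instances` table, its columns, and the `purge_deleted` helper are simplified assumptions for illustration only. This is not Nova's real schema and not the procedure that was run on the Labs database.

```python
import sqlite3


def purge_deleted(conn):
    """Remove rows flagged as deleted; return how many were removed."""
    cur = conn.execute("DELETE FROM instances WHERE deleted = 1")
    conn.commit()
    return cur.rowcount


# Hypothetical, simplified table: real deployments have many more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instances ("
    " id INTEGER PRIMARY KEY,"
    " hostname TEXT NOT NULL,"
    " deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO instances (hostname, deleted) VALUES (?, ?)",
    [("old-a", 1), ("old-b", 1), ("old-c", 1), ("live-a", 0)],
)

removed = purge_deleted(conn)
remaining = conn.execute("SELECT COUNT(*) FROM instances").fetchone()[0]
print("removed %d rows, %d remaining" % (removed, remaining))
```

With fewer rows left in the table, a lookup that scans it has less work to do, which is the effect the mail is after.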