
Bring limn hosts back online
Closed, Resolved (Public)

Description

All limn hosts seem to be down, perhaps due to the recent Labs outage?

(from http://lists.wikimedia.org/pipermail/analytics/2013-December/001315.html)


Version: unspecified
Severity: blocker

Details

Reference
bz57822

Event Timeline

bzimport raised the priority of this task to Needs Triage. (Nov 22 2014, 2:35 AM)
bzimport set Reference to bz57822.
bzimport added a subscriber: Unknown Object (MLST).
  • I powered on limn0.pmtpa.wmflabs.

That brought the following domains back online:

debugging.wmflabs.org
dev-reportcard.wmflabs.org
ee-dashboard.wmflabs.org
gerrit-stats.wmflabs.org
gp.wmflabs.org
mobile-reportcard.wmflabs.org
mobile-reportcard-dev.wmflabs.org
test-reportcard.wmflabs.org

I also reran the missed cron jobs for user limn, so gp.wmflabs.org has current data again.

  • reportcard.wmflabs.org is still down.

Could not find information about it on wikitech, DNS points to 208.80.153.208,
which is not listed in the analytics project on labs.
Which project is 208.80.153.208 on?

  • Are there other limn instances that are still down?
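
For reference, the DNS check described above can be reproduced with a few lines of Python. This is only a sketch: `resolve_all` and its `resolver` parameter are names made up for illustration, and the result is whatever the local resolver returns, which says nothing about which Labs project owns the address.

```python
import socket


def resolve_all(hostnames, resolver=socket.gethostbyname):
    """Map each hostname to its IPv4 address, or None if the lookup fails."""
    results = {}
    for name in hostnames:
        try:
            results[name] = resolver(name)
        except OSError:
            # socket.gaierror (name not found, no network) is an OSError subclass.
            results[name] = None
    return results
```

Calling `resolve_all(["reportcard.wmflabs.org"])` returns a dict keyed by hostname. The `resolver` argument exists only so the function can be exercised without network access.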
  • reportcard.wmflabs.org is still down.

Could not find information about it on wikitech,

Where would people expect to find this type of information? I would love to add documentation but I just don't know where.

DNS points to 208.80.153.208,
which is not listed in the analytics project on labs.
Which project is 208.80.153.208 on?

This IP points to the instance called "reportcard" in the "reportcard" project. I've rebooted it and it's serving reportcard.wmflabs.org properly.

  • Are there other limn instances that are still down?

None that I know of, though other people have tried to host their own in the past. It definitely seems like some type of heartbeat monitoring would be nice as a configurable on limn instances.
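
One possible reading of "heartbeat monitoring" is an external check that periodically requests each dashboard's front page and reports the ones that do not answer. The sketch below assumes that reading; it is not an existing Limn feature, and `http_status`, `find_down`, and the `probe` parameter are hypothetical names.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def find_down(hosts, probe=http_status):
    """Return the hosts whose front page did not answer successfully."""
    down = []
    for host in hosts:
        status = probe(f"http://{host}/")
        if status is None or status >= 400:
            down.append(host)
    return down
```

Run from cron with the domains listed earlier in this task (for example `find_down(["gp.wmflabs.org", "reportcard.wmflabs.org"])`), a non-empty result could trigger a mail to the instance owner. The `probe` argument is there so the logic can be tested without touching the network.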

The searches linked below started showing .208 after it was rebooted. I just booted another instance, and now its IP shows up too.

virt10 was booted around 14:00 UTC, and it happens that reportcard, limn0, and the instance I just booted are all on virt10.

(In reply to comment #2, written before reportcard was booted)

Which project is 208.80.153.208 on?

I see a number of instances missing from searches like this:
https://wikitech.wikimedia.org/w/index.php?title=Special:Ask&p=format%3Dbroadtable%2Flink%3Dall%2Fheaders%3Dshow%2Fsearchlabel%3D%E2%80%A6-20further-20results%2Fclass%3Dsortable-20wikitable-20smwtable&po=%3FInstance+Name%0A%3FProject%0A%3FFQDN%0A%3FPublic+IP%0A&eq=no&q=[[Resource+Type%3A%3Ainstance]][[Public+IP%3A%3A208.80.153.202]]

aka http://ur1.ca/g4pof

Substitute a different IP for 208.80.153.202 and you may get something useful, or you may get an empty set.

I can't say for sure what the normal behavior is though.

This should be (I guess) a list of all instances with IPs and those IPs:
https://wikitech.wikimedia.org/w/index.php?title=Special:Ask&q=[[Resource+Type%3A%3Ainstance]][[Public+IP%3A%3A!None]]&p=format%3Dbroadtable%2Flink%3Dall%2Fheaders%3Dshow%2Fsearchlabel%3D%E2%80%A6-20further-20results%2Fclass%3Dsortable-20wikitable-20smwtable&po=%3FInstance+Name%0A%3FProject%0A%3FFQDN%0A%3FPublic+IP%0A%3FInstance+Host%0A&limit=100&eq=no

aka http://ur1.ca/g4pq5
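
To avoid hand-editing the first URL for each IP, the substitution can be scripted. This is a rough sketch that copies the query parameters from the per-IP link above; `ask_url_for_ip` is a made-up helper name, and the percent-encoding it produces differs slightly from the pasted link (brackets and the colon in `Special:Ask` are escaped), which I assume the wiki treats as equivalent.

```python
from urllib.parse import urlencode

ASK_ENDPOINT = "https://wikitech.wikimedia.org/w/index.php"


def ask_url_for_ip(public_ip):
    """Build the Special:Ask search URL for instances with the given public IP."""
    params = {
        "title": "Special:Ask",
        # Display options copied verbatim from the link above.
        "p": "format=broadtable/link=all/headers=show/"
             "searchlabel=\u2026-20further-20results/"
             "class=sortable-20wikitable-20smwtable",
        # Printout columns, one per line.
        "po": "?Instance Name\n?Project\n?FQDN\n?Public IP\n",
        "eq": "no",
        "q": f"[[Resource Type::instance]][[Public IP::{public_ip}]]",
    }
    return ASK_ENDPOINT + "?" + urlencode(params)
```

For example, `ask_url_for_ip("208.80.153.208")` gives the search for the reportcard address discussed in this task. As noted above, an empty result does not necessarily mean the IP is unused.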

(In reply to comment #3)

  • reportcard.wmflabs.org is still down.

Could not find information about it on wikitech,

Where would people expect to find this type of information? I would love to
add documentation but I just don't know where.

https://wikitech.wikimedia.org/wiki/Nova_Resource:Reportcard has an "Edit Documentation" link.

You could create https://wikitech.wikimedia.org/wiki/reportcard.wmflabs.org or https://wikitech.wikimedia.org/wiki/reportcard (link them to each other if there's a reason for both to exist; otherwise redirect one to the other).

You could also make a directory of limn instances at https://wikitech.wikimedia.org/wiki/Limn (but that would just be links to other pages, so e.g. the details about where a specific instance runs would not be on this page).

It definitely seems like some type of heartbeat monitoring would be nice as a configurable on limn instances.

You mean a way to phone home to tell the developers about me? Something a la http://popcon.debian.org/

Are other projects besides MediaWiki currently doing that? Maybe some infrastructure could be shared?

(In reply to comment #4)

(In reply to comment #2, written before reportcard was booted)

Which project is 208.80.153.208 on?

[sample queries]

That's super useful! \o/
Thanks a lot.

[moving tickets as per bug 65903]