
Slowness in accessing English Wikipedia
Closed, Resolved, Public

Details

Reference
bz29034

Event Timeline

bzimport raised the priority of this task to High. Nov 21 2014, 11:35 PM
bzimport set Reference to bz29034.
bzimport added a subscriber: Unknown Object (MLST).

Created attachment 8553
Traceroute outputs

It's not as bad today as it was last night, but I'm still getting 30s page load times on pages that were loading in 2-3s a week or two ago.

Here are a couple of tracert outputs for you, first for en.wikipedia.org, then for bits.wikimedia.org.


slimvirgin wrote:

It took me 15 minutes to make one edit today. First, I couldn't get the page to open, then preview wouldn't work, then when I tried to save I got numerous error messages.

This started for me on May 16 when I couldn't get into Wikipedia at all for about an hour, then some pages opened, but my watchlist wouldn't. Then when it did open, the page loaded with bits at the top missing.

Early today was slightly faster, then a few hours ago it went back to being very slow again.

This could have a million and one different causes.

  • Servers could've had a cache problem
  • Could be related to geolocation and nearest datacenter
  • Problems at your provider
  • Anything on your computer
  • Anything in between.

Marking INVALID.

See also "[Wikitech-l] Large number of users unable to access WMF wikis" from earlier this month.[1]

Krinkle

[1] http://lists.wikimedia.org/pipermail/wikitech-l/2011-May/

The issue is real, and affecting many people.

Ignoring it is not nice towards the many users who are experiencing it.

monkeykokoko wrote:

This issue is affecting people in all parts of the US, in the UK, Australia, and Hong Kong, and those are only the places I know about who have spoken out at the Village Pump. I don't think it has to do with geolocation. It has been happening for at least a week. I've taken a break from editing Wikipedia because it is just too frustrating at this point.

bidgee-wiki wrote:

I'm having the very same issues (in Australia). At times you only get the background image loaded, or just half the page. I'm not having the issues on Commons, just Wikipedia. Just getting a page fully loaded sometimes takes five or more refreshes. Frustrating is an understatement, and this has been occurring for the last four to five days.

[Also posting to Bugzilla]

According to the ops team, there are a number of separate and unrelated ops issues that have come up in the last few days:

  1. Not all users are experiencing slowness, but a subset of users are. There's no definite smoking gun, but the most likely cause is ongoing issues with one of our routers in Tampa. The router will have to be taken down for maintenance to fix this issue, and in order to perform this maintenance operation with minimal disruption, we need to have key ops engineers on standby to deal with any issues that may arise. My understanding is that the best available maintenance window is Tuesday next week.
  2. There was a software deployment on May 18 which caused an application server overload; it was reverted the same day.
  3. The mobile servers are currently intermittently overloaded, throwing internal server errors; servers to provide additional capacity have been racked today.
  4. In case you're looking at it, ganglia.wikimedia.org is not displaying correct server status information (as of yesterday); it's in the process of being fixed.

We're still in the process of setting up a new primary data center location in Ashburn, VA, which will give us higher site reliability in general, and also create the possibility of safe failover in maintenance or emergency situations.

ebe123_wiki wrote:

This is the cause that I found: Wikipedia's bandwidth is getting closer and closer to its maximum. As a result, page loads are taking more time and edits are even harder.

ebe123_wiki wrote:

(In reply to comment #5)

This issue is affecting people in all parts of the US, in the UK, Australia,
and Hong Kong, and those are only the places I know about who have spoken out
at the Village Pump. I don't think it has to do with geolocation. It has been
happening for at least a week. I've taken a break from editing Wikipedia
because it is just too frustrating at this point.

Hey! Globalize! Canada is affected too.

[[wikitech:Planned Maintenance 24May2011]]: «The team performed the operation successfully - the upgrades and reboots on those network gears went smoothly and the servers and sites came back up on scheduled. We believe we have also addressed the lag in the page load caused by the network routing problem.»
And nobody is complaining any longer.