
"Page does not exist" on some pages on MW 1.18; possible empty page_latest bug?
Closed, Resolved (Public)

Description

screenshot of "there is no page" for page that should exist

I was browsing MediaWiki.org today and got a "There is currently no text in this page. You can search for this page title in other pages, search the related logs, or edit this page." message for a page that definitely existed and had content (you could see it in the history view).

I purged it, and the page's content re-appeared. I then hit special:random a bunch of times to see if this was a one-time issue or if more than one page was affected. Within about 10 special:random hits, I got another "missing" page (screenshot attached), which would seem to indicate that the problem is affecting quite a few pages...

I see ES mentioned a lot in the sidebar bug (bug 31100) so possibly related(?)


Version: 1.18.x
Severity: normal

Attached:

mw-page-no-exist.png (800×1 px, 243 KB)

Details

Reference
bz31179

Event Timeline

bzimport raised the priority of this task to Unbreak Now! (Nov 21 2014, 11:53 PM)
bzimport set Reference to bz31179.
bzimport added a subscriber: Unknown Object (MLST).

Sidebar bug 31179 (went ahead and split it out) is possibly related to External Storage intermittent read failures, though we have not determined that for certain.

I don't think the same thing would cause this though; it looks like the [[MediaWiki:noarticletext]] message only comes out for non-existent pages (no page table record); if you had an ES fail grabbing the text it would still think the page exists, and would come back showing you empty page contents.

Actually.... it would display that way.

Down in Article::view() there's a check for empty content on pass 2 (after checking the parser cache) which looks for false (bogus/errored) text and then calls Article::showMissingArticle() which whips that message back out.

So this could be plausible. :D
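
Roughly, the control flow described above looks like the sketch below. It is a hypothetical simplification, not the actual Article::view() source: fetchText(), viewPage() and the returned strings are invented stand-ins, and only the check for a false (errored) text mirrors the comment.

```php
<?php
// Hypothetical stand-in for a text fetch that returns false when
// External Storage cannot be read.
function fetchText( $storageOk ) {
    return $storageOk ? 'Page content' : false;
}

// Simplified view path: a false (errored) text falls through to the same
// "no text in this page" output that is used for pages that do not exist.
function viewPage( $pageExists, $storageOk ) {
    if ( !$pageExists ) {
        return 'noarticletext';
    }
    $text = fetchText( $storageOk );
    if ( $text === false ) {
        return 'noarticletext'; // existing page, but shown as missing
    }
    return $text;
}
```

Under these assumptions, viewPage( true, false ) gives the same result as viewPage( false, true ), which is why a storage read failure and a genuinely missing page would be indistinguishable to the reader.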

*** Bug 31288 has been marked as a duplicate of this bug. ***

These pages come with page_latest=0. This seems to be the same issue reported on wikitech-l as breaking pywikipediabot. Looks like a 1.18 regression.

Could be; sounds like an editing/saving bug...?

I can't seem to find any rows in the DB with page_latest = 0 for MW.org.

*** Bug 31312 has been marked as a duplicate of this bug. ***

The page [[mw:Extension:Firefox_toolbar/fr/UI]] originally cited in this bug report doesn't appear to have been edited since 2007 and doesn't _appear_ to have page_latest=0 (at least from what I can see outside).

It might not be the same problem as the ones that are reporting complete breakage, or it might have been 'incomplete' along the way.

Data points on nl.wikipedia.org:

[[nl:Ben_Tiggelaar]] -- apparently had page_latest=0, has been fixed manually?

[[nl:User_talk:RM21/Overleg]] - page record with no live revisions (partial deletion in 2007?). Was deleted/restored a couple times in the past.

[[nl:Blankenbach]] - apparently has/had page_latest=0 but lots of other activity. Has been deleted and undeleted twice in last few days.

on simple.wikipedia.org:

[[simple:Deal_or_No_Deal_UK]] - a redirect stub created 27 september, no other history

Recording pairs of page_ids and titles here, so we can look at them later. These are all page_latest=0 and page_is_new=1.

ptwiki:
2212847 201.86.189.213
2212848 189.4.181.187
2212849 189.10.252.194
2213433 201.35.181.75
all these are namespace 3, all with timestamps 20090415... or 20090416...

ruwiki:
1936451 Гинько,_Елена_Валерьевна
2475302 Kalashist

eswiki:
2343377 190.134.174.70
2355993 200.64.55.191
2438562 200.50.8.74
3079560 186.9.18.135
all namespace 3

enwiki:
19113987 1r3gr37n0n
19399178 Vd437
20513897 Heelo1
21622712 88.107.34.0
22427099 98.108.121.19
22427147 72.227.225.75
22427162 68.59.212.62
22427179 146.186.59.74
22427194 HOTPOCKETSG
22427196 70.181.94.68
22427202 Jaeh0317
22441341 24.78.158.115
23391967 24.143.15.213
23398229 96.250.7.85
23400278 173.79.110.150
23994255 68.91.91.22
24090446 Mightym53821
All namespace 3 as well. 8 of these have the timestamp 20090415 or 20090416. The rest are scattered about, including one from 20010902.

(In reply to comment #9)

> Data points on nl.wikipedia.org:
>
> [[nl:Ben_Tiggelaar]] -- apparently had page_latest=0, has been fixed manually?
>
> [[nl:User_talk:RM21/Overleg]] - page record with no live revisions (partial deletion in 2007?). Was deleted/restored a couple times in the past.
>
> [[nl:Blankenbach]] - apparently has/had page_latest=0 but lots of other activity. Has been deleted and undeleted twice in last few days.

Yeah, I fixed the first one.

Here is a problematic scenario:

Situation: page restored when no live page already exists at that title

  1. SpecialUndelete::undeleteRevisions() is called.

  2. SpecialUndelete::undeleteRevisions() does:

     $newid = $article->insertOn( $dbw );

     A new page row is inserted with page_latest=0, page_len=0, page_is_new=1.
     This also sets mTitle in the WikiPage to have the correct ID.

  3. SpecialUndelete::undeleteRevisions() does:

     $oldcountable = $article->isCountable();

  4. isCountable() calls isRedirect(), which causes loadPageData() to be done on a slave.
     The data isn't there yet on the slave, so the page is loaded as not existing.
     loadPageData() does:

     $this->mTitle->loadFromRow( false );

     ...which overwrites the article ID held by mTitle in the WikiPage with 0.

  5. SpecialUndelete::undeleteRevisions() does:

     $article->updateIfNewerOn( $dbw, $revision, $previousRevId );

     This is called with $previousRevId = 0, since no page existed at the title before restoring.

  6. WikiPage::updateIfNewerOn() calls WikiPage::updateRevisionOn(), since the page doesn't exist.

  7. WikiPage::updateRevisionOn() does:

     $conditions = array( 'page_id' => $this->getId() );

     $this->getId() uses mTitle->getArticleID(), which was corrupted to 0.
     Thus, the UPDATE fails to update the row inserted in step 2, and the row is stuck with those values.
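
The sequence above can be reproduced in miniature. The following is a simplified sketch, not MediaWiki code: FakeDb, FakePage, simulateUndelete() and the revision ID 1234 are hypothetical stand-ins, and the lagged slave is modelled as an array that never receives the master's insert.

```php
<?php
// Hypothetical stand-in for a master/slave pair: the slave lags behind
// and has not yet replicated the row inserted on the master.
class FakeDb {
    public $master = array(); // page_id => row
    public $slave = array();  // stale copy, never updated in this sketch

    public function insert( array $row ) {
        $id = count( $this->master ) + 1;
        $this->master[$id] = $row;
        return $id;
    }

    // Returns the number of rows affected, like an UPDATE would.
    public function update( $id, array $values ) {
        if ( !isset( $this->master[$id] ) ) {
            return 0;
        }
        $this->master[$id] = array_merge( $this->master[$id], $values );
        return 1;
    }
}

// Hypothetical simplification of the WikiPage/Title pair described above.
class FakePage {
    public $articleId = 0;
    private $db;

    public function __construct( FakeDb $db ) {
        $this->db = $db;
    }

    // Insert the placeholder row on the master and remember the new ID.
    public function insertOn() {
        $this->articleId = $this->db->insert(
            array( 'page_latest' => 0, 'page_len' => 0, 'page_is_new' => 1 )
        );
        return $this->articleId;
    }

    // Reads from the lagged slave; finding nothing resets the ID to 0,
    // which is the effect attributed to loadFromRow( false ).
    public function isCountable() {
        if ( !isset( $this->db->slave[$this->articleId] ) ) {
            $this->articleId = 0;
        }
        return false; // the return value is irrelevant to this sketch
    }

    // UPDATE ... WHERE page_id = <current ID>.
    public function updateRevisionOn( $revId ) {
        return $this->db->update(
            $this->articleId,
            array( 'page_latest' => $revId, 'page_is_new' => 0 )
        );
    }
}

function simulateUndelete() {
    $db = new FakeDb();
    $page = new FakePage( $db );
    $newId = $page->insertOn();
    $page->isCountable();
    $affected = $page->updateRevisionOn( 1234 );
    return array( $affected, $db->master[$newId]['page_latest'] );
}

list( $affected, $latest ) = simulateUndelete();
echo "rows updated: $affected, page_latest: $latest\n";
```

Run as written, the sketch reports zero rows updated and page_latest still at 0, matching the stuck rows recorded earlier in this task.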

Good catch -- this looks like it could plausibly have been breaking pages off and on for a while (hence some of the older cases that aren't new).

Probably Article::insertOn (or rather WikiPage::insertOn) should save the updated ID and related state, and mark itself as having loaded state (set $this->mDataLoaded). This would bypass trying to reload the data, without having to explicitly say "btw load this from master".
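
A minimal sketch of that idea, using a hypothetical stand-in class rather than MediaWiki's own WikiPage: PatchedPage, its array-backed $master and $slave, and restoreWithPatchedPage() are all invented for illustration. Once insertOn() records the new ID and sets a loaded flag, the later load skips the slave read, so the ID survives until the UPDATE.

```php
<?php
// Hypothetical simplification: $master and $slave are plain arrays keyed by page ID.
class PatchedPage {
    public $articleId = 0;
    public $dataLoaded = false; // plays the role of mDataLoaded
    public $master = array();
    public $slave = array();    // lagged replica, never receives the insert here

    public function insertOn() {
        $this->articleId = count( $this->master ) + 1;
        $this->master[$this->articleId] = array( 'page_latest' => 0, 'page_is_new' => 1 );
        // Proposed change: treat the freshly inserted state as already loaded.
        $this->dataLoaded = true;
        return $this->articleId;
    }

    public function loadPageData() {
        if ( $this->dataLoaded ) {
            return; // no slave read, so the ID cannot be clobbered
        }
        if ( !isset( $this->slave[$this->articleId] ) ) {
            $this->articleId = 0;
        }
        $this->dataLoaded = true;
    }

    // Returns the number of rows affected, like an UPDATE would.
    public function updateRevisionOn( $revId ) {
        if ( !isset( $this->master[$this->articleId] ) ) {
            return 0;
        }
        $this->master[$this->articleId]['page_latest'] = $revId;
        $this->master[$this->articleId]['page_is_new'] = 0;
        return 1;
    }
}

function restoreWithPatchedPage( $revId ) {
    $page = new PatchedPage();
    $id = $page->insertOn();
    $page->loadPageData(); // what isCountable() would trigger indirectly
    $affected = $page->updateRevisionOn( $revId );
    return array( $affected, $page->master[$id]['page_latest'] );
}
```

The design choice here is to trust the state the object just wrote instead of forcing a read from the master; whether the real class can safely assume that for every field it caches is not something this sketch settles.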

Re-opening, as some instances are still coming up.

Closing again. This bug originally referred to pages with several edits and broken page_latest values involving deletion/restoration.

The "new instances" were actually an issue with new pages rather than existing ones. They were caused by... a bug in a logging hack intended to confirm the absence of *this* bug, somewhat ironically. That code was removed, and a script was run to clean up all affected pages.