refreshLinks jobs not queued on template deletion
Closed, Resolved (Public)

Description

Author: xkernigh

Description:
Suppose page [[A]] transcludes page [[B]] (for example, with {{:B}}). In
turn, page [[B]] contains [[Category:C]]. Thus both A and B are in
category C, and the page for category C lists both A and B. That is
normal. Now (as a sysop user) delete page B. This removes the
[[Category:C]] link from A. However, the category C page continues to
list page A as being in that category, even though page A itself is not
in the category. The workaround is to edit page A.

Steps to reproduce the bug at en.wikibooks.org:

(1) I create a page at [[b:en:User:Kernigh/delete me]]. I categorise the
page into [[b:en:Category:Candidates for speedy deletion]], the SD
category.
(2) On the [[b:en:User:Kernigh/sandbox]] page, I write
{{User:Kernigh/delete me}}. Because the sandbox transcludes the "delete
me", and the "delete me" is in the SD category, the sandbox is now also
in the SD category.
(3) As sysop, delete [[b:en:User:Kernigh/delete me]]. This removes the
"delete me" page from the SD category. Also, a visit to
[[b:en:User:Kernigh/sandbox]] shows that the sandbox is not in the SD
category.

Bug: At this point, when I look in [[b:en:Category:Candidates for speedy
deletion]], the page [[b:en:User:Kernigh/sandbox]] is in the list. The
bug appears even if I open the SD category in Firefox instead of
Konqueror; the bug is at the server, not my browser cache.

Expected result: the sandbox should not be listed in the SD category,
because the sandbox itself is not in the category and contains no
category tag.

Workaround: Using action=purge does not work. However, if I edit the
sandbox, MediaWiki notices that the new version does not contain an SD
category tag, and removes the sandbox from the SD category.
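
For illustration only, the behaviour above can be reproduced in a small toy model. This is a minimal Python sketch, not MediaWiki code: the names pages, categorylinks, parse_categories, save, delete and category_members are all hypothetical, and the model assumes that the category listing is a stored index refreshed only when a page is saved.

```python
import re

# Toy model, not MediaWiki code: page text plus a stored category index.
pages = {}          # title -> wikitext
categorylinks = {}  # title -> set of categories recorded at last save

def parse_categories(title, seen=None):
    """Categories the page would show if rendered now, following transclusions."""
    seen = set() if seen is None else seen
    if title in seen or title not in pages:
        return set()  # deleted or looping templates contribute nothing
    seen.add(title)
    text = pages[title]
    cats = set(re.findall(r"\[\[Category:([^\]|]+)", text))
    for target in re.findall(r"\{\{:?([^}|]+)\}\}", text):
        cats |= parse_categories(target.strip(), seen)
    return cats

def save(title, text):
    """Saving (including a null edit) re-derives the stored category rows."""
    pages[title] = text
    categorylinks[title] = parse_categories(title)

def delete(title):
    """Deletion drops the page's own rows but leaves transcluding pages alone."""
    pages.pop(title, None)
    categorylinks.pop(title, None)

def category_members(cat):
    """What the category page would list, read from the stored index."""
    return sorted(t for t, cats in categorylinks.items() if cat in cats)

save("B", "[[Category:C]]")
save("A", "{{:B}}")
delete("B")
print(parse_categories("A"))   # set(): page A itself shows no category
print(category_members("C"))   # ['A']: the category page still lists A
save("A", pages["A"])          # null edit on A
print(category_members("C"))   # []: the stale entry is gone
```

In this model a purge would correspond to calling parse_categories without save, which is why it leaves the stored listing unchanged.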

(This problem appeared at en.wikibooks.org because of a reorganisation
in our C++ book. We are moving the book and all of its subpages from
[[b:en:Programming:C plus plus]] to [[b:en:C++ Programming]]. We are
also trying a new GUI version of the book. Each "C++ Programming/GUI"
page, for example [[b:en:C++ Programming/GUI/About this book/Foreword]],
transcludes another page, like [[b:en:C++ Programming/Foreword]], and
adds some extra navigation links to other parts of the book.)

(We accidentally copied some pages from the old "C plus plus" names to
the new "C++" names instead of moving them. We then decided to mark the
new copies for speedy deletion, to make room for page moves. Some of us
are not sysops, so we tagged the pages with [[b:en:Category:Candidates
for speedy deletion]]. (Actually, we used [[b:en:Template:Delete]],
which contains the category tag, but the effect is the same.) This
marked the new copies for deletion. It also accidentally marked some GUI
pages for deletion, because the GUI pages transclude the new copies and
thus the marks.)

(When we deleted the new copies, the GUI pages remained in the list at
[[b:en:Category:Candidates for speedy deletion]], even though the GUI
pages are not in the category; they do not contain the category tag
anywhere. This is a minor bug, because other than having 20-30 pages
wrongly listed on a category page, there appears to be no problem.)


Version: 1.18.x
Severity: major
URL: http://en.wikibooks.org/wiki/Category:Candidates_for_speedy_deletion

Details

Reference
bz5382

Event Timeline

bzimport raised the priority of this task to Low. (Nov 21 2014, 9:10 PM)
bzimport set Reference to bz5382.

snottygobble wrote:

The same problem manifests itself when a template's category is changed: pages
that transclude the template are listed as belonging to the new category, but
actually remain listed on the category page of the old category.

ayg wrote:

They should change category eventually, once the job queue is finished running.
If I'm wrong, reopen.

*** This bug has been marked as a duplicate of 6389 ***

snottygobble wrote:

Wikipedia's job queue length is currently
[http://en.wikipedia.org/wiki/Special:Statistics zero], yet I'm still seeing
this problem. The template that tags articles into [[:Category:Redirects from
alternate languages]] was changed to tag articles into [[:Category:Redirects
from alternative languages]] more than 5 days ago, but the articles are still
listed in the former category.

morten wrote:

The problem has further complications. The old template was moved from
[[en:Template:R from alternate language]] to [[en:Template:R from alternative
language]]. The moved template was then updated. Thus the problem may be that
pages which transclude redirects to updated templates are not added to the job
queue. I believe this is the core issue here.

cyp wrote:

Rob: couldn't FormattableDate() take $wgLanguageCode and $wgAmericanDates into consideration when selecting a "default" format?
i.e.

default: 
     if ($wgLanguageCode !== 'En' || $wgAmericanDates == false) {
         if ($wgLanguageCode == 'Jp' || $wgLanguageCode == 'Kr')
             return 'Y F j';
         return 'F j, Y';
     }
     return 'Y-m-d';

or some such? That would at least take care of the "unpretty" issue for all the non-en wikis.

cyp wrote:

whoops. sorry. wrong bug (also apologies for not being able to figure out how to strike my last comment in place)

achuggard wrote:

*** Bug 13053 has been marked as a duplicate of this bug. ***

ayg wrote:

*** Bug 12004 has been marked as a duplicate of this bug. ***

mashiah.davidson wrote:

I see something quite similar with links that originate from transcluded content. Once the links in a transcluded page are modified, the old links persist for some time. Interestingly, they do eventually disappear after a while (weeks on :ru). It does depend on job queue as well; that is, the queue has not been above zero for weeks.

mashiah.davidson wrote:

Sorry, I meant to say It DOES NOT depend on job queue.

mike.lifeguard+bugs wrote:

Possibly noted already above, but this error occurs when a parser function in a template changes the category for a page transcluding it. Neither purging the cache of the page, the template, or the category, nor waiting for the job queue to empty, will update the category page properly, though the correct category appears in the bar at the bottom of the page.

mike.lifeguard+bugs wrote:

(In reply to comment #11)

> the correct category appears in
> the bar at the bottom of the page.

This bit is the confusing part. The proper category in the category bar is shown, but the page isn't displayed on the category page! Does anyone have any idea why that would happen? I could understand if the category simply didn't get updated, but this is strangeness.

Yes, same with me. The category is on the page, the page is not on the category.

I have just done a null edit (edit, no changes, comment "nulledit", Save button) on

http://pl.wikipedia.org/wiki/Szablon:Koordynaty/Linkuj

and it appeared in this category:

http://pl.wikipedia.org/wiki/Kategoria:Szablony_robocze_koordynat%C3%B3w

Secondly, I did a similar nulledit on

http://pl.wikipedia.org/wiki/Szablon:Lt

and two templates:

http://pl.wikipedia.org/wiki/Szablon:Lt
http://pl.wikipedia.org/wiki/Szablon:Lx

appeared in the category:

http://pl.wikipedia.org/wiki/Kategoria:Szablony_tworz%C4%85ce_zbi%C3%B3r_link%C3%B3w_do_obs%C5%82ugi_strony

Probably because {{Lt}} uses {{Lx}}.

sumanah wrote:

I just tested this on a wiki running MediaWiki 1.18.0 and can still reproduce the undesired behavior via the instructions in xkernigh's original description.

Same thing happens with links, image links, and template transclusions (all shown by Special:WhatLinksHere). I haven't tested external links, interwiki links, or language links, but they are probably affected too.

I think the underlying cause is that the page is properly queued for purge when the template is edited, but the purge doesn't update any of the links tables.

Probably bug 37001, bug 31577, bug 31628, and bug 18478 are all the same issue.

*** Bug 46281 has been marked as a duplicate of this bug. ***

So, welcome 2006. This is the future speaking.

Still in 2013 when a page is purged, we don't perform the same updates as when actually re-saving the page (at least a null edit).

These are currently known as "secondary updates" and include updates to link tables (image links, categories, page links, interwiki links) and page properties (hidden cat, displaytitle, templatedata).

Which means whenever an edit is made to a page that will result in a change in secondary data (links and/or page properties) to a transcluding page, the update doesn't happen.

Note I'm not saying the update is deferred; it doesn't happen at all. We already defer our primary updates through the job queue (which is working mostly fine).

To add more:

The community (rightfully) decided some years ago that the absence of secondary updates is unacceptable and wrote a bot that does a forced link update on a long list of pages that are known to manifest this bug in a more visible manner.

So the argument for performance can't really stand much in my opinion because the updates are happening either way, the choice is whether we want to do it in core and automatically or wait for a bot to come by and do it in a less efficient way.

From looking at the code base we currently run the secondary data updates immediately for the edited page and recursively schedule jobs for all pages transcluding the edited page (e.g. Template:Foo transcluding Template:Foo/doc, where the latter is being edited).

It doesn't have a parameter to skip some updates either, they always run all secondary data updates (page properties, the various link tables etc.).
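
As a rough illustration of the scheduling described above (a sketch only, not the actual code path): the edited page is updated immediately, and every page that transcludes it, directly or indirectly, gets a queued job. The transcluded_by mapping, the page names "Page 1" and "Page 2", and the on_edit function are hypothetical.

```python
from collections import deque

# Hypothetical transclusion graph: template -> pages that transclude it directly.
transcluded_by = {
    "Template:Foo/doc": ["Template:Foo"],
    "Template:Foo": ["Page 1", "Page 2"],
}

def on_edit(title):
    """Return (pages updated immediately, pages updated later by queued jobs)."""
    immediate = [title]
    queued, seen = [], {title}
    pending = deque(transcluded_by.get(title, []))
    while pending:
        page = pending.popleft()
        if page in seen:
            continue  # never queue the same page twice
        seen.add(page)
        queued.append(page)
        # Pages transcluding this page need a job as well.
        pending.extend(transcluded_by.get(page, []))
    return immediate, queued

print(on_edit("Template:Foo/doc"))
# (['Template:Foo/doc'], ['Template:Foo', 'Page 1', 'Page 2'])
```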

So then what causes page properties like templatedata to not be updated (bug 50372)?

The original report is pretty clear, and does indeed reflect what is in the code, now and historically, so I've changed the bug summary accordingly: "refreshLinks jobs not queued on template deletion".

WikiPage::doDeleteUpdates() ensures referential integrity in the links tables, but does not trigger re-parsing for pages that use the deleted page as a template.
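
A toy sketch of the difference, under stated assumptions: this is not the MediaWiki schema or the actual patch, and delete_page, run_jobs, the queue_refresh flag and the table contents are hypothetical. With queue_refresh=False the deleted page's own rows are cleaned up but page A keeps its stale category row; with queue_refresh=True a refreshLinks job is queued for every page recorded as using the deleted page.

```python
# Toy link tables, not the MediaWiki schema.
templatelinks = {"A": {"B"}}                 # page -> templates it transcludes
categorylinks = {"A": {"C"}, "B": {"C"}}     # page -> categories from last parse
job_queue = []

def delete_page(title, queue_refresh):
    """Remove the deleted page's own rows; optionally queue re-parses of its users."""
    categorylinks.pop(title, None)   # referential integrity for the deleted page
    templatelinks.pop(title, None)
    if queue_refresh:
        for page, used in templatelinks.items():
            if title in used:
                job_queue.append(("refreshLinks", page))

def run_jobs(reparse):
    """Drain the queue, storing whatever categories a fresh parse reports."""
    while job_queue:
        _, page = job_queue.pop(0)
        categorylinks[page] = reparse(page)

delete_page("B", queue_refresh=True)
run_jobs(lambda page: set())   # assumption: with B gone, A parses to no categories
print(categorylinks)           # {'A': set()}
```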

(In reply to comment #1)

> The same problem manifests itself when a template's category is changed

This is unrelated. It was probably just job queue lag.

(In reply to comment #18)

> Still in 2013 when a page is purged, we don't perform the same updates as
> when actually re-saving the page (at least a null edit).

This is unrelated.

> Which means whenever an edit is made to a page that will result in a change
> in secondary data (links and/or page properties) to a transcluding page, the
> update doesn't happen.

Analysis of the test case that gave rise to this complaint confirmed that this was just job queue lag.

Change 93980 had a related patch set uploaded by Anomie:
Add a RefreshLinks job when a template is deleted

https://gerrit.wikimedia.org/r/93980

Change 93980 merged by jenkins-bot:
Add a RefreshLinks job when a template is deleted

https://gerrit.wikimedia.org/r/93980

This particular bug is fixed. Other similar bugs, such as bug 18478, are not necessarily fixed.

The change should be deployed to WMF wikis with 1.23wmf3, see https://www.mediawiki.org/wiki/MediaWiki_1.23/Roadmap for the schedule.