[Regression] Link tables (categories, file links, templates) are being purged for js/css pages
Closed, Resolved (Public)

Description

Krinkle added

// {{delete|AjaxPatrolLinks is obsolete, feature is now in MediaWiki core}}

to
https://meta.wikimedia.org/wiki/MediaWiki:Gadget-AjaxPatrolLinks.js
at 13:33, 10 December 2012. Since the page was not deleted, I assumed the template was not being parsed, so I added

// [[Category:Deleteme|Category:Deleteme]]

and the page still didn't show up in [[meta:Category:Deleteme]].
I confirmed the same problem on ptwikibooks.

This worked previously, per T34858#374493, so it is a regression (and I assume it is something related to ContentHandler).


Version: master
Severity: major

Details

Reference
bz68757

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 3:29 AM
bzimport added a project: SyntaxHighlight.
bzimport set Reference to bz68757.

Yep, the GlobalUsage table is rapidly shrinking for script and style pages. Existing entries are still there, but as pages get purged, edited, or expire from the parser cache, their entries disappear.

Aside from wiki workflows where templates are used for deletion requests, and categories for organisation of scripts, this is also affecting tools and gadgets that use file links to delegate tracking of scripts in global usage tables.

For example this query:
https://tools.wmflabs.org/usage/?action=usage&group=Krinkle

is shrinking continuously (it used to return 1000+ results, now only 890 are left).

en.wikipedia.org is also affected, so it's probably a regression from 1.24wmf14 or 1.24wmf13.

The problem with this bug is that these links are often used to find things, and the bug stops those signals from occurring. It takes a while for people to notice that something is broken (as opposed to there simply being no signals). Past signals (the current categorisation) only slowly disappear from the database as caches expire. This will become a bigger problem as time goes on.

I note that something like api.php?format=jsonfm&action=parse&title=MediaWiki:Common.js&text=[[Category:Foo]] can be used to quickly check the bug.
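A rough Python sketch of that check follows. It only builds the request URL and inspects an already-decoded JSON response; it does no network access. The helper names build_check_url and has_category are made up for illustration, and the response shape (a "parse" object holding a "categories" list whose entries carry the name under "*", or under "category" with formatversion=2) is an assumption about the API output, not something stated in this report. It uses format=json instead of jsonfm so the result is machine-readable.

```python
from urllib.parse import urlencode


def build_check_url(api_base, title, category="Foo"):
    """Build an action=parse URL that parses a one-line category link
    as if it were the content of the given page title."""
    params = {
        "format": "json",
        "action": "parse",
        "title": title,
        "text": "[[Category:%s]]" % category,
    }
    return api_base + "?" + urlencode(params)


def has_category(response, category):
    """Return True if the decoded response lists the category.

    Assumed shape: {"parse": {"categories": [{"*": "Foo", ...}, ...]}}.
    """
    entries = response.get("parse", {}).get("categories", [])
    return any(
        entry.get("*") == category or entry.get("category") == category
        for entry in entries
    )
```

On an affected wiki the category list for a js/css title would be expected to come back empty, so has_category would return False.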

git bisect against core turns up gerrit change 67983 as the culprit; the root of the problem seems to be that the hook function enabled in gerrit change 131447 doesn't bother with the special handling done for $wgTextModelsToParse in includes/content/TextContent.php.

I'd suggest any other patches done to support 67983 should be checked for similar problems.

(In reply to Brad Jorsch from comment #3)

I note that something like api.php?format=jsonfm&action=parse&title=MediaWiki:Common.js&text=[[Category:Foo]] can be used to quickly check the bug.

git bisect against core turns up Gerrit change #67983 as the culprit; the
root of the problem seems to be that the hook function enabled in Gerrit
change #131447 doesn't bother with the special handling done for
$wgTextModelsToParse in includes/content/TextContent.php.

I'd suggest any other patches done to support 67983 should be checked for
similar problems.

Thanks for digging into this, Brad :)

(In reply to Brad Jorsch from comment #3)

git bisect against core turns up Gerrit change #67983 as the culprit; the
root of the problem seems to be that the hook function enabled in
Gerrit change #131447 doesn't bother with the special handling done for
$wgTextModelsToParse in includes/content/TextContent.php.

Indeed, this is only happening when the SyntaxHighlight_GeSHi extension is installed. Changing component accordingly. Let's see how hard this is to fix…

Change 150630 had a related patch set uploaded by Bartosz Dziewoński:
Parse page content using the standard parser first for link tables

https://gerrit.wikimedia.org/r/150630

Ha, wasn't that hard. It's a bit silly that we have to reimplement this logic in the extension, though.
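For readers who want the failure mode spelled out, here is a toy model in Python. It is not the MediaWiki or SyntaxHighlight_GeSHi code (which is PHP), and every function name in it is hypothetical. It assumes, based on the comments above, that core extracts links for the content models listed in $wgTextModelsToParse, and that a ContentGetParserOutput handler which produces its own output skips that step. The contents of TEXT_MODELS_TO_PARSE and the regex "parser" are stand-ins chosen for the sketch.

```python
import re

# Stand-in for $wgTextModelsToParse (assumed contents, for illustration only).
TEXT_MODELS_TO_PARSE = {"wikitext", "javascript", "css"}

# Crude substitute for the wikitext parser: matches [[Category:Name]] and
# [[Category:Name|sortkey]] and captures only the name.
CATEGORY_RE = re.compile(r"\[\[Category:([^\]|]+)(?:\|[^\]]*)?\]\]")


def extract_categories(text):
    return [name.strip() for name in CATEGORY_RE.findall(text)]


def core_parser_output(model, text):
    """Toy core path: record links only for the configured text models."""
    categories = extract_categories(text) if model in TEXT_MODELS_TO_PARSE else []
    return {"html": "<pre>" + text + "</pre>", "categories": categories}


def buggy_hook(model, text):
    """Takes over rendering for js/css and never records any links."""
    if model in {"javascript", "css"}:
        return {"html": "<highlighted>" + text + "</highlighted>", "categories": []}
    return None  # not handled, fall through to core


def fixed_hook(model, text):
    """Runs the standard path first, then swaps in the highlighted HTML."""
    if model in {"javascript", "css"}:
        output = core_parser_output(model, text)
        output["html"] = "<highlighted>" + text + "</highlighted>"
        return output
    return None


def get_parser_output(model, text, hook):
    """Let the hook handle the content; otherwise use the core path."""
    result = hook(model, text)
    return result if result is not None else core_parser_output(model, text)
```

With the buggy hook, a javascript page containing // [[Category:Deleteme|Category:Deleteme]] ends up with no categories; with the fixed hook the category is recorded while the highlighted HTML is still used. That mirrors the idea in the patch title ("Parse page content using the standard parser first for link tables"), though the real change may differ in detail.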

(In reply to Brad Jorsch from comment #3)

I'd suggest any other patches done to support 67983 should be checked for
similar problems.

I am not aware of any changes to any other extensions related to that.

(In reply to Bartosz Dziewoński from comment #7)

(In reply to Brad Jorsch from comment #3)

I'd suggest any other patches done to support 67983 should be checked for
similar problems.

I am not aware of any changes to any other extensions related to that.

Quick grep suggests that ContentGetParserOutput is only used by GeSHi :)

Change 150630 merged by jenkins-bot:
Parse page content using the standard parser first for link tables

https://gerrit.wikimedia.org/r/150630

Change 150723 had a related patch set uploaded by Legoktm:
Parse page content using the standard parser first for link tables

https://gerrit.wikimedia.org/r/150723

Change 150723 merged by jenkins-bot:
Parse page content using the standard parser first for link tables

https://gerrit.wikimedia.org/r/150723

Fix deployed for 1.24wmf15 today (https://commons.wikimedia.org/w/api.php?format=jsonfm&action=parse&title=MediaWiki:Common.js&text=[[Category:Foo]]); it will hit all other wikis during the 1.24wmf15/16 deployment tomorrow.

He7d3r added a project: Regression.
He7d3r set Security to None.