
ParsoidCacheUpdateJob queues htmlCacheUpdate jobs on template edits
Closed, Resolved (Public)

Description

Every edit results in a ParsoidCacheUpdateJob. Whenever an edited template is used by more than 20 pages, ParsoidCacheUpdateJob purges the cache of those pages by queueing htmlCacheUpdate jobs.

This is because ParsoidCacheUpdateJob extends HTMLCacheUpdateJob. It calls doFullUpdate(), which is not overridden; that in turn calls insertPartitionJobs(), which is also not overridden and inserts the relevant htmlCacheUpdate jobs. I've confirmed this with strace/eval.php in production.
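The call chain above can be illustrated with a simplified model. This is a rough Python sketch, not the actual MediaWiki PHP code: the class bodies, the list-based queue, the `invalidate_titles` method, and the `"parsoidUpdate"` and `"purge-html"` labels are assumptions made for illustration. Only the class names, the doFullUpdate/insertPartitionJobs chain, and the 20-page threshold come from the report.

```python
class Job:
    """Stand-in for a generic queue job; the queue is a plain list (assumption)."""

    def __init__(self, queue):
        self.queue = queue


class HTMLCacheUpdateJob(Job):
    def run(self, backlink_count, threshold=20):
        # Large templates take the partitioning path, small ones are handled directly.
        if backlink_count > threshold:
            self.do_full_update(backlink_count)
        else:
            self.invalidate_titles(backlink_count)

    def do_full_update(self, backlink_count):
        self.insert_partition_jobs(backlink_count)

    def insert_partition_jobs(self, backlink_count):
        # The job type is hard-coded here, so subclasses inherit it unchanged.
        self.queue.append(("htmlCacheUpdate", backlink_count))

    def invalidate_titles(self, backlink_count):
        self.queue.append(("purge-html", backlink_count))


class ParsoidCacheUpdateJob(HTMLCacheUpdateJob):
    # Only the small-template path is overridden (hypothetical method name).
    def invalidate_titles(self, backlink_count):
        self.queue.append(("parsoidUpdate", backlink_count))


queue = []
ParsoidCacheUpdateJob(queue).run(backlink_count=50)
print(queue)
```

In this model, a template with 50 backlinks makes the Parsoid job enqueue an htmlCacheUpdate entry rather than anything Parsoid-specific, which mirrors the behaviour described in the report.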

HTMLCacheUpdateJob was not intended for subclassing. Trying to override almost every method of a class in order to use the remaining behaviour is error-prone. I think you should either derive directly from Job (with some duplication), or factor out an abstract base class which would be common to both HTMLCacheUpdateJob and ParsoidCacheUpdateJob.
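The second option, a shared abstract base class, might look roughly like the sketch below. It is written in Python for brevity and is only a model of the idea: `BacklinkJobBase`, `job_type`, `process`, the batch size, and the two subclass names are all hypothetical and do not correspond to real MediaWiki classes.

```python
from abc import ABC, abstractmethod


class BacklinkJobBase(ABC):
    """Hypothetical base class that owns the range-splitting logic only."""

    def __init__(self, queue, batch_size=2):
        self.queue = queue
        self.batch_size = batch_size
        self.processed = []

    def run(self, page_ids):
        if len(page_ids) > self.batch_size:
            # Split into batches and re-queue them under the subclass's own job type.
            for start in range(0, len(page_ids), self.batch_size):
                batch = page_ids[start:start + self.batch_size]
                self.queue.append((self.job_type(), batch))
        else:
            self.process(page_ids)

    @abstractmethod
    def job_type(self):
        """Name of the job to enqueue for each partition."""

    @abstractmethod
    def process(self, page_ids):
        """Do the actual work for a batch that is small enough."""


class HtmlCacheJob(BacklinkJobBase):
    def job_type(self):
        return "htmlCacheUpdate"

    def process(self, page_ids):
        self.processed.extend(("purge", p) for p in page_ids)


class ParsoidJob(BacklinkJobBase):
    def job_type(self):
        return "parsoidUpdate"

    def process(self, page_ids):
        self.processed.extend(("reparse", p) for p in page_ids)


queue = []
ParsoidJob(queue).run([1, 2, 3, 4, 5])
print(queue)
```

Because the job type and the per-batch work are both abstract, a subclass cannot silently inherit the other job's behaviour; forgetting to define either one fails at instantiation time.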


Version: unspecified
Severity: major

Details

Reference
bz51156

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 1:43 AM
bzimport added a project: Parsoid.
bzimport set Reference to bz51156.

Ahh, this confirms the suspicion that I had about why larger template updates never make it to Parsoid. Thank you for figuring this out!

Factoring out the general range splitting functionality from HTMLCacheUpdate and RefreshLinks is something Aaron plans to do afaik. In the meantime I'll either duplicate the range splitting code, or find some other way to make it work while still subclassing.

Change 73368 had a related patch set uploaded by GWicke:
Bug 51156: Don't subclass HTMLCacheUpdate any more

https://gerrit.wikimedia.org/r/73368

Change 73368 merged by jenkins-bot:
Bug 51156: Don't subclass HTMLCacheUpdate any more

https://gerrit.wikimedia.org/r/73368

The fix should be deployed tomorrow.