Handle thumbnail purges when thumbnails are in cache but not in the backend
Open, Medium, Public

Description

See proposals on bug 41130.

Purge requests are based on the files in the backend, not those in the cache. If the two fall out of sync (for example, if a prior purge only partially completed), there can be unpurgeable images in the cache.

One solution is to use a Varnish module to hash all thumbnails for an original version of a file to the same place so that they can all be purged at once. This could be based on a thumbnail URL regex, since the source file is the parent directory of the thumbnail.
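
As a rough illustration of the regex idea (not a Varnish module), the following Python sketch derives a shared key from a thumbnail URL by taking its parent directory. The URL layout (`/thumb/[archive/]<h>/<hh>/<source name>/<thumbnail name>`), the host name, and the function name `purge_key` are assumptions made for the example.

```python
import re
from typing import Optional

# Assumed layout: .../thumb/[archive/]<h>/<hh>/<source name>/<thumbnail name>
THUMB_RE = re.compile(
    r"^(?P<prefix>.*/thumb/(?:archive/)?[0-9a-f]/[0-9a-f]{2}/)"
    r"(?P<source>[^/]+)/(?P<thumb>[^/]+)$"
)


def purge_key(url: str) -> Optional[str]:
    """Return the parent-directory key shared by all thumbnails of one
    file version, or None if the URL does not look like a thumbnail."""
    m = THUMB_RE.match(url)
    if m is None:
        return None
    # Drop the thumbnail name; keep everything up to the source file.
    return m.group("prefix") + m.group("source")


# Hypothetical URLs: two sizes of the same file map to the same key.
base = "https://upload.example.org/wikipedia/commons/thumb/a/ab/Example.jpg"
small = purge_key(base + "/120px-Example.jpg")
large = purge_key(base + "/800px-Example.jpg")
```

Both calls return the same key, which is what would let a single purge cover every cached size regardless of what the backend still holds.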

If a file had 1 current and 2 old versions, the thumbnails would be mapped to 3 different hashes for that file, one for each version. Purge requests could be issued for each of the original versions on ?action=purge (or on file deletion and other events), which would purge all thumbnails for those 3 versions regardless of which thumbnails are in the backend vs. the cache.
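
To make the 1 current + 2 old versions case concrete, here is a minimal Python sketch that lists one purge key per version. The archive naming (`archive/<h>/<hh>/<timestamp>!<name>`), the host, and the helper name `version_purge_keys` are assumptions for illustration; actually sending the purge requests is out of scope.

```python
from typing import List


def version_purge_keys(base: str, hash_path: str, name: str,
                       old_timestamps: List[str]) -> List[str]:
    """Build one key per file version: the current one plus each old one.

    Every thumbnail of a given version is assumed to live under that
    version's key, so purging the key purges all of its thumbnails.
    """
    keys = [f"{base}/thumb/{hash_path}/{name}"]  # current version
    for ts in old_timestamps:
        # Assumed archive layout for old versions.
        keys.append(f"{base}/thumb/archive/{hash_path}/{ts}!{name}")
    return keys


# Hypothetical file with two old versions: three keys in total.
keys = version_purge_keys(
    "https://upload.example.org/wikipedia/commons",
    "a/ab",
    "Example.jpg",
    ["20120101000000", "20130101000000"],
)
```

A purge handler would then issue one request per entry in `keys`, without consulting the backend's thumbnail listing.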


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=41130
https://bugzilla.wikimedia.org/show_bug.cgi?id=48927

Details

Reference
bz44428

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 1:17 AM
bzimport set Reference to bz44428.
bzimport added a subscriber: Unknown Object (MLST).

Of course, some MediaWiki support will be needed for this in addition to Varnish and ops work (re-hashing requires phased mass purging).

This approach would have problems if the Varnish hash chain gets too long (many thumbnails for one file), which would require using a different internal structure.

An easier alternative for the user-facing problems that has been floated is versioned URLs (and making backlink invalidation work for Commons).
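
For illustration only, a versioned URL could look something like the Python sketch below, which appends a version token so that a new upload yields a new cache key. The `v` query parameter, the use of an upload timestamp as the token, and the host are all assumptions; nothing here reflects an agreed design.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def versioned_url(url: str, version: str) -> str:
    """Return url with a (hypothetical) v=<version> query parameter,
    replacing any existing v parameter."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "v"]
    query.append(("v", version))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical thumbnail URL; the token changes when the file is re-uploaded.
thumb = "https://upload.example.org/thumb/a/ab/Example.jpg/120px-Example.jpg"
first = versioned_url(thumb, "20130101000000")
second = versioned_url(first, "20140101000000")
```

Because `first` and `second` differ, a stale cached copy under the old URL is simply never requested again, which sidesteps the purge problem for readers.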

(In reply to comment #2)

> This approach would have problems if the Varnish hash chain gets too long (many thumbnails for one file), which would require using a different internal structure.
>
> An easier alternative for the user-facing problems that has been floated is versioned URLs (and making backlink invalidation work for Commons).

That might also speed things up, since the browser might then not have to check whether the image was modified.

We would need to make sure this doesn't break InstantCommons. But that already stores files locally, so it would probably be OK.