Page MenuHomePhabricator

Thumbnails of archived images don't get deleted
Closed, ResolvedPublic

Description

Author: russnelson

Description:
Upload an image to a MediaWiki instance.
Replace it with a different image.
The page you get back lists both the current and archived version, and 120px thumbnails for both images.
Delete the archived version.
The original is gone, but the thumbnail remains.
I found this in r84671, but AaronSchulz was able to reproduce it in a current version.
Could this be related to b28613 ?


Version: 1.20.x
Severity: normal

Details

Reference
bz30192

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 21 2014, 11:49 PM
bzimport set Reference to bz30192.
bzimport added a subscriber: Unknown Object (MLST).

Can you reproduce this at will? I.e. if I follow these steps does the thumbnail always stick around?

russnelson wrote:

Yes, and on a completely vanilla MediaWiki install. So it has nothing to do with networking and nothing to do with caching. At least this bug anyway. Can't say that about b28613.

Unless I'm missing something, Bug 28613 doesn't seem to be talking about removing old versions, though.

saibotrash wrote:

A test case: http://de.wikipedia.org/wiki/Datei:Logo_African_Pygmy_Goat.png

uploaded a new image ("test copyvio") over this file.

after reverting the second (copyvio) version had the thumburl: http://upload.wikimedia.org/wikipedia/de/thumb/archive/5/51/20110803213020!Logo_African_Pygmy_Goat.png/120px-Logo_African_Pygmy_Goat.png

I also accessed this thumb: http://upload.wikimedia.org/wikipedia/de/thumb/archive/5/51/20110803213020!Logo_African_Pygmy_Goat.png/126px-Logo_African_Pygmy_Goat.png


then I did revision delete (hide) the file contents of the second version: Both above mentioned thumb URLs still work. → Bug.
Thumb URLs which were not accessed while the file version was visible do not work (example: http://upload.wikimedia.org/wikipedia/de/thumb/archive/5/51/20110803213020!Logo_African_Pygmy_Goat.png/131px-Logo_African_Pygmy_Goat.png )

Even if I delete the file completely (I did temporarily) the archive thumbs still keep working. →Bug. Only the current version's thumbs do not work.

However, in order to assess the severity of this bug:
An "attacker" needs to know how mediawiki's thumb URLs for archive versions are constructed (those parts: archive , 20110803213020!) since the thumb URL is not anymore on the file's page (also not in file page's source code). And he needs to know the timestamp (easy to find out in the log or file page's html source). And even if a nerd did construct the correct thumb URL he can only access the thumbs which were generated before deletion. Typically this is only the 120px version which is tinytinytiny.

Conclusion:

  • Speaking of copyvios this bug is not important.
  • Speaking of hard privacy violations this bug is important - I do not know how to get rid of the old thumbs. Maybe a server admin would need to delete them manually if a important privacy violation would happen. However - this would only matter if the privacy violation is in a non-current file version. Well, this easily happens if a vandal overwrites a file (preferably a file which is in high use) with a picture of his ex-girlfriend (or whatever). If the file is reverted then it is the non-current version.
  • All in all (due to the privacy problem) I think this is a bad bug which really should be fixed.

We'll try to get this fixed before the deployment of 1.18

russnelson wrote:

An attacker could use this to redistribute an image. Upload image A.jpg. Make a thumbnail of it one pixel smaller than the original, 1023px-A.jpg. Overwrite it with a copyright violation. Record the archived thumb name. Complain about the copyright violation and ask for it to be deleted. A.jpg gets deleted, but 1023px-A.jpg remains at its archived URL, but is not in the database.

Bryan.TongMinh wrote:

LocalFile::purgeHistory() and LocalFile::purgeCache() should call purgeThumbnails() on $this->getHistory().

Created attachment 8885
Patch to purge old thumbnails from localfiles

Like this?

attachment tmp.diff ignored as obsolete

Yes, my question was just asking if such a patch would be what Bryan
was talking about. I haven't even committed it, yet.

russnelson wrote:

The patch seems to be what Bryan was suggesting.

Resolved in r94212. No need to duplicate the purging in purgeCache(), you end up calling purgeHistory() anyway :)

russnelson wrote:

Uses twill to test filerepo functions.

Attached:

russnelson wrote:

Hrm. Doesn't pass my mediawiki automated filerepo test (uploaded. Run it on a
wiki (modify the URL in the source) with your username and password on the
command line. Yes, yes, I know, but these are test wikis with a junk password.)

http://www.ra-tes.org/phase3/images/c/c6/Test-1313461285.08.png

http://www.ra-tes.org/phase3/images/thumb/c/c6/Test-1313461285.08.png/120px-Test-1313461285.08.png

http://www.ra-tes.org/phase3/images/archive/c/c6/20110816021516%21Test-1313461285.08.png

http://www.ra-tes.org/phase3/images/thumb/archive/c/c6/20110816021516%21Test-1313461285.08.png/120px-Test-1313461285.08.png

The first three 404, but the last one still works.

  • Bug 30535 has been marked as a duplicate of this bug. ***

russnelson wrote:

I've got my head down in this code anyway because of SwiftMedia. I'll see if I can find it. It's kinda in my way anyway.

russnelson wrote:

Ouch. You don't even have to delete the entire file. All you need to do is delete an archived version; the thumbnail isn't deleted.

russnelson wrote:

@Chad H noted in an email on wikitech that these files need better function names and comments. He's right. Also, as a cleanup goal, I think functions only used in the one file should be marked private, and the API functions explicitly marked public.

russnelson wrote:

There's two cases where thumbs should be deleted:

    1. when a new version of the file is uploaded (to get rid of thumbs that belong to an archived version of the file), and
  1. when a:

2a) current or
2b) archived version of a file is deleted.
Part (2b) has no code to implement it right now. It's not that it's being done incorrectly; it's not even being attempted.

I think the most appropriate locus for deleting stale thumbs is to delete them when a file is deleted. The other code which deletes current thumbs covers other cases and is still necessary.

(In reply to comment #18)

I think functions only
used in the one file should be marked private, and the API functions explicitly
marked public.

+1. I was thinking that too when I was looking at this earlier in the week.

russnelson wrote:

There's also a mix of functions dedicated to *Archive* access, and OldLocalFile, which derives from LocalFile with functions that already know that they're accessing Archived files. A cleanup would eliminate the *Archive* functions ... but that also means creating a File any time you want to do anything to a file. And means calling RepoGroup::singleton()->getLocalRepo()->newFile($title); (with an additional time parameter if it's an archive file. And that means parsing the file at the ! to get the time to pass to newFile ... the complexity keeps piling up, all to simplify? I'm not so sure, but I also don't like the lack of regularity and lack of information hiding.

russnelson wrote:

Seems to fix the bug

This patch fixes the bug, and adds some useful helper functions to filerepo/File. It also recognizes that sometimes "$suffix" is really "$archiveName" and isn't in fact optional.

I'm uploading it here rather than fixing it directly because I haven't tested it on the trunk wiki code yet. It works on both SwiftMedia (once I check in my patches), and an older slightly pre-1.17 on which I discovered the bug.

attachment b30192-fix.patch ignored as obsolete

Created attachment 9021
Whitespace issues fixed

Patch looks alright, and seems to fix the deletion of archived thumbnail images

It does however, leave the "20110411224401!Blah.png" folder, which IMHO, if we're deleting the files, hence making the directory empty, we should then delete the directory

Attached:

Patch applied with minor tweaks in r97373.

russnelson wrote:

Make that r96373.

Oh, and Roan noticed that I'd changed the API to getArchive{Url|Path}. I changed it back in r96399

(In reply to comment #25)

Make that r96373.

Oh, and Roan noticed that I'd changed the API to getArchive{Url|Path}. I
changed it back in r96399

Credit where credit is due: our automated test runner (CruiseControl) noticed this :)

Gilles raised the priority of this task from Medium to Unbreak Now!.Dec 4 2014, 10:27 AM
Gilles added a project: Multimedia.
Gilles moved this task from Untriaged to Done on the Multimedia board.
Gilles lowered the priority of this task from Unbreak Now! to Medium.Dec 4 2014, 11:20 AM
Restricted Application added subscribers: JEumerus, Matanya. · View Herald Transcript