Page MenuHomePhabricator

Spaces in upload pathname breaks images (improper urlencoding)
Closed, ResolvedPublic

Description

Author: ziba

Description:
I've found in mediawikis 1.9.3 and 1.10.0 that images do not display if there is a space in the in the url before the arguments.

This is caused by the use of wfUrlencode in includes/Image.php:function imageUrl(). The urlencode function is only for the "query part of a URL" - php.net. Because the whole image url is passed through urlencode, the spaces in the base url get changed to +. This breaks the display of and link to every image.

I recommend ending imageUrl() with:
return $url;
rather than
return wfUrlencode( $url );


Version: unspecified
Severity: major

Details

Reference
bz10787

Event Timeline

bzimport raised the priority of this task from to Low.Nov 21 2014, 9:50 PM
bzimport set Reference to bz10787.
bzimport added a subscriber: Unknown Object (MLST).

That can't happen, unless you broke something badly I think. :)

Provide exact steps to reproduce the problem if you're really sure about this.

ziba wrote:

Not that I never break things badly, but this one isn't my fault!

I have verified this on a completely fresh install:

Steps:
$cd /usr/local/apache1/htdocs/
$wget http://download.wikimedia.org/mediawiki/1.10/mediawiki-1.10.1.tar.gz
$mv mediawiki-1.10.1 mediawiki\ 10 # to get the space in the url
$cd mediawiki\ 10/
$chmod a+w config/

Followed web based install process

mv config/LocalSettings.php .

  1. Edited LocalSettings.php only to enable file uploads
  2. Created a user account on my new wiki
  3. Uploaded a simple image (Firefox.png)

Result:
The image does not show because this is the url mediawiki tries to use:
http://localhost/mediawiki+10/images/b/bf/Firefox.png

Notice the plus sign between mediawiki and 10.
This is because it was run through the PHP function "urlencode".
Regardless of what the name "urlencode" may imply, it is NEVER appropriate to
run a whole url through urlencode. Urlencode is ONLY for the "query part" that
come after a question mark in a URL.
(http://us2.php.net/manual/en/function.urlencode.php)

Thank you for your time in looking into this.

Changed summary to describe the issue more accurately.

The good news is that the specific case of the image pathnames is fixed in 1.11.

The bad news is that directories with spaces in your URL path look like they're pretty badly handled all around. :)

Most bits assume the path is already properly formatted as a URL path component and output the space directly -- this is invalid and may cause various problems:

  • Some browsers won't understand the whole URL
  • Style sheets and other components may not work correctly
  • HTTP redirects may be similarly unreliable
  • etc

Some of the image code in 1.10 and below does do extra escaping on the path, further using urlencode() where it should use rawurlencode() -- as that would properly escape spaces for use in paths. However since $wgScriptPath (and the usually derived $wgUploadPath) are expected to be pre-escaped, they should have the %20 in them to start with.

The path detection via SCRIPT_NAME or PHP_SELF seem to give de-escaped paths in my quick testing on Apache 2 and Lighttpd 1.4; if that's consistent it _should_ be safe to re-escape the path components in the detection in the installer.

Bryan.TongMinh wrote:

This should all be fixed by now. Please reopen with specific examples if it still occurs.