
IIS7.5 is mishandling redirects generated by OutputPage::output() when the URL contains special characters
Closed, Declined (Public)

Description

Upstream report: http://forums.iis.net/p/1165517/1936321.aspx#1936321

Author: lhridley

Description:
This issue was originally reported on the MediaWiki Users forums in this thread:

http://www.mwusers.com/forums/showthread.php?14312-MediaWiki-and-IIS7-URL-bug

The original reporter tested this on MW versions 1.13.2 and 1.15.1 with PHP 5.2.9 and IIS7.5 on Microsoft Server 2008. I have duplicated this behavior on a VMware virtual machine running Windows 7 Ultimate, with IIS 7.5, PHP 5.2.12, and MW versions 1.15.1 and 1.16alpha.

The test urls used to generate this error are:

http://localhost/w/index.php?title=á:á
http://localhost/w/index.php?title=á:Test

The problem occurs when the title contains a colon and special characters (such as á) precede the colon.

When MediaWiki encodes the title and sends redirect headers in OutputPage::output(), IIS7.5 returns a 404.0 error and reports the requested URL in this format:

http://localhost:80/w/index.xn--php?title=-14a:%C3%A1 (first url above)
http://localhost:80/w/index.xn--php?title=-14a:Test (second url above)

So, at some point it appears that IIS7.5 is rewriting the URL redirect.

Setting $wgDebugRedirects = true; circumvents this error by generating a link to the redirect page instead.
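
For reference, the workaround amounts to a single line of site configuration. This is a minimal sketch that assumes a standard MediaWiki install configured through LocalSettings.php; the setting is meant for debugging, so it is a diagnostic aid rather than a fix.

```php
# LocalSettings.php (assumed location of the site configuration)
# Show a page with a link to the redirect target instead of
# sending the redirect header that IIS7.5 mishandles.
$wgDebugRedirects = true;
```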

This may in fact be an IIS7.5 bug, but at this point Microsoft's position is that this is an application error.

This issue was found and tested against IIS7.5 but may in fact have been introduced in IIS7.

This issue is browser independent; it has been recreated in IE8, IE7 and Firefox 3.6.


Version: 1.16.x
Severity: major
OS: Windows 7

Details

Reference
bz22709

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 10:58 PM
bzimport set Reference to bz22709.
bzimport added a subscriber: Unknown Object (MLST).

lhridley wrote:

Update:

Using a URL redirect where the colon is encoded as well is accepted by IIS7.5 and redirects to the appropriate page within MediaWiki.

For example, changing wfUrlencode() so that the colon is left out of the characters restored by its str_ireplace() call results in the encoded URL:

http://localhost/w/index.php?title=%C3%81%3A%C3%A1

This URL is accepted by IIS7.5 and redirected properly rather than generating a 404.0 error. The resulting MediaWiki page has the correctly formatted title; however, the parameters in the URL are fully encoded.
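
To make the difference concrete, here is a small sketch of the two encodings. `sketchWfUrlencode()` is a hypothetical stand-in written for this illustration, not MediaWiki's actual wfUrlencode(); the `$keepColon` parameter is likewise an assumption added to toggle between the two behaviors.

```php
<?php
// Hypothetical stand-in for wfUrlencode() as described in this report:
// urlencode() the string, then restore a set of "safe" punctuation characters.
// With $keepColon = false the colon stays percent-encoded as %3A.
function sketchWfUrlencode( $s, $keepColon = true ) {
    $encoded = array( '%3B', '%40', '%24', '%21', '%2A', '%28', '%29', '%2C', '%2F' );
    $plain   = array( ';', '@', '$', '!', '*', '(', ')', ',', '/' );
    if ( $keepColon ) {
        $encoded[] = '%3A';
        $plain[]   = ':';
    }
    return str_ireplace( $encoded, $plain, urlencode( $s ) );
}

$title = "\xC3\x81:\xC3\xA1"; // the UTF-8 bytes for "Á:á"

echo sketchWfUrlencode( $title, true ), "\n";  // %C3%81:%C3%A1
echo sketchWfUrlencode( $title, false ), "\n"; // %C3%81%3A%C3%A1
```

The second form corresponds to the fully encoded URL shown above, which IIS7.5 accepts.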

lhridley wrote:

A possible fix for this issue is as follows:

Change the global function wfUrlencode() to the following:

function wfUrlencode( $s ) {
    $s = urlencode( $s );
    $s = str_ireplace(
        array( '%3B','%3A','%40','%24','%21','%2A','%28','%29','%2C','%2F' ),
        array(   ';',  ':',  '@',  '$',  '!',  '*',  '(',  ')',  ',',  '/' ),
        $s
    );

    # Check whether the server is running Microsoft IIS 7 or 7.5; if so, convert colons back to URL-encoded values.
    if ( isset( $_SERVER['SERVER_SOFTWARE'] ) ) {
        $server = explode( "/", $_SERVER['SERVER_SOFTWARE'] );
        if ( $server[0] == "Microsoft-IIS" && ( $server[1] == '7' || $server[1] == '7.5' ) ) {
            # Only re-encode when the string contains other percent-encoded characters.
            if ( !( strpos( $s, '%' ) === false ) ) {
                $s = str_ireplace( array( ':' ), array( '%3A' ), $s );
            }
        }
    }
    return $s;
}

I have tested this, and preliminarily it appears to have some promise. I want to make sure that IIS always sets a value for $_SERVER['SERVER_SOFTWARE'] before creating a patch and/or modifying trunk.

Another option would be to exclude colons from the str_ireplace() process in wfUrlencode() for all servers; however, this would encode the colon in every instance, which is probably not desired, since the problem only arises when there are special characters in the title prefix.

lhridley wrote:

Revision to the fix above:

function wfUrlencode( $s ) {
    $s = urlencode( $s );
    $s = str_ireplace(
        array( '%3B','%3A','%40','%24','%21','%2A','%28','%29','%2C','%2F' ),
        array(   ';',  ':',  '@',  '$',  '!',  '*',  '(',  ')',  ',',  '/' ),
        $s
    );

    # Check whether the server is running Microsoft IIS 7.x; if so, convert colons back to URL-encoded values.
    if ( isset( $_SERVER['SERVER_SOFTWARE'] ) ) {
        $match = preg_match( '|(Microsoft-IIS/7)|', $_SERVER['SERVER_SOFTWARE'], $a );
        if ( $match > 0 ) {
            # Only re-encode when the string contains other percent-encoded characters.
            if ( !( strpos( $s, '%' ) === false ) ) {
                $s = str_ireplace( array( ':' ), array( '%3A' ), $s );
            }
        }
    }
    return $s;
}
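
The server check in the revision above can be pulled out into a small function so that it can be exercised without a running IIS instance. This is only a sketch: `sketchIsIis7()` is a hypothetical name, and the sample strings are assumed examples of what `$_SERVER['SERVER_SOFTWARE']` might contain.

```php
<?php
// Hypothetical helper (not part of MediaWiki): the same pattern as the
// revision above, applied to a string argument instead of reading $_SERVER.
function sketchIsIis7( $serverSoftware ) {
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match( '|(Microsoft-IIS/7)|', $serverSoftware ) === 1;
}

var_dump( sketchIsIis7( 'Microsoft-IIS/7.5' ) );     // bool(true)
var_dump( sketchIsIis7( 'Microsoft-IIS/6.0' ) );     // bool(false)
var_dump( sketchIsIis7( 'Apache/2.2.14 (Win32)' ) ); // bool(false)
```

Because the pattern is a prefix match on the version, it covers any 7.x value rather than listing versions one by one.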

Does it also fail when the domain name has a dot?

From earlier reports it seems that IIS gets confused when the domain does not contain a dot (eg. localhost) and there's a colon in the url.

lhridley wrote:

Yes. I've tested this using http://localhost and using http://www.example.com (setting the domain name in the hosts file).

I don't have an internet accessible server with Windows so I haven't tested this on an actual web-accessible server, just in a local test environment.

lhridley wrote:

Just got a report back from the user who initially reported this problem with IIS7; he has indicated that the fix posted above is working on his installation. His wiki is on a company intranet so there is no publicly available link.

Fixed with a different implementation on r63228.

lhridley wrote:

Microsoft has confirmed that this is a bug in IIS 7.5

http://forums.iis.net/p/1165517/1936321.aspx#1936321

I am reopening this bug. The fix that was implemented in r63228 encodes all occurrences of the colon in the title parameter of a url.

This is not needed except in cases where certain special characters precede the colon. However, since IIS appears to pick characters at random when applying what looks like Punycode encoding to the URL, I am in favor of encoding the colon whenever any special character precedes it. Encoding all of them may be a bit of overkill.

Does it only fail if the special characters are /before/ the colon?

Also, do you have any idea on what it considers 'special'? (we should probably go ASCII-7)

+upstream
This is an issue with IIS according to that forum report, although we can still patch around it.

lhridley wrote:

Yes. It only fails if the special character falls before the colon. URL parameter strings with no colon, or with special characters only after the colon appear to be fine.

This appears to occur primarily with lowercase Latin Extended characters, lowercase Cyrillic characters, and some lowercase Greek characters (at least that's what I've identified so far).

Bryan.TongMinh wrote:

If I understand correctly, we only need to encode the colon if a special character is present? A quick fix would be /[A-Za-z0-9]*/.
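
One way to read the suggestion above is sketched below: leave the colon alone when everything before it is plain ASCII letters and digits, and encode it otherwise. `sketchEncodeColonIfNeeded()` is a hypothetical helper, not existing MediaWiki code. Anchoring the pattern with `^` and `$` is an assumption on my part; unanchored, `/[A-Za-z0-9]*/` matches every string, including the empty one.

```php
<?php
// Hypothetical helper: takes a string that wfUrlencode() has already
// processed and re-encodes its colons only when the part before the
// first colon is not made up solely of ASCII letters and digits.
function sketchEncodeColonIfNeeded( $encoded ) {
    $pos = strpos( $encoded, ':' );
    if ( $pos === false ) {
        return $encoded; // no colon, nothing to do
    }
    $prefix = substr( $encoded, 0, $pos );
    if ( preg_match( '/^[A-Za-z0-9]*$/', $prefix ) === 1 ) {
        return $encoded; // plain prefix, colon can stay readable
    }
    return str_replace( ':', '%3A', $encoded );
}

echo sketchEncodeColonIfNeeded( 'Talk:Test' ), "\n";    // Talk:Test
echo sketchEncodeColonIfNeeded( '%C3%A1:Test' ), "\n";  // %C3%A1%3ATest
echo sketchEncodeColonIfNeeded( 'Test:%C3%A1' ), "\n";  // Test:%C3%A1
```

This check errs on the conservative side: a prefix containing an underscore or other punctuation also gets its colon encoded, which is harmless but slightly less readable.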

12 years later, assuming that this is obsolete. If this still happens with supported versions of the software involved, please file a new ticket. Thanks!