Author: vigna
Description:
Some Wikipedia pages contain slashes in their title, e.g., "/b/" or "/dev/null". Such slashes should be escaped as %2F when they are not meant to separate path components, as per RFC 1738/3986. Instead, for instance,
http://en.wikipedia.org/w/index.php?title=B/&redirect=no
contains an A element
<a href="/wiki//b/" title="/b/">/b/</a>
that should be written
<a href="/wiki/%2Fb%2F" title="/b/">/b/</a>
This causes two problems:
- The URL above does not conform to the RFCs. This is a minor issue, but it might give rise to other problems in the future.
- If the slash is initial, the URI contains a double slash, which URI normalization may collapse. E.g., in Java
bsh % print(URI.create("wiki//b/").normalize());
wiki/b/
This might be a problem with crawlers, which usually normalize URLs to reduce the number of duplicates.
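The difference between the two forms can be reproduced outside BeanShell with a small standalone program. This is only a sketch: the class name SlashTitleDemo and the helper escapeSlashes are hypothetical names introduced for illustration, and the helper escapes slashes only, so it is not a full title encoder and says nothing about how MediaWiki builds its links.

```java
import java.net.URI;

public class SlashTitleDemo {
    // Hypothetical helper (not part of MediaWiki or the JDK):
    // escapes only the slashes in a page title, nothing else.
    static String escapeSlashes(String title) {
        return title.replace("/", "%2F");
    }

    public static void main(String[] args) {
        String title = "/b/";

        // Link as currently emitted: the title's slashes are left as-is.
        URI raw = URI.create("wiki/" + title);
        // Link as proposed in this report: the slashes are escaped.
        URI escaped = URI.create("wiki/" + escapeSlashes(title));

        // The double slash is collapsed, as in the BeanShell session above.
        System.out.println(raw.normalize());     // wiki/b/
        // The escaped form has no empty segment, so it is left unchanged.
        System.out.println(escaped.normalize()); // wiki/%2Fb%2F
    }
}
```

With the escaped form, a crawler that normalizes URLs this way still ends up with a path that identifies the title "/b/".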
Note that %2F-escaped slashes work perfectly, and
http://en.wikipedia.org/wiki/%2Fb%2F
brings you exactly to /b/'s page. Bug 8254 seems to say the opposite, but it's a few years old.
Version: 1.23.0
Severity: normal