
Case insensitivity in section anchors
Closed, Resolved (Public)

Description

Author: conman

Description:
Currently, sections on a MediaWiki page are associated with anchors with a name and id based on the title of the section. If there are multiple sections with the same name, then each one after the first has a number appended, so that they're unique.

However, at least on IE7 (running on Windows XP), the browser does not distinguish between two anchor names that differ only by case, whereas MW currently does distinguish, so a link with an anchor intending to link to the second section instead goes to the first one.

Example: Today, on en:w:Help Desk, there was a section titled "Adding external links", and one further down titled "Adding External Links" (or something fairly similar). MW treated the two titles as different, and so didn't add a number to the second in the anchor name, but IE7, when told to go to "#Adding External Links", went to the first one.
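To make the mismatch concrete, here is a minimal sketch in Python (not MediaWiki's actual PHP code) of the case-sensitive numbering described above. The function names, the space-to-underscore substitution, and the `_2` suffix format are assumptions made for illustration only.

```python
def assign_anchors(headlines):
    """Assign one anchor per headline, numbering exact (case-sensitive) repeats.

    Hypothetical helper: a simplified model of the behaviour described in
    this report, not the real parser code.
    """
    seen = {}      # anchor text -> how many times it has appeared so far
    anchors = []
    for headline in headlines:
        anchor = headline.replace(" ", "_")
        seen[anchor] = seen.get(anchor, 0) + 1
        if seen[anchor] > 1:
            # Only an exact repeat gets a number appended (assumed "_N" format).
            anchor = f"{anchor}_{seen[anchor]}"
        anchors.append(anchor)
    return anchors


def case_insensitive_clashes(anchors):
    """Return anchors that a case-insensitive browser cannot tell apart
    from an earlier anchor on the same page."""
    seen_lower = set()
    clashes = []
    for anchor in anchors:
        if anchor.lower() in seen_lower:
            clashes.append(anchor)
        seen_lower.add(anchor.lower())
    return clashes


anchors = assign_anchors(["Adding external links", "Adding External Links"])
# Both anchors are kept unnumbered because they differ in case...
print(anchors)
# ...but the second one clashes with the first when case is ignored.
print(case_insensitive_clashes(anchors))
```

Under these assumptions the second heading keeps an unnumbered anchor, which is exactly the one a case-insensitive browser resolves to the first section.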


Version: unspecified
Severity: trivial

Details

Reference
bz10721

Related Objects

Status | Subtype | Assigned | Task
Declined | | None |
Resolved | | None |

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 9:53 PM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz10721.
bzimport added a subscriber: Unknown Object (MLST).

nicdumz wrote:

How about this simple patch?

Index: Parser.php
===================================================================
--- Parser.php	(revision 31892)
+++ Parser.php	(working copy)
@@ -3494,6 +3494,10 @@
 			# Save headline for section edit hint before it's escaped
 			$headlineHint = $safeHeadline;
 			$safeHeadline = Sanitizer::escapeId( $safeHeadline );
+			# lowercase headlines, since some browsers don't distinguish
+			# "Anchor" from "anchor" (bug #10721)
+			$safeHeadline = strtolower($safeHeadline);
+
 			$refers[$headlineCount] = $safeHeadline;
 
 			# count how many in assoc. array so we can track dupes in anchors

Cheers,
Nicolas.

nicdumz wrote:

proposed patch

I obviously didn't think enough when posting my previous comment. (!) Lowercasing $safeHeadline would also lowercase the anchor itself, meaning that a link to the section "John Doe" would become #john_doe. Quite annoying, in fact.

Attached is a better patch, where only the key to the associative array is lowercased.
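The idea of lowercasing only the lookup key can be sketched roughly as follows. This is an illustration in Python rather than the attached PHP patch; the function name, the space-to-underscore substitution, and the `_2` suffix format are assumptions.

```python
def assign_anchors_ci(headlines):
    """Number repeated headlines case-insensitively while keeping the
    original case in the emitted anchor.

    Hypothetical helper illustrating the approach; not the actual patch.
    """
    seen = {}      # lowercased anchor -> how many times it has appeared
    anchors = []
    for headline in headlines:
        anchor = headline.replace(" ", "_")
        key = anchor.lower()          # only the dictionary key is lowercased
        seen[key] = seen.get(key, 0) + 1
        if seen[key] > 1:
            # The emitted anchor keeps its case and gets a number appended
            # (assumed "_N" format).
            anchor = f"{anchor}_{seen[key]}"
        anchors.append(anchor)
    return anchors


# "John Doe" keeps its capitals; the differently-cased duplicate is numbered.
print(assign_anchors_ci(["John Doe", "Adding external links", "Adding External Links"]))
```

With this approach a link to the section "John Doe" still points at #John_Doe, and two headings that differ only by case no longer share an anchor in a case-insensitive browser.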

I also found a strange array entry that is apparently never accessed. Could someone check this?

Attached:

Note, this lovely feature appears to be unique to Internet Explorer. :) Firefox, Opera, and Safari all consider the differently-cased anchors to be separate and distinct.

It is however alleged that XHTML rules require the disambiguation anyway.

Added patch (with code fix) and parser test case in r31931... thanks for the patch!

happy.melon.wiki wrote:

Cf. comments in r74533.