
Problem with endashes and emdashes in article names
Closed, Declined (Public)

Description

Author: matthiasbecker1967

Description:
Article names become part of the HTML page title, using the name within
<title>Any article's name - Wikipedia, the free encyclopedia</title>. The
problem occurs if the article name includes an en dash or em dash.

The problem affects systems from W95/98 up to XP when IE 6 is used and, to my
knowledge, even if all the latest SPs are installed, both for IE and XP. By
default, IE 6 saves an HTML file under the HTML page's title. While XP has no
problem with saving the dash as a dash, since ALT+0150 and ALT+0151 are valid
characters for naming files in Windows systems, such a file can't be reopened in
IE 6 without losing styles and graphics, because the browser doesn't recognise
the sub-directory in which CSS and pictures are stored. (FYI, IE 5 would fail to
open the file at all.)

The issue doesn't occur with IE 7, where the browser is able to deal with the
problem, no doubt thanks to an internal Microsoft-proprietary solution.
However, IE 7 isn't a true alternative, since it isn't available (yet) in many
languages, e.g. Czech, Slovak, Hungarian and several other CE languages, or
can't be installed on many systems used in those countries (systems older than
XP are still widespread in Eastern Europe).

Solution: Make sure that an en dash or em dash isn't used as a character in the
text string between the tags <title> and </title>, either through a manual of
style or, better, through a software feature. Clearly this has to override
typographical conventions, mainly in US or British English.

Note: It is believed that the left and right single quotes (but not the
apostrophe) cause the same problem. If so, they should be covered by the same
solution.
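
The software feature proposed above could look roughly like the following
sketch. It is an illustration only, written in Python rather than in
MediaWiki's own code, and the names ascii_safe_title and TITLE_REPLACEMENTS are
hypothetical. It assumes the fix is a simple character substitution applied to
the string just before it is placed between <title> and </title>.

```python
# Hypothetical helper, not part of MediaWiki: maps the characters named in
# this report to plain ASCII before the string is placed inside <title>.
TITLE_REPLACEMENTS = {
    "\u2013": "-",  # en dash (ALT+0150)
    "\u2014": "-",  # em dash (ALT+0151)
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote
}

# Translation table built once and reused for every title.
_TABLE = str.maketrans(TITLE_REPLACEMENTS)


def ascii_safe_title(title):
    """Return the title with dashes and curly single quotes replaced."""
    return title.translate(_TABLE)


print(ascii_safe_title("Michelson\u2013Morley experiment - Wikipedia"))
```

Only the <title> text would be rewritten this way; the article name shown on
the page itself could keep its typographically correct dash.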


Version: unspecified
Severity: normal
OS: Windows XP
Platform: PC

Details

Reference
bz8660

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 9:31 PM
bzimport set Reference to bz8660.
bzimport added a subscriber: Unknown Object (MLST).

ayg wrote:

If this is an issue, it's probably with everything in MS's proprietary [[Windows-1252]] ranges.
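
If that guess holds, the affected characters would be those that Windows-1252 places in the byte range 0x80 to 0x9F, which includes the en dash (0x96, matching ALT+0150), the em dash (0x97, matching ALT+0151) and the curly single quotes (0x91, 0x92). The sketch below is one way to list such characters in a title for testing. The function name cp1252_high_chars is hypothetical, and the 0x80 to 0x9F range is an assumption drawn from this comment, not a confirmed description of the IE 6 bug.

```python
def cp1252_high_chars(title):
    """List characters of title that encode to bytes 0x80-0x9F in Windows-1252."""
    found = []
    for ch in title:
        try:
            encoded = ch.encode("cp1252")
        except UnicodeEncodeError:
            # Character has no Windows-1252 representation at all; skip it.
            continue
        if 0x80 <= encoded[0] <= 0x9F:
            found.append(ch)
    return found


# Report which characters in a sample title fall into the suspect range.
print(cp1252_high_chars("Michelson\u2013Morley experiment"))
```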

Adding testme. Please test with Internet Explorer 8 and note the result here.

khriseagle wrote:

The problem as described does not occur with IE8.

manfredfr wrote:

When dashes lead to problems in old browsers - why are they used in the message files of MediaWiki? Example: &ndash; is used in MessagesDe.php of version 1.15.0 for the message 'pagetitle' (and others).

(In reply to comment #5)

When dashes lead to problems in old browsers - why are they used in the message

Because everyone thought that all the bugs in Internet Explorer were known and that it couldn't suck even more. Or just that we didn't know. The only thing to do here would be to convert the longer dashes into plain hyphens, but I don't see that happening.

EN.WP.ST47 wrote:

No one seems really enthusiastic about fixing this, and our affected userbase is only people using IE 6 or earlier on Windows XP or earlier, which is small and dwindling, and only barely considered supported. It doesn't appear that we're violating any HTML standard by using endash or emdash, so I'm going to call this wontfix.