patch to let mediawiki display the title lowercase in wgCapitalLinks mode
Closed, Declined (Public)

Description

When $wgCapitalLinks=true, MediaWiki capitalizes the first letter of every title. In some cases
this is not desired, and for that reason some articles in the German Wikipedia carry hints
like "The title of this article is not correct, the correct spelling is: ...".

With this small patch you can make MediaWiki display the title with a lowercase first letter by
inserting "LCFIRST" into the article text. The title itself isn't changed and
remains capitalized in places like History, Contributions and Allpages (for real
lowercase titles the mode $wgCapitalLinks=false exists), but in normal view mode
it is displayed with a lowercase initial.
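
The behaviour described above can be sketched as follows. This is a minimal Python model, not the attached patch (which is PHP and not reproduced in this thread); the function name `render_title` and the choice to strip only the first occurrence of the marker are assumptions made for illustration.

```python
from typing import Tuple

MAGIC_WORD = "LCFIRST"  # the marker named in the description


def render_title(stored_title: str, article_text: str) -> Tuple[str, str]:
    """Return (display_title, text_without_marker) for the normal view.

    The stored title is never modified; only the returned display
    string gets a lowercase first letter.
    """
    if MAGIC_WORD not in article_text:
        return stored_title, article_text
    # Assumption: remove just the first occurrence of the marker.
    cleaned = article_text.replace(MAGIC_WORD, "", 1)
    display = stored_title[:1].lower() + stored_title[1:]
    return display, cleaned


display, _ = render_title("IPod", "LCFIRST The iPod is a music player.")
print(display)
```

History, Contributions and Allpages would keep using `stored_title` directly, which is why they still show the capitalized form.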

The patch was developed by [[:de:Benutzer:Ncnever]] and [[:de:Benutzer:APPER]].


Version: 1.5.x
Severity: enhancement

Details

Reference
bz2118

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 8:30 PM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz2118.
bzimport added a subscriber: Unknown Object (MLST).

Created attachment 506
lcfirst-patch

attachment lcfirst.patch ignored as obsolete

rowan.collins wrote:

Note: there is a similar, but more general, proposal at bug 496 - lower-case
first letters are the simplest case, because as long as $wgCapitalLinks=true either
capitalisation can be used in links. There are, however, other reasons the
correct title can't be used (e.g. the programming languages "C++" and "C#")
which need more care to make sure people know how to link to the article.

This seems a neat patch for those easy cases though - although it should check
for $wgCapitalLinks=false at some point (e.g. "if( $mw->matchAndRemove( $text )
&& $wgCapitalLinks ) ..."), else there could be pages where the displayed title
*didn't* match the text needed for links. [Obviously, it would be silly to use
LCFIRST with that option set, but the user might not realise that]
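
To illustrate why the guard matters, here is a small Python model of the two modes. The helper names `resolve_link` and `display_title` are hypothetical, and the link rule (uppercase only the first letter, only in capital-links mode) is a simplification of what MediaWiki does.

```python
def resolve_link(link_text: str, capital_links: bool) -> str:
    """Model how a link maps to a stored title (assumed rule: only the
    first letter is uppercased, and only in capital-links mode)."""
    if capital_links:
        return link_text[:1].upper() + link_text[1:]
    return link_text


def display_title(stored: str, has_lcfirst: bool, capital_links: bool) -> str:
    """Lowercase the first letter only when LCFIRST is present AND
    capital-links mode is on, as the comment suggests."""
    if has_lcfirst and capital_links:
        return stored[:1].lower() + stored[1:]
    return stored


# With the guard, the displayed title always links back to the stored title.
for mode in (True, False):
    shown = display_title("IPod", has_lcfirst=True, capital_links=mode)
    print(mode, shown, resolve_link(shown, mode) == "IPod")
```

Without the `capital_links` check, the page would show "iPod" in the non-capital mode, and a link written that way would point at a different page than "IPod".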

Created attachment 519
lcfirst patch

Changed - it now has no effect in mode $wgCapitalLinks = false, as it should.

A whole mode [[Title:C++]] and so on really would be better...

attachment lcfirst.patch ignored as obsolete

Patch has errors - the title isn't changed if the page comes from cache

avarab wrote:

Just because the patch has problems it doesn't make the bug INVALID, REOPENING.

Created attachment 521
Patch for new function "#title RealTitle"

I changed the thing completely. Bug 496 mentions that there are many more
problems than just a lowercase first letter. So I developed a patch for
the following syntax:

#title RealTitle

Here everything after "#title " up to the end of the line is the new "RealTitle"
(like "C++"). The original article title doesn't change, but in mode action=view
the RealTitle is shown instead of the original title. The subtitle of the
page carries the hint "Original title: Title".

So for example in http://en.wikipedia.org/wiki/C_Plus_Plus you add a
new line "#TITLE C++" wherever you want. The displayed title in action=view is
then "C++" with the subtitle "Original title: C Plus Plus".

Note that this only affects display. You can't link [[C++]]; for that you
would have to redesign something in the database tables. So it's nothing more
than the "wrongtitle" hint built in more prettily.
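
A rough Python sketch of that syntax, for readers who want to see the idea in code. It is not the attached patch: the function name `apply_real_title` is hypothetical, and matching the keyword case-insensitively is an assumption made because the comment writes both "#title" and "#TITLE".

```python
import re
from typing import Optional, Tuple

# Assumption: the directive sits on a line of its own and is matched
# case-insensitively.
TITLE_LINE = re.compile(r"^#title (.+)$", re.IGNORECASE | re.MULTILINE)


def apply_real_title(stored_title: str, text: str) -> Tuple[str, Optional[str], str]:
    """Return (heading, subtitle, remaining_text) for action=view."""
    match = TITLE_LINE.search(text)
    if match is None:
        return stored_title, None, text
    real_title = match.group(1).strip()
    # Drop the directive line from the rendered text.
    remaining = TITLE_LINE.sub("", text, count=1)
    return real_title, "Original title: " + stored_title, remaining


heading, subtitle, _ = apply_real_title("C Plus Plus", "Intro\n#TITLE C++\nMore")
print(heading)
print(subtitle)
```

The stored title is only read, never written, which matches the note that links such as [[C++]] would still not work.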

(sorry for bad english)

attachment realtitle.patch ignored as obsolete

Titles are *supposed* to be exactly what you use to link with. That's a deliberate
thing, and breaking that is probably not going to be accepted.

Could you please return this to the original patch?

Created attachment 528
lcfirst-patch

Okay - it's true that the whole title rewrite is not good ;).

So here is the patch for LCFIRST again but now also working when the page
comes from cache.

attachment lcfirst.patch ignored as obsolete

Overall, looks pretty clean and covers the common 'iPod, iMac, eBay' type cases in a way that's simple
and doesn't conflict with title discovery.

A couple relatively minor issues that might be considered:

For pages in a namespace, this lowercases the initial letter in the namespace (eg 'talk:IPod').

Unlike TOC, NOTOC, and language/category links, it doesn't take effect in the edit preview.

It takes effect on old page views, but not on diff displays where we also show a full rendered
revision.
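
The namespace issue from the list above can be shown with a short Python model. The function names are hypothetical, and treating everything before the first colon as a namespace is a simplifying assumption (MediaWiki checks the prefix against its known namespaces).

```python
def lcfirst(text: str) -> str:
    """Lowercase only the first character."""
    return text[:1].lower() + text[1:]


def lcfirst_naive(full_title: str) -> str:
    """Lowercases the first letter of the whole string, so a namespace
    prefix is hit instead of the page name."""
    return lcfirst(full_title)


def lcfirst_page_name(full_title: str) -> str:
    """Lowercase the first letter of the page name, leaving the
    namespace prefix untouched (assumed: prefix ends at first colon)."""
    namespace, sep, page = full_title.partition(":")
    if not sep:
        return lcfirst(full_title)  # no namespace prefix present
    return namespace + sep + lcfirst(page)


print(lcfirst_naive("Talk:IPod"))
print(lcfirst_page_name("Talk:IPod"))
```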

Created attachment 540
lcfirst patch

I changed this patch, so now it works in every namespace and in preview.

I changed the way it works, so now the Title object is used, as it should be
when done properly - and that was the solution for namespaces/preview.

For diffs it's not possible, because the title is created by the
DifferenceEngine and the current version is not always shown (e.g. when showing
a diff on preview). DifferenceEngine and Parser are completely divided, and it's
not possible to use the parsed article to change the title without dirty hacks
or a lot of rewriting (e.g. there could be two titles displayed on a diff when
two different articles are compared...).

attachment lcfirst.patch ignored as obsolete

Created attachment 541
lcfirst patch

Again a revision because of caching.

I had to add a boolean to the ParserOutput object, so that when a page comes
from cache the output can still make the title lowercase if this boolean is
true (the title itself isn't cached).
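
The caching fix can be pictured roughly like this. The sketch is a Python model with hypothetical names (`CachedOutput`, `view`, a plain dict as the cache); it is not MediaWiki's ParserOutput class, only an illustration of storing the flag next to the cached HTML.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class CachedOutput:
    html: str
    lcfirst: bool = False  # the extra boolean stored with the output


cache: Dict[str, CachedOutput] = {}


def parse(text: str) -> CachedOutput:
    """Stand-in for the parser: strip the marker and record the flag."""
    found = "LCFIRST" in text
    return CachedOutput(html=text.replace("LCFIRST", "", 1).strip(), lcfirst=found)


def view(stored_title: str, text: str) -> Tuple[str, str]:
    """Render a page, using the cache when possible."""
    output = cache.get(stored_title)
    if output is None:
        output = parse(text)
        cache[stored_title] = output
    # The heading is derived on every view from the flag, because the
    # title itself is not part of the cached output.
    heading = stored_title[:1].lower() + stored_title[1:] if output.lcfirst else stored_title
    return heading, output.html


print(view("IPod", "LCFIRST A music player."))
print(view("IPod", "this text is ignored on a cache hit"))
```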

avarab wrote:

It doesn't show the title lowercased when viewing the history page or the
associated talk page; that's pretty much a non-issue, however, since this is
meant to be an aesthetic change.

Committed to HEAD.

I've reverted the commit. Title objects should be treated as value objects, so any Title object for a given title will always
return the same results. Poking around with the internals of one sounds like trouble.
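
As an illustration of the value-object point, here is a Python sketch using an immutable dataclass. The class below is a hypothetical stand-in, not MediaWiki's Title class; the idea shown is that display tweaks live outside the object, so two instances for the same title can never disagree.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Title:
    namespace: str
    text: str


def heading_for(title: Title, lcfirst: bool) -> str:
    """Compute the heading without touching the Title itself."""
    return title.text[:1].lower() + title.text[1:] if lcfirst else title.text


a = Title("", "IPod")
b = Title("", "IPod")
print(a == b)                 # equal by value
print(heading_for(a, True))   # display variant derived separately

try:
    a.text = "iPod"           # mutating the value object is rejected
except FrozenInstanceError:
    print("Title is immutable")
```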

rowan.collins wrote:

(In reply to comment #7)

Titles are *supposed* to be exactly what you use to link with. That's a deliberate
thing, and breaking that is probably not going to be accepted.

What's more, any patch to do that should be discussed on the bug discussing
that: bug 496, which already includes discussion of how to aid "title discovery"
as Brion so aptly puts it.

gangleri wrote:

Hallo!

Regarding the comments posted until now: Please provide some permanent links
regarding tests with the patches.

I want to mention a solution here that would make it possible to generate pages
that look as if their titles start with a lowercase letter.

a) Assume that {{lowercase}} is a protected page containing only the
Unicode Character ZERO WIDTH SPACE - U+200B
http://www.fileformat.info/info/unicode/char/200b/index.htm
HTML Entity (decimal) &#8203; (hex) &#x200b;
UTF-8 (hex) 0xE2 0x80 0x8B (e2808b) %E2%80%8B %e2%80%8b
Character.getDirectionality() DIRECTIONALITY_BOUNDARY_NEUTRAL [9]

see [[wikipedia:yi:template:lowercase]]

b) Assume that it will be allowed to use *one* ZERO WIDTH SPACE as first
character in the main namespace and in the talk namespace.

see bug 337 comment 22

Then there would be a way to generate page names which "display" with a
lowercase first character.

see [[wikipedia:yi:category:Bugzilla/00496]] and
http://yi.wikipedia.org/w/index.php?title=project:Bugzilla/00496&oldid=7862

c) This method does not cover all requirements from bug 496:
It is not possible to generate titles starting with "#", titles with several
consecutive "/" (unless there is no restriction on how many ZERO WIDTH SPACEs
are allowed in the title), or titles with other restricted characters.

more examples are available at http://test.leuksman.com/view/Category:Bugzilla/00337
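
The encodings listed under a) and the reason the trick sidesteps capitalization can be checked with a few lines of Python. This only models "uppercase the first character" with Python's own string methods; how MediaWiki itself treats a leading ZERO WIDTH SPACE is not reproduced here.

```python
from urllib.parse import quote

ZWSP = "\u200b"  # ZERO WIDTH SPACE

print(ZWSP.encode("utf-8").hex())  # UTF-8 bytes as hex
print(quote(ZWSP))                 # percent-encoded form used in URLs

# The first character of the title is now the invisible ZWSP, which has
# no uppercase form, so "uppercase the first character" leaves the
# visible lowercase "e" alone.
title = ZWSP + "eBay"
capitalised = title[:1].upper() + title[1:]
print(capitalised == title)
```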

best regards reinhardt [[user:gangleri]]

dbenbenn wrote:

This seems to me like the wrong way to solve the problem. Wiki code in the page
text, morally speaking, shouldn't affect the page title. Here is, I think, a
better solution (though without a patch):

Add a new database field. In addition to the canonical page title, add a
"display page title". The display page title would be required to have the
property that, when canonicalized, it becomes the canonical page title. For
example, you could have

Canonical     Display
EBay          eBay
FILE ID.DIZ   FILE_ID.DIZ

Naturally, the idea would be to simply use the display page title as the title
at the top of the article. The canonical page title would still be used for
everything else, for example to link to the page.

Finally, it would be necessary to add some support to Special:Movepage. I think
a check box, like "Allow underscores and a lowercase initial in this title",
that would cause the typed title to be stored literally as the display page
title. Then if you wanted to make an article title lowercase, you would
naturally use the "move" tab, lowercase the initial in the field, and check the
box. Note that "moving" EBay to eBay wouldn't actually move the page; it would
''only'' change the display title field.

Doesn't this seem like a more natural solution? (It also fixes the _ problem.)
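
A minimal Python sketch of this proposal, under stated assumptions: canonicalization is modelled as "underscores become spaces, first letter uppercased", and the `Page` class with its `set_display_title` method is hypothetical (the comment proposes a database field and a Special:Movepage check box, without a patch).

```python
def canonicalize(title: str) -> str:
    """Assumed canonicalization: underscores to spaces, uppercase first letter."""
    spaced = title.replace("_", " ")
    return spaced[:1].upper() + spaced[1:]


class Page:
    def __init__(self, canonical: str) -> None:
        self.canonical = canonical  # used for links, history, etc.
        self.display = canonical    # shown at the top of the article

    def set_display_title(self, proposed: str) -> None:
        """Accept a display title only if it canonicalizes to the page's
        canonical title; the page itself is never moved."""
        if canonicalize(proposed) != self.canonical:
            raise ValueError(
                "%r does not canonicalize to %r" % (proposed, self.canonical)
            )
        self.display = proposed


ebay = Page("EBay")
ebay.set_display_title("eBay")
print(ebay.canonical, "->", ebay.display)

diz = Page("FILE ID.DIZ")
diz.set_display_title("FILE_ID.DIZ")
print(diz.canonical, "->", diz.display)
```

A request such as `set_display_title("Ebay")` on the first page would be rejected, because it canonicalizes to a different title.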

dbenbenn wrote:

Another example this idea could fix would be [[:Of The Wand & The Moon:]], which
should display as

:Of The Wand & The Moon:

gangleri wrote:

(In reply to comment #17)

Another example this idea could fix would be [[:Of The Wand & The Moon:]], which
should display as

:Of The Wand & The Moon:

correct link is
http://en.wikipedia.org/wiki/Of_The_Wand_%26_The_Moon:

hack is
http://en.wikipedia.org/wiki/%E2%80%8B:Of_The_Wand_%26_The_Moon:

gangleri wrote:

Oops! (In reply to comment #16)

Canonical     Display
EBay          eBay
FILE ID.DIZ   FILE_ID.DIZ

This is a much better way. Besides allowing other reserved / invalid characters
in page titles, it would also allow "#" in page titles, which is not possible
with the hack from comment 15.

best regards reinhardt [[user:gangleri]]

That would break title identity, which is an important part of the wiki system.

gangleri wrote:

*note*

The implementation of this bug would help to render some BiDi titles in a
unified fashion.
Remember these typical examples:
"foo (bar)" will render as "(foo (bar" at RTL wikis
"פֿאָ (באַר)‏" will render as "פֿאָ (באַר)" at LTR wikis

Please note also that the category looking optically as
"commons:‎ ​‏"באַניצער
is using a trailing RLM ‏
http://yi.wiktionary.org/wiki/category:%D7%91%D7%90%D6%B7%D7%A0%D7%99%D7%A6%D7%A2%D7%A8_commons:%E2%80%8E
and
"meta:‎ ​‏"באַניצער
is using a trailing RLM ‏
http://yi.wiktionary.org/wiki/category:%D7%91%D7%90%D6%B7%D7%A0%D7%99%D7%A6%D7%A2%D7%A8_meta:%E2%80%8E

Please take a look at the four links to user pages from
[[wikt:yi:project:bugzilla/02118]] and follow these links. Please pay attention
to how the "user:foo" / "באַניצער:foo" links render in the Babel templates at
these user pages.

Basically there are two types of templates used

  • templates with user links to LTR wikis as [[wikt:yi:template:user_meta]]
  • templates with user links to RTL wikis as [[wikt:yi:template:user_wikipedia]]

*how does this relate to this bug*
If bug 2118 were implemented

  • the categories would not need the trailing RLM
  • the user names would not need the trailing RLM;
    • last requirement is relevant for single login (bug 57)

best regards reinhardt [[user:gangleri]]

gangleri wrote:

(In reply to comment #18)

(In reply to comment #17)

Another example this idea could fix would be [[:Of The Wand & The Moon:]], which
should display as

:Of The Wand & The Moon:

Escaping the trailing ":" with "%3A" (bug 4581):

correct link is
http://en.wikipedia.org/wiki/Of_The_Wand_%26_The_Moon%3A

hack is
http://en.wikipedia.org/wiki/%E2%80%8B:Of_The_Wand_%26_The_Moon%3A
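
For reference, the two URL forms of the page name quoted in this thread (trailing ":" left as-is versus escaped as "%3A") can be reproduced with Python's standard `urllib.parse.quote`. The variable names are illustrative only.

```python
from urllib.parse import quote

page = "Of_The_Wand_&_The_Moon:"

# Keep ":" literal, escape "&" only.
plain_colon = quote(page, safe=":")
# Escape every reserved character, including the trailing ":".
escaped_colon = quote(page, safe="")

print("http://en.wikipedia.org/wiki/" + plain_colon)
print("http://en.wikipedia.org/wiki/" + escaped_colon)
```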

dbenbenn wrote:

(In reply to comment #20)

That would break title identity, which is an important part of the wiki system.

Are you referring to my proposal to add a new "display page title" field?
In what way would it break title identity? Note that the following requirement
would be enforced: the canonicalization of the display page title would have to
yield the actual page title. (In other words, putting the display title in
[[square brackets]] gives the same link as using the real title.) So "eBay" for
"EBay", "FILE_ID.DIZ" for "FILE ID.DIZ", ":Of The Wand & The Moon:" for
"Of The Wand & The Moon:"

(In reply to comment #19)

This is a much better way. Besides allowing other reserved / invalid characters
in page titles, it would also allow "#" in page titles, which is not possible
with the hack from comment 15.

Possibly. It would help for [[Family Guy Viewer Mail]], which could have the
display title [[Family Guy Viewer Mail #1]]. But it wouldn't help for [[C Sharp]].
For that article, you'd want the display title to be [[C#]], but that title
canonicalizes to [[C]].
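
The cases above can be traced with a small Python model of link canonicalization. The rules in `canonicalize` (drop everything from "#", drop one leading colon, underscores to spaces, uppercase first letter) are assumptions chosen to match the examples in this comment, not MediaWiki's actual title code.

```python
def canonicalize(link_text: str) -> str:
    """Assumed model of how text in [[square brackets]] resolves."""
    target = link_text.split("#", 1)[0]  # "#" starts a section anchor
    if target.startswith(":"):
        target = target[1:]              # a leading colon is dropped
    target = target.replace("_", " ").strip()
    return target[:1].upper() + target[1:]


def is_valid_display(display: str, canonical: str) -> bool:
    """The enforced requirement: display must canonicalize to canonical."""
    return canonicalize(display) == canonical


print(is_valid_display("eBay", "EBay"))
print(is_valid_display(":Of The Wand & The Moon:", "Of The Wand & The Moon:"))
print(is_valid_display("Family Guy Viewer Mail #1", "Family Guy Viewer Mail"))
print(is_valid_display("C#", "C Sharp"))  # "C#" resolves to "C", so this fails
```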

Wiki.Melancholie wrote:

As long as there is no PHP solution for this, a little JavaScript (Monobook.js)
could simulate the same effect! See [[als:openURL]] and [[als:sed]] for example.

  • Best regards, ~~~~

Wiki.Melancholie wrote:

Just for your information: The JavaScript from comment #24 has been optimised a
bit. To see how smoothly it works now, see [[:als:openURL]].

robchur wrote:

Use {{DISPLAYTITLE}}.

Diffusion changed the task status from Declined to Resolved by committing Unknown Object (Diffusion Commit). Mar 4 2015, 8:25 AM
Diffusion added a commit: Unknown Object (Diffusion Commit).
demon changed the task status from Resolved to Declined. Mar 4 2015, 8:39 AM
demon claimed this task.