
When redirecting to a section, set the page name in the URL correctly
Closed, ResolvedPublic

Description

Author: zeratul976

Description:
Example: Search Wikipedia for "Malia Obama". You get taken to the "Malia and Sasha Obama" section of the page "Family of Barack Obama". However, the URL is "http://en.wikipedia.org/wiki/Malia_Obama#Malia_and_Sasha_Obama".

This is misleading because it might lead you to believe that you are on a page called "Malia Obama" which has a subsection named "Malia and Sasha Obama".

Worse, suppose you want to see what the title of the article to which you have been redirected is:

  • it's not in the URL
  • it's often hard to see it in the tab if you have many tabs open (and modern browsers have done away with the title bar)
  • so now you're left with scrolling up to the top of the page just to find out what its title is

To fix this, the URL should be rewritten to contain the name of the article to which you have been redirected, e.g. "http://en.wikipedia.org/wiki/Family_of_Barack_Obama#Malia_and_Sasha_Obama".


Version: 1.22.0
Severity: enhancement

Details

Reference
bz48028

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 1:34 AM
bzimport set Reference to bz48028.
bzimport added a subscriber: Unknown Object (MLST).

mr.heat wrote:

From what I hear a new search is in the making.[1] Maybe this feature request should be assigned to the CirrusSearch team? It should be a breeze to implement:

  1. User enters a search term.
  2. Special:Search checks if an article with this name exists.
  3. [New] Check if it's a redirect. If it is, do a 301 to the redirect target.
  4. Else do a 301 to the article.

I must admit I'm not sure what happens to #anchors if you do this. Personally I consider this less important than a clean URL. A possible solution to this is to skip step 3 if the redirect contains an #anchor.

[1]https://www.mediawiki.org/wiki/Extension:CirrusSearch
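
A minimal sketch of the four steps above, not MediaWiki code. The `pages` map, `titleToPath`, `resolveSearchTerm` and the `skipAnchoredRedirects` option are hypothetical names introduced for illustration; the option models the suggestion to skip step 3 when the redirect contains an #anchor. How user agents treat the anchor in a 301 is left open here, as it is in the comment.

```javascript
// Hypothetical in-memory stand-in for the wiki's page table.
// redirectTo is null for a regular article, else "Title" or "Title#Section".
const pages = new Map([
  ["Family of Barack Obama", { redirectTo: null }],
  ["Malia Obama", { redirectTo: "Family of Barack Obama#Malia and Sasha Obama" }],
  ["Tan An", { redirectTo: "Tân An" }],
  ["Tân An", { redirectTo: null }],
]);

// Turn a page title into a /wiki/ path (spaces become underscores).
function titleToPath(title) {
  return "/wiki/" + encodeURI(title.replace(/ /g, "_"));
}

// Returns { status, location } for a search term, or null if no page
// with that exact name exists (the caller would then show search results).
function resolveSearchTerm(term, { skipAnchoredRedirects = false } = {}) {
  const page = pages.get(term); // step 2: does an article with this name exist?
  if (!page) return null;

  const target = page.redirectTo;
  const hasAnchor = target !== null && target.includes("#");
  if (target !== null && !(skipAnchoredRedirects && hasAnchor)) {
    // step 3: it is a redirect, so send the client to the redirect target
    const [title, section] = target.split("#");
    const hash = section ? "#" + section.replace(/ /g, "_") : "";
    return { status: 301, location: titleToPath(title) + hash };
  }
  // step 4: otherwise send the client to the page itself
  return { status: 301, location: titleToPath(term) };
}
```

With this sketch, `resolveSearchTerm("Malia Obama")` yields a 301 to `/wiki/Family_of_Barack_Obama#Malia_and_Sasha_Obama`, while passing `{ skipAnchoredRedirects: true }` yields `/wiki/Malia_Obama`.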

(In reply to comment #1)

From what I hear a new search is in the making.[1] Maybe this feature request
should be assigned to the CirrusSearch team? It should be a breeze to
implement:

The relevant code is actually in core MediaWiki, in SpecialSearch::goResult(). It is not specific to MWSearch/LuceneSearch or CirrusSearch.

However, I don't see this as a search problem. Rather, MediaWiki should take advantage of recent browser features to change the URL when possible.

As stated in bug 39328:

Instead of using location.hash we should consider using
window.history.replaceState and element.scrollIntoView if they are both
available.

mr.heat wrote:

(In reply to comment #2)

Instead of using location.hash we should consider using
window.history.replaceState and element.scrollIntoView if they are both
available.

From what I understand this bug is not about the hash but about the page name before the hash.

[[Malia Obama]] is a redirect to [[Family of Barack Obama#Malia and Sasha Obama]]. Going to [[Malia Obama]] results in the URL:

http://en.wikipedia.org/wiki/Malia_Obama#Malia_and_Sasha_Obama

Instead it should be:

http://en.wikipedia.org/wiki/Family_of_Barack_Obama#Malia_and_Sasha_Obama

The hash is right. What's wrong is the page name before the hash.

(In reply to comment #3)

(In reply to comment #2)

Instead of using location.hash we should consider using
window.history.replaceState and element.scrollIntoView if they are both
available.

[...]

The hash is right. What's wrong is the page name before the hash.

Correct; that's why I didn't just mark this as a duplicate of bug 39328. [history.replaceState][1] isn't limited to changing the hash; it should suffice for this purpose.

[1]: https://developer.mozilla.org/en-US/docs/Web/Guide/API/DOM/Manipulating_the_browser_history
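
As a rough illustration of the replaceState idea, here is a sketch that rewrites the page name while keeping the hash. The function names `rewrittenUrl` and `applyRedirectRewrite` are hypothetical, the `/wiki/` path layout is assumed from the URLs in this report, and the `win` parameter stands in for the browser's `window` so the logic can be exercised outside a browser. Note that `history.replaceState` only accepts same-origin URLs, which holds here because only the path changes.

```javascript
// Build the URL the address bar should show after a redirect was followed.
// currentHref: the URL the user requested; targetTitle: the title of the
// page that was actually rendered. Query string and hash are kept as-is.
function rewrittenUrl(currentHref, targetTitle) {
  const url = new URL(currentHref);
  url.pathname = "/wiki/" + targetTitle.replace(/ /g, "_");
  return url.href;
}

// Rewrite the address bar if the browser supports it; returns whether it did.
function applyRedirectRewrite(win, targetTitle) {
  if (!win.history || typeof win.history.replaceState !== "function") {
    return false; // no History API: leave the URL alone
  }
  const newUrl = rewrittenUrl(win.location.href, targetTitle);
  win.history.replaceState(null, "", newUrl);

  // replaceState does not scroll, so jump to the section explicitly.
  const id = decodeURIComponent(new URL(newUrl).hash.slice(1));
  const el = id ? win.document.getElementById(id) : null;
  if (el && typeof el.scrollIntoView === "function") {
    el.scrollIntoView();
  }
  return true;
}
```

In a browser this would be called as `applyRedirectRewrite(window, "Family of Barack Obama")` once the server has told the page which title it was redirected to.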

mr.heat wrote:

(In reply to comment #4)

[history.replaceState][1] isn't limited to changing the hash; it should
suffice for this purpose.

I'm still not sure if we are talking about the same thing. A proper redirect must send a 301 HTTP response to search engines, crawlers, bots and all other user agents. I know Google parses JavaScript but I'm sure most other user agents don't. This makes replaceState more of a cosmetic disguise, not a real solution for the problem.

(In reply to comment #5)

[...] This makes replaceState more of a cosmetic disguise, not a real
solution for the problem.

The "Redirected from" link is a basic feature that should be usable even when JavaScript is disabled.

If we were to use a 301 redirect, as explained in bug 18883, "fudging in" a URL parameter would be necessary, and it would create even uglier URLs that are somewhat harder to cache.

For example:

http://en.wikipedia.org/wiki/Malia_Obama#Malia_and_Sasha_Obama

versus

http://en.wikipedia.org/wiki/Family_of_Barack_Obama?r=Malia_Obama#Malia_and_Sasha_Obama

In either case, cleaning up the URL would require replaceState.
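
To make the second case concrete, here is a sketch of how the fudged-in parameter could be split off after a 301. The `r` parameter name is taken from the example URL above and is itself hypothetical; `stripRedirectParam` is a name introduced for illustration.

```javascript
// Split a post-redirect URL into the clean URL and the original title.
// redirectedFrom is null when the URL carries no `r` parameter.
function stripRedirectParam(href) {
  const url = new URL(href);
  const redirectedFrom = url.searchParams.get("r");
  url.searchParams.delete("r");
  return { cleanHref: url.href, redirectedFrom };
}
```

The page would render its "Redirected from" link from `redirectedFrom` and then pass `cleanHref` to `history.replaceState`, so the address bar ends up without the extra parameter.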

(In reply to comment #5)

(In reply to comment #4)

[history.replaceState][1] isn't limited to changing the hash; it should
suffice for this purpose.

I'm still not sure if we are talking about the same thing. A proper redirect
must send a 301 HTTP response to search engines, crawlers, bots and all other
user agents. I know Google parses JavaScript but I'm sure most other user
agents don't. This makes replaceState more of a cosmetic disguise, not a real
solution for the problem.

Google and other crawlers are irrelevant, that's what rel=canonical is for.

mr.heat wrote:

(In reply to comment #7)

Google and other crawlers are irrelevant, that's what rel=canonical is for.

Maybe my comment was confusing. I used crawlers as an example when referring to "all user agents". This includes browsers with JavaScript disabled and every GET request from every external client. Nothing is more compatible than a plain old 301. Both rel=canonical and history.replaceState are workarounds for problems caused by not doing a real redirect.

Real redirects have the potential to actually save bandwidth. Currently MediaWiki delivers the same duplicate content multiple times even if the user agent is not interested in getting duplicate content. If a rel=canonical is present the content is wasted.

Real redirects are less confusing, as explained above.

(In reply to comment #6)

The "Redirected from" link is a basic feature that should be usable even when
JavaScript is disabled.

The feature doesn't have much use. As long as a redirect makes sense (which it does in almost all cases, the communities care very much about such details) we don't have to repeat the fact that a redirect happened. Click [[Tan An]] and be redirected to a page or section named "Tân An". This is obvious.

The only use is to be able to go back to the redirect. There are other ways to do this without relying on JavaScript, e.g. using the action=info link in the navigation on the left or adding a list of redirects to the bottom of the edit interface similar to the list of templates. Users with JavaScript enabled can get the old link on the top. It's a rare use case anyway. I don't think it's a problem to make the "redirected from" hint a JavaScript feature as long as it's not the only way.

On the other hand, I consider the confusion and compatibility problems caused by not doing a real redirect more important.

(In reply to comment #8)

The only use is to be able to go back to the redirect. There are other ways to
do this without relying on JavaScript, e.g. using the action=info link in the
navigation on the left or adding a list of redirects to the bottom of the edit
interface similar to the list of templates. Users with JavaScript enabled can
get the old link on the top. It's a rare use case anyway. I don't think it's a
problem to make the "redirected from" hint a JavaScript feature as long as it's
not the only way.

I don't consider a single one of those to be an "alternative way" to get back to the redirect page you were just on. For Tân An, finding the redirect you came from may be relatively easy. But in reality the list of redirects for a page may look like one of these:
https://en.wikipedia.org/w/index.php?title=Special:WhatLinksHere/Tilde&hidelinks=1&hidetrans=1 (32)
https://en.wikipedia.org/w/index.php?title=Special:WhatLinksHere/Taa_language&hidelinks=1&hidetrans=1 (38)
https://en.wikipedia.org/w/index.php?title=Special:WhatLinksHere/List_of_The_Simpsons_episodes&hidelinks=1&hidetrans=1 (39)
https://en.wikipedia.org/w/index.php?title=Special:WhatLinksHere/Cascading_Style_Sheets&hidelinks=1&hidetrans=1 (40)
https://en.wikipedia.org/w/index.php?title=Special:WhatLinksHere/List_of_Doctor_Who_serials&hidelinks=1&hidetrans=1 (51)
https://en.wikipedia.org/w/index.php?title=Special:WhatLinksHere/List_of_The_Angry_Beavers_episodes&hidelinks=1&hidetrans=1 (68)
https://en.wikipedia.org/w/index.php?title=Special:WhatLinksHere/List_of_Yu-Gi-Oh!_characters&hidelinks=1&hidetrans=1 (312)

Sifting through a list of over a dozen titles (in the worst case, hundreds) to find the one you came from is not an "alternative".

(In reply to comment #8)

Users with JavaScript enabled can
get the old link on the top. It's a rare use case anyway. I don't think it's a
problem to make the "redirected from" hint a JavaScript feature as long as it's
not the only way.

Also on this whole notion of using JavaScript... how?

I haven't seen a single suggestion of how "redirected from" could be implemented in JavaScript that wasn't unacceptably unstable, creating a regression in the feature.

  • Cookies are not isolated to the tab so they leak in unacceptable ways to other pages creating bogus "redirected from" links.
  • sessionStorage only works if navigation was done by a link, search, etc... on-wiki and breaks the feature for incoming external links, interwiki links, direct navigation, and on-wiki links created by features not integrated into the sessionStorage handling.
  • Intermediate meta and javascript redirects create history bugs and naturally are not 301 redirects.
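
The sessionStorage limitation can be illustrated with a small sketch. The key name and both functions are hypothetical, and `makeStorage` is a stand-in for the browser's `window.sessionStorage` that mirrors only its `getItem`/`setItem`/`removeItem` methods.

```javascript
const KEY = "redirectedFrom"; // hypothetical storage key

// Minimal stand-in for window.sessionStorage (getItem returns null if absent).
function makeStorage() {
  const items = new Map();
  return {
    setItem: (k, v) => { items.set(k, String(v)); },
    getItem: (k) => (items.has(k) ? items.get(k) : null),
    removeItem: (k) => { items.delete(k); },
  };
}

// Would run when an on-wiki link or search leads to a redirect page.
function rememberRedirectSource(storage, sourceTitle) {
  storage.setItem(KEY, sourceTitle);
}

// Would run on the target page; returns the title for the
// "Redirected from" link, or null if nothing was recorded.
function consumeRedirectSource(storage) {
  const source = storage.getItem(KEY);
  storage.removeItem(KEY);
  return source;
}
```

A visitor arriving from an external link never passes through `rememberRedirectSource`, so `consumeRedirectSource` returns null and the "Redirected from" link cannot be shown, which is the regression described in the second bullet.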

Change 143852 had a related patch set uploaded by Bartosz Dziewoński:
On redirect, actually rewrite page URL to the target one with JavaScript

https://gerrit.wikimedia.org/r/143852

*** This bug has been marked as a duplicate of bug 35045 ***