
Use a HTTP 301/302 status code for redirects
Closed, Duplicate (Public)

Description

[14:53] <Emufarmers> Anyway, can somebody point me to a specific bug or remind me of the reasons for why we don't use 301 redirects for redirects? The best I was able to come up with was [url]https://bugzilla.wikimedia.org/show_bug.cgi?id=2585#c3[/url]
[14:54] <werdna> Emufarmers: because otherwise we can't display the "redirected from" message without fudging in a URL parameter
[14:54] <werdna> (and therefore messing up linking)
[14:54] <Emufarmers> Ah
[14:54] <Emufarmers> Is the stuff in that comment still true too?
[14:59] <werdna> looks roughly right
[15:00] <werdna> although the extra server load would be trivial once all links were canonicalised in the browser window
[15:01] <Emufarmers> But wouldn't displaying the "redirected from" still require going around the cache?
[15:02] <werdna> yes, or a URL parameter
[15:02] <werdna> could be done with cookies and XVO
[15:02] <Emufarmers> XVO?
[15:02] <werdna> X-Vary-Options
[15:04] <Emufarmers> Of course, breaking caching isn't necessarily that big of a deal on the average non-WMF wiki

Not sure whether this is worth implementing, but I see people asking for it every now and then, so I figure we should at least have a bug we can point them at.


Version: unspecified
Severity: enhancement
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=35045

Details

Reference
bz18883

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 21 2014, 10:43 PM
bzimport set Reference to bz18883.
bzimport added a subscriber: Unknown Object (MLST).

river wrote:

when reporting bugs, please try to describe the actual bug using words instead of making everyone work it out from an IRC log.

OK, sorry. Some people are worried about problems with Google over duplicate content, since wiki redirects don't actually change the URL (clearly this can't be that big of a deal, since Wikipedia is doing alright).

I think that bug is about interwiki redirects.

Changing the summary to include 302s, since they might be more appropriate.

Redirects may dilute pagerank slightly, but they increase relevance, because they introduce keywords from the redirect title into the URL and <title> element, both locations which Google gives a high weight. You could argue that keyword stuffing via multiple near-identical pages is an abuse of the search engines, but they don't seem to have objected so far despite the fact that we've always done this. And Wikipedia doesn't do too badly in the search engines, generally speaking.

I don't know why you think URL parameters or cookies or XVO are relevant; the only reasonable solution anyone's come up with in the past is to use the Referer header. That's not something we're interested in doing.

overlordq wrote:

*** Bug 22271 has been marked as a duplicate of this bug. ***

sorin.sbarnea wrote:

If I remember correctly, this bug tracker is for the MediaWiki software and not for Wikipedia's deployment of it.

MediaWiki is used on a huge number of websites, and for them being search-engine friendly is much harder than it is for Wikipedia. They cannot say "we don't care, because we have so many links to us that this will not hurt; also, as a top-10 website, Google will make sure that we are not punished".

This bug is not about asking to add 3xx redirects to Wikipedia - in fact it is just asking to add support for HTTP status codes to MediaWiki. This is about HTTP protocol compliance.

This should be implemented in a way that does not change the current behavior. People interested in it should be able to add a configuration setting like $wgEnableHttpRedirects = true.
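
As a rough illustration of the opt-in behavior proposed here, the sketch below models the decision in Python rather than in MediaWiki's PHP. The flag mirrors the suggested $wgEnableHttpRedirects setting; the `respond` function, the `REDIRECTS` table, and the `/wiki/` path prefix are hypothetical names chosen for the example and are not part of MediaWiki.

```python
from typing import Dict, Tuple

# Hypothetical redirect table: redirect title -> target title.
REDIRECTS: Dict[str, str] = {"This_redirect": "That_page"}


def respond(title: str, enable_http_redirects: bool = False,
            permanent: bool = True) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a request to /wiki/<title>."""
    target = REDIRECTS.get(title)
    if target is None or not enable_http_redirects:
        # Current behavior: answer 200 at the requested URL; for a redirect
        # the target's content is rendered with a "redirected from" notice.
        return 200, {"Content-Type": "text/html; charset=utf-8"}
    # Opt-in behavior: send the client to the target's own URL.
    status = 301 if permanent else 302
    return status, {"Location": "/wiki/" + target}


print(respond("This_redirect"))
print(respond("This_redirect", enable_http_redirects=True))
```

With the flag off, nothing changes for existing wikis; with it on, the "redirected from" notice is lost unless it is carried some other way, which is the trade-off discussed earlier in this thread.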

There is already a patch that hacks MediaWiki into doing HTTP redirects, and it should be *easy* and *safe* to add this feature. Existing patch: http://www.sumbytes.com/mediawiki-301-redirects/ (this one does not use a configuration parameter)

I'm sure that if the MediaWiki team is willing to accept this request, I or somebody else will take the time to write a patch that adds this functionality.

The real question now is whether there is willingness to reopen this issue and accept a patch from the community.

Re-opening this for further consideration. Comment #7 makes a reasonable point about Wikipedia and its Google ranking, namely that Google has special-cased Wikipedia in a variety of ways, making any comparisons faulty.

An option to allow for standards compliance seems reasonable, at a minimum, though changing the default behavior does as well, given that I don't see a particularly good reason to buck the (reasonable) standard.

prime38 wrote:

The option should be to break compliance; put another way, the default should be to actually redirect, and the option should let you turn that off.

MediaWiki already uses the <link rel="canonical"> tag to identify 'duplicate' pages by URL.
As far as I know Google is aware of this and makes use of that.
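
For readers unfamiliar with the tag, here is a minimal sketch of the element a redirect page's <head> would carry. The `canonical_link` helper is hypothetical and written in Python for illustration only; the URL is the example address used elsewhere in this thread, and the exact markup MediaWiki emits may differ.

```python
from html import escape


def canonical_link(target_url: str) -> str:
    """Build a <link rel="canonical"> element pointing at the target page."""
    # Escape the URL so it is safe inside a double-quoted HTML attribute.
    return '<link rel="canonical" href="{}"/>'.format(escape(target_url, quote=True))


# Served at http://mywiki.com/This_redirect, pointing crawlers at the target:
print(canonical_link("http://mywiki.com/That_page"))
```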

On another note, it's possible to use JavaScript to display the redirected-from message by putting the redirect title in the URL fragment (hash).

i.e.

http://mywiki.com/This_redirect
goes to
http://mywiki.com/That_page#r=This_redirect
and javascript can grab that and display a message.

PS: I'm thinking now that hash-tagging could cause problems where a title redirects to a certain anchor on the page (i.e. "This_redirect" could be a redirect to "That_page#History", in which case another hashtag would break that).

Unless there's a way to keep jump-to-anchor intact, I think we can forget my above idea.
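
To make the idea and its limitation concrete, here is a small sketch of the fragment handling. In practice the reading side would run as JavaScript in the browser; the logic is written in Python here purely for illustration. The function names and the `r=` marker (taken from the example URLs above) are hypothetical.

```python
from typing import Optional
from urllib.parse import quote, unquote, urlsplit

PREFIX = "r="  # hypothetical marker, as in the example URL above


def add_redirect_marker(target_url: str, redirect_title: str) -> str:
    """Append '#r=<title>' to the target URL, refusing if an anchor exists."""
    if urlsplit(target_url).fragment:
        # A URL carries only one fragment, so the marker would displace
        # the anchor (e.g. #History) and break jump-to-anchor.
        raise ValueError("target already has an anchor")
    return target_url + "#" + PREFIX + quote(redirect_title, safe="")


def read_redirect_marker(url: str) -> Optional[str]:
    """Return the redirect title stored in the fragment, if there is one."""
    fragment = urlsplit(url).fragment
    if fragment.startswith(PREFIX):
        return unquote(fragment[len(PREFIX):])
    return None


url = add_redirect_marker("http://mywiki.com/That_page", "This_redirect")
print(url, read_redirect_marker(url))
```

Calling `add_redirect_marker("http://mywiki.com/That_page#History", "This_redirect")` raises `ValueError`, which is exactly the conflict described in the PS.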

(In reply to comment #7)

This bug is not about asking to add 3xx redirects to Wikipedia - in fact it is
just asking to add support for HTTP status codes to MediaWiki. This is about
HTTP protocol compliance.

(In reply to comment #8)

An option to allow for standards compliance seems reasonable, at a minimum,
though changing the default behavior does as well, given that I don't see a
particularly good reason to buck the (reasonable) standard.

(In reply to comment #9)

The option should be to break compliance, or said otherwise, the default should
be for it to actually redirect and the option allows you to stop that.

We already comply with standards. There's nothing in the HTTP specification which prohibits the sending of similar pages with different URLs. We use a different URL to support the "redirected from" link. This system is a robust, client-independent and standards-compliant method for implementing that feature.

*** Bug 28410 has been marked as a duplicate of this bug. ***

hpvpp wrote:

Ah, but computers are supposed to make life easier for people and what there is now is confusing. For example, if I want to email a redirected article, I do "<Alt>File>Send link" and get the address of the redirect page instead of the address of the page I see displayed. I suggest that user-concerns should weigh more heavily than programmer-concerns.

(In reply to comment #14)

Ah, but computers are supposed to make life easier for people and what there is
now is confusing. For example, if I want to email a redirected article, I do
"<Alt>File>Send link" and get the address of the redirect page instead of the
address of the page I see displayed. I suggest that user-concerns should weigh
more heavily than programmer-concerns.

Err, why does it matter which URL you send to someone? The link still reaches the content, doesn't it?

If you want to send someone the URL of the target page instead of the redirect, you should navigate to the target page yourself first (which is easy enough to do by clicking the "article" tab).

hpvpp wrote:

(In reply to comment #15)

(In reply to comment #14)

Ah, but computers are supposed to make life easier for people and what there is
now is confusing. For example, if I want to email a redirected article, I do
"<Alt>File>Send link" and get the address of the redirect page instead of the
address of the page I see displayed. I suggest that user-concerns should weigh
more heavily than programmer-concerns.

Err, why does it matter which URL you send to someone? The link still reaches
the content, doesn't it?

It matters when a page is created in order to acknowledge a concept. While the article is not yet written, the page redirects to a related concept. Imagine, if you will, somebody sending the related article to a friend who is busy and only looks at it some time later, when the redirecting page has meanwhile received some proper content. The two friends might then happily discuss two very different concepts without being aware of it.

If you want to send someone the URL of the target page instead of the redirect,
you should navigate to the target page yourself first (which is easy enough to
do by clicking the "article" tab).

As I said: computers are there to make life easier for us, not the other way around.

(In reply to comment #16)

Imagine, if you will, somebody sending the related article to a friend who is
busy and only looks at it some time later when meanwhile the redirecting page
has received some proper content. The two friends might then happily discuss
two very different concepts without being aware of it.

Well, this is the nature of the beast. Content can and will change as time passes. But the answer to that is to use permalinks (referencing the numerical revision ID of the content) rather than a specific page title (redirect or not). A link to a permanent revision ID is available from the sidebar. Using it, you can be assured that you and your friend discuss the same content (barring template changes, image changes, and/or the revision being deleted, among other things).
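
As a small illustration of what such a permalink looks like, the sketch below builds one from a revision ID using MediaWiki's `oldid` URL parameter. The `permalink` helper, the script URL, and the revision number are hypothetical values chosen for the example.

```python
from urllib.parse import urlencode


def permalink(script_url: str, revision_id: int) -> str:
    """Build a link to one fixed revision rather than to a page title."""
    # The revision ID alone identifies the content, so the link is
    # unaffected by later edits or by a redirect changing its target.
    return script_url + "?" + urlencode({"oldid": revision_id})


print(permalink("http://mywiki.com/index.php", 12345))
```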

hpvpp wrote:

(In reply to comment #17)

(In reply to comment #16)

Imagine, if you will, somebody sending the related article to a friend who is
busy and only looks at it some time later when meanwhile the redirecting page
has received some proper content. The two friends might then happily discuss
two very different concepts without being aware of it.

Well, this is the nature of the beast. Content can and will change as time
passes. But the answer to that is to use permalinks (referencing the numerical
revision ID of the content) rather than a specific page title (redirect or
not). A link to a permanent revision ID is available from the sidebar. Using
it, you can be assured that you and your friend discuss the same content
(barring template changes, image changes, and/or the revision being deleted,
among other things).

Really, well excuse me for being dense, but I was under the distinct impression that Wikipedia was to serve as (i) a general point of reference and (ii) a repository of current thinking. Using permalinks makes the information dated and potentially misleading. Furthermore, having to jump through hoops makes Wikipedia certainly less than user-friendly.

If it is too hard, say so and post appropriate warnings on the relevant pages; otherwise re-open this bug and hope that one day somebody comes up with a solution.

(In reply to comment #18)

Really, well excuse me for being dense, but I was under the distinct impression
that Wikipedia was to serve as (i) a general point of reference and (ii) a
repository of current thinking. Using permalinks makes the information dated
and potentially misleading. Furthermore, having to jump through hoops makes
Wikipedia certainly less than user-friendly.

Certainly, you're excused. Before, you were complaining about the content at the URL potentially changing with the passage of time. This is a real concern, as redirects are regularly expanded or have their target changed or whatever else.
Now your complaint has morphed into a concern that the content might be stale or outdated when using a permalink. This seems like a "no-win situation."

You don't have to use permalinks, but if you want a greater guarantee that the page content will be roughly similar to what it was several months ago, they're your best bet. Otherwise, you can use a redirect or a non-redirect link. Presumably if you've ended up with a redirect link in your browser URL bar, it was due to you using the redirect yourself. If it's good enough for you, it's probably good enough for your friend as well.

If it is too hard, say so and post appropriate warnings on the relevant pages;
otherwise re-open this bug and hope that one day somebody comes up with a
solution.

It isn't too hard to implement 301/302s, but it comes at a cost that others have seen as too great to make it worthwhile. If the software didn't implement redirects at all, I think there would be a case for posting appropriate warnings (wherever). As it is, it seems like you're discussing a non-issue.

hpvpp wrote:

(In reply to comment #19)

(In reply to comment #18)

Okay then. I try, but each time I find Wikipedia is bigger than I thought. Thanks.

(While this is a completely different solution from a technical point of view, what is proposed as bug 35045 should provide most of the features of "real" redirects without actually using them.)