
Auto-redirect when URLs have trailing punctuation added/removed by auto-linkers
Open, Lowest, Public, Feature

Description

Author: morfusmax

Description:
The page http://en.wikipedia.org/wiki/Dustin_Diamond has a link near the end of the filmography for Hamlet A.D.D. Clicking this link takes the browser to the right page. However, when the resulting URL in the address bar -- http://en.wikipedia.org/wiki/Hamlet_A.D.D. -- is pasted as a link elsewhere, e.g. in Gmail or a chat client, following it produces "Wikipedia does not have an article with this exact name."


Version: 1.18.x
Severity: enhancement
URL: http://en.wikipedia.org/wiki/Hamlet_A.D.D.

Details

Reference
bz26556

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 11:24 PM
bzimport set Reference to bz26556.
bzimport added a subscriber: Unknown Object (MLST).

ayg wrote:

This happens because the auto-linking feature of whatever software you're using strips the trailing period. What are we supposed to do here, automatically redirect to a page with a trailing period? That's probably a good idea. We also might want to handle the opposite case, where a question mark or bracket or something is added to the URL by the auto-linking software.

ayg wrote:

(Note that you can work around the specific case by just manually creating [[Hamlet A.D.D]] and redirecting it to [[Hamlet A.D.D.]].)

(In reply to comment #1)

> What are we supposed to do here, automatically redirect to a
> page with a trailing period? That's probably a good idea. We also might want
> to handle the opposite case, where a question mark or bracket or something is
> added to the URL by the auto-linking software.

You're joking, right?

ayg wrote:

No, I'm not. If the user goes to a page that doesn't exist, check for similar-looking page names, and if you find a plausible match do a redirect. You'd probably want to make it like a normal redirect, not an HTTP redirect, because otherwise you'd have trouble creating the page if you want it to contain something different.

What's wrong with this approach? Specifically, given that there's no way to stop these bogus autolinks from existing, do you have a solution with a better cost/benefit ratio? On a website I run, I noticed a lot of 404s were due to bogus trailing punctuation (the opposite problem to the one seen here).
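The lookup described above could be sketched roughly as follows. This is only an illustration in Python, not MediaWiki code: `candidate_titles`, `resolve_title`, the `page_exists` callback, and the punctuation sets are all hypothetical names and choices made for the example.

```python
# Hypothetical sketch: none of these names exist in MediaWiki.
# Characters an auto-linker might wrongly append to a URL (assumed set).
APPENDED_PUNCTUATION = ".,;:!?)]}>\"'"
# Characters an auto-linker might wrongly strip from the end (assumed set).
STRIPPED_PUNCTUATION = ".!?)"


def candidate_titles(requested):
    """Yield similar-looking titles to try when `requested` does not exist."""
    # Case 1: trailing punctuation was stripped, so try adding it back.
    for ch in STRIPPED_PUNCTUATION:
        yield requested + ch
    # Case 2: trailing punctuation was added, so strip it one character at a time.
    stripped = requested
    while stripped and stripped[-1] in APPENDED_PUNCTUATION:
        stripped = stripped[:-1]
        if stripped:
            yield stripped


def resolve_title(requested, page_exists):
    """Return the requested title, a plausible match, or None."""
    if page_exists(requested):
        return requested
    for candidate in candidate_titles(requested):
        if page_exists(candidate):
            return candidate
    return None


if __name__ == "__main__":
    existing = {"Hamlet_A.D.D.", "Dustin_Diamond"}
    print(resolve_title("Hamlet_A.D.D", existing.__contains__))
    print(resolve_title("Dustin_Diamond).", existing.__contains__))
```

Because the exact title is checked first, an existing page is never shadowed by a guessed match, which is consistent with treating the result like a normal redirect rather than an HTTP redirect.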

Might make sense as an extension, perhaps combined with more general autoredirection, e.g. from titles with different letter case. (That could mostly solve bug 453.)

Aklapper lowered the priority of this task from Medium to Lowest. Mar 25 2015, 4:00 PM
Aklapper subscribed.
Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:02 AM

This general problem affects me almost daily.
It usually happens with links in emails, especially when the linked page has section headers that include a ? (question mark).
E.g. https://wikitech.wikimedia.org/wiki/Wikitech:Cloud_Services_Terms_of_use#What_uses_of_Cloud_Services_do_we_not_like? (which even breaks in Phabricator)

I have some old (2007!) notes on the problem at https://en.wikipedia.org/wiki/User:Quiddity/Problems_with_punctuation_in_links

Painful workaround: I continuously have to remember to manually replace ? with %3F every time I write an email, or send a link off-wiki in another way.
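The manual workaround could be automated along these lines. This is a minimal sketch in Python; `harden_url` and the `RISKY_TRAILING` set are hypothetical names for the example, and it assumes the target site treats the percent-encoded form as equivalent to the literal character.

```python
# Hypothetical helper: percent-encode trailing punctuation that auto-linkers
# tend to drop, so the whole URL survives in emails and chat messages.
RISKY_TRAILING = ".,;:!?)]'\""  # assumed set of characters; all ASCII


def harden_url(url):
    """Return `url` with any trailing punctuation percent-encoded."""
    end = len(url)
    # Walk back over the run of risky characters at the end of the URL.
    while end > 0 and url[end - 1] in RISKY_TRAILING:
        end -= 1
    head, tail = url[:end], url[end:]
    # Encode by hand: urllib.parse.quote leaves "." unescaped.
    return head + "".join("%{:02X}".format(ord(ch)) for ch in tail)


if __name__ == "__main__":
    print(harden_url(
        "https://wikitech.wikimedia.org/wiki/"
        "Wikitech:Cloud_Services_Terms_of_use"
        "#What_uses_of_Cloud_Services_do_we_not_like?"
    ))
```

Only the trailing run is encoded, so punctuation inside the URL (such as the inner periods in Hamlet_A.D.D.) is left as-is.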

Also: I think this task might be a duplicate of T40265: Links in emails: Parentheses ( or ) not recognized as part of URL - brackets should be URL-encoded. But I'm not sure which direction to merge, or what title/description overhauls would make the problem clearer.