Parser doesn't support protocol relative external links of type [//example.com]. Adding a new bug because 20342 is too vague on the specific issue it tries to address.
Version: 1.20.x
Severity: normal
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | None | T21270 Relative URIs in interwiki links table break interwiki transclusion
Resolved | | None | T20664 Relative URIs in interwiki links table causes failed redirects
Resolved | | None | T22342 Support for protocol-relative URLs
Resolved | | None | T31497 Parser doesn't support protocol relative external links in single-bracketed syntax
Updated summary to clarify that this is about inline links in wiki text documents.
Bug 20342 is primarily about URLs generated by/for user interface components and whatnot, where currently we would tend to select either 'http' or 'https' forms but are looking for ways to avoid splitting caches so the same HTML output can be stored and used for both.
For wikitext-formatted messages, it _might_ be useful to be able to pass in such URLs directly into things using '[$1 blah]'.
For documents in general, that's a bit more fragile; when we use protocol-relative links we're saying "we know FOR SURE that both this site and the other site we're talking about are available on both http and https, and that the correct thing to do is to send you to the same protocol on the other site".
IMO that's a bit flaky -- folks are probably more likely to accidentally put in a link that doesn't actually work in one mode or the other without testing it correctly -- but it might be a necessary evil.
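To make the "same protocol on the other site" behaviour concrete, here is a minimal sketch of how a protocol-relative reference is resolved against the page it appears on. It uses Python's standard `urllib.parse.urljoin` only as a stand-in for what a browser does; the page URLs and the `resolve` helper are illustrative assumptions, not part of MediaWiki.

```python
from urllib.parse import urljoin


def resolve(page_url: str, link: str) -> str:
    """Resolve a link the way a browser would, relative to the viewed page."""
    return urljoin(page_url, link)


# The same stored link inherits whichever scheme the reader is using.
over_http = resolve("http://en.wikipedia.org/wiki/Main_Page", "//example.com/path")
over_https = resolve("https://en.wikipedia.org/wiki/Main_Page", "//example.com/path")

print(over_http)
print(over_https)
```

If the target site is not actually reachable over one of the two schemes, one of these resolved links is broken, which is the fragility described above.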
Many bits of the software pass external links to interface messages, which are then parsed.
There's also stuff like {{fullurl:}} which returns absolute URLs including a protocol prefix, pointing to stuff that's at the local wiki. Now 1) it should be possible to configure this to spit out protocol-relative URLs (is it already?) and 2) the parser should be able to handle them when using the [] with fullurl combo as in:
[{{fullurl:{{FULLPAGENAME}}|action=edit}} Edit this page]
a.d.bergi wrote:
(In reply to comment #1)
For documents in general, that's a bit more fragile; when we use
protocol-relative links we're saying "we know FOR SURE that both this site and
the other site we're talking about are available on both http and https, and
that the correct thing to do is to send you to the same protocol on the other
site".
There are sites we know for sure: our own. In the situation as it is, you will have to use [//wikipedia.org //wikipedia.org] instead of just //wikipedia.org. Example: http://test2.wikipedia.org/wiki/Special:ExpandTemplates?input=%7B%7BSERVER%7D%7D%0A%0A%5B%7B%7BSERVER%7D%7D%5D%0A%0A%5B%7B%7BSERVER%7D%7D%20%7B%7BSERVER%7D%7D%5D%0A%0A%7B%7Bfullurl%3A%7B%7BPAGENAME%7D%7D%7D%7D%0A%0A%5B%7B%7Bfullurl%3A%7B%7BPAGENAME%7D%7D%7D%7D%5D%0A%0A%5B%7B%7Bfullurl%3A%7B%7BPAGENAME%7D%7D%7D%7D%20%7B%7Bfullurl%3A%7B%7BPAGENAME%7D%7D%7D%7D%5D%0A%0A%7B%7Bfullurl%3A%7B%7BPAGENAME%7D%7D%7Caction%3Dedit%7D%7D%0A%0A%5B%7B%7Bfullurl%3A%7B%7BPAGENAME%7D%7D%7Caction%3Dedit%7D%7D%5D%0A%0A%5B%7B%7Bfullurl%3A%7B%7BPAGENAME%7D%7D%7Caction%3Dedit%7D%7D%20%7B%7Bfullurl%3A%7B%7BPAGENAME%7D%7D%7Caction%3Dedit%7D%7D%5D
This is worse, and explicitly the fullurl: thing will break a lot. So I think that, at least for our own domain(s), we have to enable un-bracketed links.
IMO that's a bit flaky -- folks are probably more likely to accidentally put in
a link that doesn't actually work in one mode or the other without testing it
correctly -- but it might be a necessary evil.
And I'd say it is necessary. Allowing it only for specific domains (settable in config?) would make it more complex than it needs to be. wgUrlProtocols (the js variable) would need to provide the domains for which protocol and which link syntax will work. Urghh. I think it is much cleaner to allow every site, even if accidents may happen.
(In reply to comment #5)
(In reply to comment #1)
For documents in general, that's a bit more fragile; when we use
protocol-relative links we're saying "we know FOR SURE that both this site and
the other site we're talking about are available on both http and https, and
that the correct thing to do is to send you to the same protocol on the other
site".
There are sites we know for sure: our own. In the situation as it is, you will
have to use [//wikipedia.org //wikipedia.org] instead of just //wikipedia.org.
[snip]
This is worse, and explicitly the fullurl: thing will break a lot. So I think,
at least for our own domain(s) we have to enable un-bracketed links.
Yes, using {{fullurl:}} to produce a clean link doesn't work any more. This is known and deliberate.
IMO that's a bit flaky -- folks are probably more likely to accidentally put in
a link that doesn't actually work in one mode or the other without testing it
correctly -- but it might be a necessary evil.
And I'd say it is necessary. Allowing it only for specific domains (settable in
config?) would make it more complex than it needs to be. wgUrlProtocols (the js
variable) would need to provide the domains for which protocol and which link
syntax will work. Urghh. I think it is much cleaner to allow every site, even
if accidents may happen.
We do allow every site, where are you getting this idea that we're not?
a.d.bergi wrote:
(In reply to comment #6)
(In reply to comment #5)
(In reply to comment #1)
In the situation as it is, you will
have to use [//wikipedia.org //wikipedia.org] instead of just //wikipedia.org.
This is worse, and explicitly the fullurl: thing will break a lot. So I think,
at least for our own domain(s) we have to enable un-bracketed links.
Yes, using {{fullurl:}} to produce a clean link doesn't work any more. This is
known and deliberate.
Deliberate? I don't think this is good practice. Apart from breaking existing links, it will make linking less user-friendly. Who would use [{{fullurl:xyz|abc}} http(s)://xyz?abc]? As a fullurl-only link doesn't work any more, users will copy-paste a protocol-absolute link, which is parsed correctly (better: as intended). Is that user-friendly?
Allowing it only for specific domains (settable in
config?) would make it more complex than it needs to be. wgUrlProtocols (the js
variable) would need to provide the domains for which protocol and which link
syntax will work. Urghh. I think it is much cleaner to allow every site, even
if accidents may happen.
We do allow every site, where are you getting this idea that we're not?
Yes, but not both link formats. I said that enabling the bracketless format for just a configurable set of sites (the proposed knowing-for-sure domains) wouldn't be better, why don't we allow just everything?
(In reply to comment #7)
(In reply to comment #6)
(In reply to comment #5)
(In reply to comment #1)
In the situation as it is, you will
have to use [//wikipedia.org //wikipedia.org] instead of just //wikipedia.org.
This is worse, and explicitly the fullurl: thing will break a lot. So I think,
at least for our own domain(s) we have to enable un-bracketed links.
Yes, using {{fullurl:}} to produce a clean link doesn't work any more. This is
known and deliberate.
Deliberate? I don't think this is good practice. Apart from breaking existing
links, it will make linking less user-friendly. Who would use
[{{fullurl:xyz|abc}} http(s)://xyz?abc]? As a fullurl-only link doesn't work
any more, users will copy-paste a protocol-absolute link, which is parsed
correctly (better: as intended). Is that user-friendly?
Yeah, you're right that it's confusing. I figured that it would also be confusing if //anyWordThatStartsWithTwoSlashes would be linkified automatically, so I disabled that behavior deliberately. But I didn't consider the fullurl use case you brought up.
Yes, but not both link formats. I said that enabling the bracketless format for
just a configurable set of sites (the proposed knowing-for-sure domains)
wouldn't be better, why don't we allow just everything?
I think you may be misinterpreting Brion's words, and I think those words themselves were unclear to begin with. I never suggested limiting protocol-relative URLs to select domains, although it may be a way out of the we-don't-want-every-word-beginning-with-slash-slash-to-be-linked problem.
An instance where you're generating a complete URL to embed as (potentially clickable) full text in email or a web page probably should be in canonical form.
(In reply to comment #9)
An instance where you're generating a complete URL to embed as (potentially
clickable) full text in email or a web page probably should be in canonical
form.
Yes. To expand on that: we now have {{canonicalurl:}}, which always outputs a fully-qualified HTTP URL, even when saved or viewed using HTTPS. Earlier this week, Sam and I went through [[MediaWiki:Enotif body]] on all wikis that had overridden it and changed all instances of {{fullurl:}} and {{SERVER}}{{localurl:foo}} to {{canonicalurl:}}, because e-mail clients also don't automatically link protocol-relative URLs in text.
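As an illustration of why e-mail text needs a canonical form, the following is a minimal sketch of expanding a protocol-relative URL to a fully-qualified one. The `expand_for_email` helper and its `canonical_scheme` parameter are hypothetical names for this example; this is not the implementation behind {{canonicalurl:}}.

```python
def expand_for_email(url: str, canonical_scheme: str = "http") -> str:
    """Prefix a protocol-relative URL with a fixed scheme; leave other URLs alone."""
    if url.startswith("//"):
        return f"{canonical_scheme}:{url}"
    return url


# A protocol-relative URL has no page context inside an e-mail body,
# so it is pinned to one scheme before being written out as plain text.
print(expand_for_email("//en.wikisource.org/wiki/Wikisource:Sandbox"))
print(expand_for_email("https://en.wikisource.org/wiki/Wikisource:Sandbox"))
```

The scheme is a fixed site-level choice here, so the output does not depend on how the current reader happens to be connected.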
Is there the ability to update {{canonicalurl}} to generate an absolute url that is protocol relative to the login type? Alternatively if that is problematic, can there be (yet) another parser function that undertakes the task to generate the url relative to the user? Having to do those sorts of hacks ''ad infinitum'' is surely just courting disaster against a simple ability especially as that {{canonicalurl}} will get used by the lazy, or someone will write templates to get around the issue. Thanks.
(In reply to comment #12)
Is there the ability to update {{canonicalurl}} to generate an absolute url
that is protocol relative to the login type? Alternatively if that is
problematic, can there be (yet) another parser function that undertakes the
task to generate the url relative to the user? Having to do those sorts of
hacks ''ad infinitum'' is surely just courting disaster against a simple
ability especially as that {{canonicalurl}} will get used by the lazy, or
someone will write templates to get around the issue. Thanks.
What do you mean, exactly? If you want a protocol-relative URL, use {{fullurl:}}. This seems to be what you mean with "protocol relative to the login type". There isn't a parser function that outputs http:// URLs for people viewing over http and https:// URLs for people viewing over HTTPS because that would mean we'd have to split the parser cache, but this is exactly what protocol-relative URLs are for.
What do you mean with a "URL relative to the user"? Does that mean generating fully-qualified URLs, e.g. for e-mails that use https if the user logs in over https and http otherwise? Would this be based on the not-yet-existing "Always use HTTPS when I'm logged in" preference?
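The cache-splitting concern can be sketched with a toy model. Everything below (the dictionary caches, the `render_absolute` and `render_relative` helpers, the page title) is a simplified assumption for illustration and does not reflect how the MediaWiki parser cache is actually keyed.

```python
def render_absolute(title: str, scheme: str) -> str:
    """Rendered HTML that bakes the viewer's scheme into the link."""
    return f'<a href="{scheme}://en.wikipedia.org/wiki/{title}">{title}</a>'


def render_relative(title: str) -> str:
    """Rendered HTML with a protocol-relative link, identical for all viewers."""
    return f'<a href="//en.wikipedia.org/wiki/{title}">{title}</a>'


# Scheme-specific output needs one cache entry per (page, scheme) pair.
cache_absolute = {}
for scheme in ("http", "https"):
    cache_absolute[("Sandbox", scheme)] = render_absolute("Sandbox", scheme)

# Protocol-relative output can be stored once and served to everyone.
cache_relative = {"Sandbox": render_relative("Sandbox")}

print(len(cache_absolute), len(cache_relative))
```

In this model the scheme-dependent variant doubles the number of stored entries per page, which is the cost that protocol-relative output avoids.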
(In reply to comment #13)
(In reply to comment #12)
Is there the ability to update {{canonicalurl}} to generate an absolute url
that is protocol relative to the login type? Alternatively if that is
problematic, can there be (yet) another parser function that undertakes the
task to generate the url relative to the user? Having to do those sorts of
hacks ''ad infinitum'' is surely just courting disaster against a simple
ability especially as that {{canonicalurl}} will get used by the lazy, or
someone will write templates to get around the issue. Thanks.
What do you mean, exactly? If you want a protocol-relative URL, use
{{fullurl:}}. This seems to be what you mean with "protocol relative to the
login type". There isn't a parser function that outputs http:// URLs for people
viewing over http and https:// URLs for people viewing over HTTPS because that
would mean we'd have to split the parser cache, but this is exactly what
protocol-relative URLs are for.
What do you mean with a "URL relative to the user"? Does that mean generating
fully-qualified URLs, e.g. for e-mails that use https if the user logs in over
https and http otherwise? Would this be based on the not-yet-existing "Always
use HTTPS when I'm logged in" preference?
I meant what you explained in the first paragraph.
Sometimes it is easiest and more appropriate to display a url. To display a full protocol relative url is problematic, so for
https://en.wikisource.org/w/index.php?title=Wikisource:Sandbox&oldid=3501464
I cannot code
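I have to code it as
- [{{fullurl:Wikisource:Sandbox|oldid=3501464}} {{fullurl:Wikisource:Sandbox|oldid=3501464}}] ;
- [//en.wikisource.org/w/index.php?title=Wikisource:Sandbox&oldid=3501464 //en.wikisource.org/w/index.php?title=Wikisource:Sandbox&oldid=3501464]
It would be fantastic if I could code either
- {{canonicalurl:Wikisource:Sandbox|oldid=3501464}}; or
- {{protocolrelativeurl:Wikisource:Sandbox|oldid=3501464}}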
and they could exhibit the protocol relative urls
http://en.wikisource.org/w/index.php?title=Wikisource:Sandbox&oldid=3501464
https://en.wikisource.org/w/index.php?title=Wikisource:Sandbox&oldid=3501464
depending on how I login.
If it relates to the second paragraph, then maybe I don't comprehend what happens with http:// after the change. I just see that I keep getting forced into http:// in so many places and don't see an easy (lazy) solution.
(In reply to comment #14)
I have to code it as
- [{{fullurl:Wikisource:Sandbox|oldid=3501464}}
{{fullurl:Wikisource:Sandbox|oldid=3501464}}] ;
- [//en.wikisource.org/w/index.php?title=Wikisource:Sandbox&oldid=3501464
//en.wikisource.org/w/index.php?title=Wikisource:Sandbox&oldid=3501464]
Yes, unfortunately you do have to. Protocol-relative URLs aren't linked magically because I thought that would be too error-prone, see comment 8.
It would be fantastic if I could code either
- {{canonicalurl:Wikisource:Sandbox|oldid=3501464}}; or
- {{protocolrelativeurl:Wikisource:Sandbox|oldid=3501464}}
and they could exhibit the protocol relative urls
http://en.wikisource.org/w/index.php?title=Wikisource:Sandbox&oldid=3501464
https://en.wikisource.org/w/index.php?title=Wikisource:Sandbox&oldid=3501464
depending on how I login.
That would be fantastic, yes. It would also be fantastically cache-breaking :(
If it relates to the second paragraph, then maybe I don't comprehend what
happens with http:// after the change. I just see that I keep getting forced
into http:// in so many places and don't see an easy (lazy) solution.
canonicalurl will continue to output http:// URLs, unless and until HTTPS actually becomes the preferred protocol (but my understanding is we don't have the infrastructure for HTTPS-by-default right now). At some point soonish, we will want to introduce a preference with which users can indicate that they want to log in over HTTPS only. Enabling this preference will have the following consequences:
The one that's of particular interest for this bug is:
The "free link syntax" mentioned in the bug title does not work: "//example.com" generates plain text. See comment 5.
(In reply to comment #18)
The "free link syntax" mentioned in the bug title does not work: "//example.com"
generates plain text. See comment 5.
Yes, but:
(In reply to comment #6)
Yes, using {{fullurl:}} to produce a clean link doesn't work any more. This is
known and deliberate.
A plain //foo anywhere will break pages. I'm sure there are countless mentions of it out there where it is not intended as a link but simply as two slashes (e.g. on talk pages about a programming-related subject).
Aside from breaking things, it is also arguably bad from a user-facing perspective. I'm not sure the average user is ready to look at //example.org and know that it is a link.
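The false-positive risk can be sketched with a deliberately naive matcher. The regular expression and the sample text below are assumptions made up for this illustration; they are not the pattern the MediaWiki parser uses.

```python
import re

# Naive rule: anything starting with "//" that is not part of "scheme://"
# and runs up to the next whitespace, bracket, angle bracket or quote.
NAIVE_FREE_LINK = re.compile(r'(?<![:\w/])//[^\s\[\]<>"]+')


def find_candidates(text: str) -> list[str]:
    """Return every substring the naive rule would turn into a link."""
    return NAIVE_FREE_LINK.findall(text)


sample = "See //example.com and the comment //TODO here, not http://example.org"
print(find_candidates(sample))
```

Under this naive rule the intended link and the code-style comment marker are indistinguishable, which is why bare protocol-relative URLs were left unlinked while the bracketed form is supported.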