
Allow anchor (<a>) tags in wikitext (currently prevented)
Closed, Duplicate (Public)

Description

Author: FT2.wiki

Description:
Can the anchor markup <a>...</a> be allowed in wiki markup?

Although there are several uses, one reason is to create additional TOC entries via <a id="title"></a> rather than <span></span>, which results in the same HTML tag as == title == will use.


Version: unspecified
Severity: enhancement

Details

Reference
bz18460

Event Timeline

bzimport raised the priority of this task to Low. (Nov 21 2014, 10:36 PM)
bzimport set Reference to bz18460.
bzimport added a subscriber: Unknown Object (MLST).
*** Bug 26727 has been marked as a duplicate of this bug. ***

(Copied from bug 26727#c0)
Currently, Sanitizer.php bans the use of an <a> tag, even though it's possible
to mostly replicate the effect of an <a> tag using internal or external link
markup (e.g., "[[foo]]" or "[http://example.com foo]"). However, wikimarkup
does not allow for certain attributes to be passed to the rendered <a> tag, for
example title="" and target="".

Obviously a whitelist/blacklist would be needed to allow or disallow certain
attributes, but I believe some of this sanitization is already built in to
MediaWiki or could easily be added as necessary. Allowing the <a> tag has a
number of benefits, and it's the usual practice to allow the HTML equivalents
of markup that's allowed in wikitext.

for example title="" and target=""

Would we really want to allow target? I'm personally not a fan of pop-ups.

(In reply to comment #3)

for example title="" and target=""

Would we really want to allow target? I'm personally not a fan of pop-ups.

That's probably something that could have a sensible default with a configuration variable to override it: an array of blacklisted/whitelisted attributes that could be overridden on a per-wiki basis. I'm personally more concerned that there is currently a blanket ban on the <a> HTML element, banning even simple use-cases like <a href="http://en.wikipedia.org">my link</a> (making, for example, copying and pasting horribly difficult).
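
A minimal sketch of the "sensible default plus per-wiki override" idea, written in Python for illustration only. The names `DEFAULT_ALLOWED_A_ATTRIBUTES`, `allowed_a_attributes`, and the `allow`/`deny` keys are hypothetical and are not MediaWiki configuration variables.

```python
# Hypothetical default whitelist; not an actual MediaWiki setting.
DEFAULT_ALLOWED_A_ATTRIBUTES = {"href", "title", "id"}


def allowed_a_attributes(wiki_config=None):
    """Return the set of attributes permitted on <a> for one wiki.

    wiki_config is an optional dict with hypothetical keys:
      "allow": extra attributes this wiki opts in to
      "deny":  attributes this wiki forbids (takes precedence)
    """
    config = wiki_config or {}
    allowed = set(DEFAULT_ALLOWED_A_ATTRIBUTES)
    allowed |= set(config.get("allow", ()))  # per-wiki additions
    allowed -= set(config.get("deny", ()))   # per-wiki removals win
    return allowed


# A wiki that opts in to target="" but does not want id="" on links.
print(sorted(allowed_a_attributes({"allow": ["target"], "deny": ["id"]})))
# prints ['href', 'target', 'title']
```

Letting the deny list win over the allow list is one possible design choice; it keeps a site-wide ban effective even if an attribute is later added to the defaults.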

Titles/tooltips: <span title>
Jump-to-anchors: <span id>
Link: [http:// ]

All other attributes to anchor tags I think are either unwanted, risky or redundant.
Furthermore, they would make the wikitext much more complex and non-standard.

Wikitext is character-based ('', *, {|, {{, etc.) with some <xml>, but introducing more HTML is not something I think should even be considered.

(In reply to comment #5)

Titles/tooltips: <span title>
Jump-to-anchors: <span id>
Link: [http:// ]

All other attributes to anchor tags I think are either unwanted, risky or
redundant.

It's also not trivial to whitelist <a> only for name= but not for other purposes (links).

Marking WONTFIX.

(In reply to comment #5)

Titles/tooltips: <span title>
Jump-to-anchors: <span id>

You're aware that MediaWiki itself uses <a>, not <span> for anchors? For example, in the page source there is "<a id="top"></a>". I can't remember the exact reason why this was implemented in this way, but I do recall there being one.

Link: [http:// ]

All other attributes to anchor tags I think are either unwanted, risky or
redundant.

That's a lot of attributes you're talking about. People regularly come into MediaWiki and want to add target="" capabilities, for example. So much so that there is a configuration variable ($wgExternalLinkTarget) that modifies the target for all links. This is obviously sub-optimal if there are only particular links that a user wishes to have a target="".

I don't think anyone objects to banning the risky attributes. I object to banning non-risky attributes like id="", title="", and href="" (once you strip out dangerous pieces like "javascript:"), though.
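
To illustrate what "stripping out dangerous pieces" of an href might look like, here is a rough Python sketch that accepts only whitelisted URL schemes. The `SAFE_SCHEMES` set and the `is_safe_href` helper are assumptions made for this example; this is not how Sanitizer.php works, and a scheme check alone is not a complete defence.

```python
from urllib.parse import urlsplit

# Hypothetical scheme whitelist; a real wiki would configure its own.
SAFE_SCHEMES = {"http", "https", "mailto"}


def is_safe_href(href):
    """Return True if href is relative or uses a whitelisted scheme."""
    # Browsers ignore surrounding whitespace and embedded tabs/newlines
    # in URLs, so remove them before looking at the scheme.
    cleaned = "".join(ch for ch in href.strip() if ch not in "\t\r\n")
    scheme = urlsplit(cleaned).scheme.lower()
    return scheme == "" or scheme in SAFE_SCHEMES


for candidate in ("http://en.wikipedia.org", "#top", "javascript:alert('crap')"):
    print(candidate, is_safe_href(candidate))
```

Whitelisting known-good schemes, rather than blacklisting "javascript:" specifically, avoids having to enumerate every risky scheme.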

Furthermore, they would make the wikitext much more complex and
non-standard.

Wikitext is character-based ('', *, {|, {{, etc.) with some <xml>, but introducing
more HTML is not something I think should even be considered.

You suggested using <span title="a link">[http://example.com/ example]</span>. Do you really believe that's better for users than <a href="http://example.com/" title="a link">example</a>? There is going to be some HTML in wikitext no matter what. I suggest using standard code, not forcing users to use workarounds and hacks in order to fight the sanitizer/parser, when these attributes are perfectly valid and safe.

(In reply to comment #6)

It's also not trivial to whitelist <a> only for name= but not for other
purposes (links).

The level of triviality doesn't matter. There are definitely legitimate use-cases for using <a> and it breaks the standard that rendered wikitext HTML be valid in actual HTML. (Are there any other HTML elements that fall outside this rule?)

The reason that I suggested adding this feature wouldn't be very difficult is that attributes are already stripped in certain elements. For example, "<span onmouseover="alert('crap')">Oh no</span>" becomes "<span>Oh no</span>" currently.
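
The attribute stripping described above can be sketched roughly as follows. This is a Python illustration built on the standard `html.parser` module, not MediaWiki's PHP implementation; the `ALLOWED_ATTRIBUTES` table and the `strip_attributes` function are hypothetical names.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical per-tag whitelist; tags not listed keep no attributes.
ALLOWED_ATTRIBUTES = {"span": {"id", "title", "class"}}


class AttributeStripper(HTMLParser):
    """Re-emit markup, keeping only whitelisted attributes on each tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        allowed = ALLOWED_ATTRIBUTES.get(tag, set())
        kept = []
        for name, value in attrs:
            if name in allowed:
                # Re-escape the value so it stays inside its quotes.
                kept.append(' {}="{}"'.format(name, escape(value or "", quote=True)))
        self.parts.append("<{}{}>".format(tag, "".join(kept)))

    def handle_endtag(self, tag):
        self.parts.append("</{}>".format(tag))

    def handle_data(self, data):
        self.parts.append(escape(data, quote=False))


def strip_attributes(markup):
    """Return markup with every non-whitelisted attribute removed."""
    parser = AttributeStripper()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


print(strip_attributes('<span onmouseover="alert(\'crap\')">Oh no</span>'))
# prints <span>Oh no</span>
```

Extending such a table with an entry for `a` would be one way to permit the element while still dropping event-handler attributes.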

Marking WONTFIX.

Re-opening.

Another use-case to consider is that people may want this functionality so much that they're willing to modify Sanitizer.php to allow the <a> HTML element, not realizing the potential risks (<a> seems like a fairly simple tag on the surface). MediaWiki should provide a way to enable this element in a sane, safe way.

(In reply to comment #7)

(In reply to comment #5)

Titles/tooltips: <span title>
Jump-to-anchors: <span id>

You're aware that MediaWiki itself uses <a>, not <span> for anchors? For
example, in the page source there is "<a id="top"></a>". I can't remember the
exact reason why this was implemented in this way, but I do recall there being
one.

AFAIK that was to keep backward compatibility with some version of IE (5.x?) that we no longer support and that didn't like the other method. As for your example, that is the only one I can quickly see in the page source I'm looking at (a fairly simple, low-content page: Cop_This! on en.wikip), and everywhere else appears to use divs or spans for id'ing and anchoring.

That example also seems redundant (at least in Vector) because the content div is declared directly above it, and we could double-id that div to be both content and top if we really wanted to.

E.g., a section header with an edit link:

<h2><span class="editsection">[<a href="/w/index.php?title=Cop_This!&amp;action=edit&amp;section=1" title="Edit section: Synopsis">edit</a>]</span> <span class="mw-headline" id="Synopsis">Synopsis</span></h2>

As for the <a title=""> part of this discussion, I believe that is deprecated by the current standards, and in most (if not all?) browsers an <a> wrapping text will be displayed as a hyperlink even if it doesn't point anywhere. The only benefit I can think of is that it's shorter.

The other use case I could see is to make the microformat people happy.

MediaWiki uses <span id> internally as well:

<h2><span class="editsection">[<a href="..." title="Edit section: Foobar">edit</a>]</span> <span class="mw-headline" id="Foobar">Foobar</span></h2>

And <a name> is deprecated as of HTML5. It will work for a few years, but it's deprecated.

(In reply to comment #7)

Furthermore, they would make the wikitext much more complex and
non-standard.

Wikitext is character-based ('', *, {|, {{, etc.) with some <xml>, but introducing
more HTML is not something I think should even be considered.

You suggested using <span title="a link">[http://example.com/ example]</span>.
Do you really believe that's better for users than <a
href="http://example.com/" title="a link">example</a>?

Yes, I do, and I pretty much know it based on what I've seen in numerous usability testing reviews, videos and observations.

Another point: This would cause there to be two ways to make a link. People seeing <a href="link">text</a> will never be able to figure out what it means unless we document it together with documentation of [link text].

And yes I know there are two ways to make a table ({| and <table>) but that doesn't justify adding more.

(In reply to comment #11)

And yes I know there are two ways to make a table ({| and <table>) but that
doesn't justify adding more.

There are two ways to do almost _any_ markup (that has a wikitext equivalent). <h2> can be used instead of == ==, {| instead of <table>, ''' and '' instead of <b> and <i>, and the list goes on and on.

But some formatting (e.g., <sup> and <s>) can _only_ be done in HTML. HTML is inherently part of wikitext ("wikitext" being "the text inside the <textarea> in an edit window" in this context). Suggesting that the most common HTML element (<a>) be banned because it has a wikitext equivalent is crazy. Will we blacklist <b> and <i> next? Those have some of the oldest wikimarkup equivalents and surely it's simpler for users to have only one way to make text bold or italic.

I don't see how breaking things like users copying and pasting HTML into an edit window makes the software more usable. It just seems to be a huge inconvenience for a lot of users and given that some HTML is already allowed (and sometimes required), it makes the entire concept of editing more confusing to ban an element like <a>.

This seems to be the same as bug #33886, which has a review pending. Therefore I shall boldly resolve as duplicate.

*** This bug has been marked as a duplicate of bug 33886 ***