Page MenuHomePhabricator

"MediaWiki:Copyright" message allows raw HTML
Open, LowestPublic

Description

[[MediaWiki:Copyright]] still allows raw html input which can be maliciously used by rogue admins by adding <img src="http://my_host/index.php?title=Special:UserLogout"/> to
[[MediaWiki:Copyright]] so everyone will be forcefully logged out.

Did talk to the security responsible dude an age ago (one year ago approx), but nothing seems to have been done to address this issue, nor has any bug been written.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Forcing everyone to log out sounds like a pretty nice action compared to some stuff that could be done by an admin editing JS.

(In reply to Erwin Dokter from comment #2)

Wikitext doesn't work here because the message, like many other non-content
related UX messages, is not going throught the PHP parser, so it must be raw.

One can write a book on a 1000 ways to disable the site using admin 'powers'.

Anyway, exact same issue as in bug 212, so closing as duplicate.

> *** This bug has been marked as a duplicate of bug 212 ***

I thought bug 212 was a tracking bug (a non-closed sub-1000 bug number)

A sub-1000 bug number does not denote any special meaning; it is not marked as a tracking bug; it is just a very old bug. You already raised this issue there, so there is no point in opning a new bug.

  • This bug has been marked as a duplicate of bug 212 ***

This is a concrete issue we can fix. Bug 212 we cannot fix unless we fix bug 19291.

I see no way to fix this without sending it through the parser.

There are multiple ways rogue admins could abuse, and *this one* doesn't seem the only method to do that. Any site JavaScript or even CSS can be used to do the same.

Hence, removing examples of what can be done from the summary. The problem is not proper escaping, which could break layout, not the abuse that can be done by other means.

It's really pointless moving this to the private section, given that it's been a public bug for over a year and a half.

Sure, but explaining to users how exactly to break the wiki publicly is also not helpful.

This is something that only sysops (who have had more powerful tools than this for ages) can do, it's not a public XSS, so moving it to security was rather pointless.

This is the first I noticed this particular bug. I agree, we can make this public, unless James thinks it's being abused.

For this specific issue, Skin::getCopyright does send the message through the parser, so I think the only reason not to use wikitext is the rel="license" attributes on the links, right? That shouldn't be too difficult to solve.

(In reply to Chris Steipp from comment #12)

This is the first I noticed this particular bug. I agree, we can make this
public, unless James thinks it's being abused.

I moved it because people started to explain exactly how to abuse it as part of arguing that other people should work on it. :-)

That won't stop people discussing it, just non-CC'd people from seeing new comments/changes. The abuseable details are already public and accessible from a simple Google search.

Restricted Application changed the visibility from "acl*security (Project)" to "Custom Policy". · View Herald TranscriptNov 24 2014, 9:27 PM
Restricted Application changed the edit policy from "acl*security (Project)" to "Custom Policy". · View Herald Transcript
Restricted Application changed the visibility from "Custom Policy" to "Custom Policy". · View Herald TranscriptDec 17 2014, 10:18 PM
Restricted Application changed the edit policy from "Custom Policy" to "Custom Policy". · View Herald Transcript
Restricted Application changed the visibility from "Custom Policy" to "Custom Policy". · View Herald TranscriptDec 17 2014, 10:20 PM
Restricted Application changed the edit policy from "Custom Policy" to "Custom Policy". · View Herald Transcript
matmarex changed the edit policy from "Custom Policy" to "Custom Policy".
matmarex changed Security from Software security bug to None.

Urgh, looks like that didn't work actually. Chris, can you fix this please?

I also agree that this should be public. (Just to be clear, it currently isn't, "matmarex changed Security from Security or Sensitive Bug to none" above is a lie.)

Jdforrester-WMF changed the visibility from "Custom Policy" to "Public (No Login Required)".
Jdforrester-WMF changed the edit policy from "Custom Policy" to "All Users".

Fine. I profoundly disagree that giving public step-by-step exploit instructions is a good idea as a way to convince others to patch security issues, however.

I also agree that this should be public. (Just to be clear, it currently isn't, "matmarex changed Security from Security or Sensitive Bug to none" above is a lie.)

It's not a lie. It's just that changing the "Security" dropdown to "none" doesn't reset the visibility or edit policy.

Fine. I profoundly disagree that giving public step-by-step exploit instructions is a good idea as a way to convince others to patch security issues, however.

The fact that a message allows raw HTML isn't in itself a security issue, or T2212 wouldn't be public. As is already noted, admins have many other ways to break things.

Is there any consensus about whether a fix for this needs to be made (i.e., throw it through the parser and make a workaround for the little things)? Or can we close it as invalid considering, as has been noted, this is the least of our concerns with admin powers.

This isn't an invalid bug. While not very security sensitive at this point in time, it is technical debt. We do not support the abilities provided by raw HTML in production. Any and all use cases should be comforted through wikitext (or sanitised Parsoid HTML) already. As such, we should work towards replacing this message with a wikitext-only message.

I agree that we should continue to have messages not be raw HTML, but that is a more general issue, and it does not really justify having an individual bug for every message that needs to be parse-ified.

This isn't an invalid bug. While not very security sensitive at this point in time, it is technical debt. We do not support the abilities provided by raw HTML in production. Any and all use cases should be comforted through wikitext (or sanitised Parsoid HTML) already. As such, we should work towards replacing this message with a wikitext-only message.

I think some people use the raw html of that message as an easy way to include analytics code. Enwikinews used to include it to add CC metadata tags so that it shows up in CC search engines. But honestly, I don't think that's a use case we should care about.

I agree it should probably be changed, I agree its pretty low priority, but I think it legitimately could have its own bug given that it is one of the most prominent raw html messages and its one of the raw html messages that's included on almost all pages.

Raw html messages are almost gone now (at least in core). It makes sense to track the remaining ones individually, which are the most complicated ones to fix.

I think some people use the raw html of that message as an easy way to include analytics code. Enwikinews used to include it to add CC metadata tags so that it shows up in CC search engines. But honestly, I don't think that's a use case we should care about.

The former is an invalid use case that would violate the Privacy policy.

The latter is something one should request on Phabricator for the software to provide (be it MediaWiki core, or an extension).

Other wikis may have different privacy policies :)

Yes, to be clear, I meant third parties might use it for analytics code (albeit i have no idea if anyone does so in practise)

With the interface-admin proposal (T190015) progressing, I hope it is apparent that progress needs to be made on this task at the same time.

Here's an idea for how to solve this issue/vulnerability:

  1. Replace the copyright message with a new message copyrightfooter or similar, which would be standard parsed wikitext. Given the aggressive message caching built into MediaWiki, this shouldn't cause any undue additional load on the parser.
  2. Similarly, replace the wikimedia-copyright message in the WikimediaMessages extension with wikimedia-copyrightfooter.
  3. To handle the use case of wikis using this message to place Google Analytics tracking beacons etc, either:
    • Create a config variable $wgExtraHtml which can be set to some custom HTML served at the bottom of every page. This would shift the ability to add analytics scripts from wiki sysops to server admins. The only way to restore this ability to wiki sysops would be by writing a custom hook (or a very simple extension).
    • Create a config variable $wgAllowExtraHtml = false, which when set to true, appends the contents of MediaWiki:Extrahtml to the end of the page. This way, wiki sysops can continue to exert control over analytics scripts on wikis where this is desired, while providing safety by default for all other wikis.

Challenges:

  • Wikis could face legal implications if they are displaying the wrong copyright message for even a brief period of time. Some kind of migration step might be required in the MediaWiki updater that converts the HTML contents of copyright into wikitext in copyrightfooter. It might convert <a> links into appropriate internal/external links, and strip all other tags.
  • Some wikis use <a rel="license" ...> in this message. The parser cannot currently emit this attribute. A new magic word might need to be defined, like {{#licenselink:https://example.org/license|Example License}}.
  • Is there evidence to suggest that people are actually using this for analytics scripts? Or was that just a hypothetical?

My more minimalist short-term solution would be to just create a static list of messages which are known to contain raw HTML, and on a title match / subpage match require editsitejs.
(In the longer term, maybe the parsing method should be a global property of the message, not decided by the caller, and edit permissions assigned automatically based on that.)

In T45646#4447533, @Tgr wrote:

My more minimalist short-term solution would be to just create a static list of messages which are known to contain raw HTML, and on a title match / subpage match require editsitejs.

Heh, yes, I didn't think of that. That would certainly be a better short-term solution, particularly as these messages need to be changed extremely infrequently.

(In the longer term, maybe the parsing method should be a global property of the message, not decided by the caller, and edit permissions assigned automatically based on that.)

Ideally in the long term we would not have any raw HTML messages :)

In T45646#4447533, @Tgr wrote:

(In the longer term, maybe the parsing method should be a global property of the message, not decided by the caller, and edit permissions assigned automatically based on that.)

It's not that simple. Whether a message accessed as ->plain() or ->text() has raw HTML, wikitext, or plain text depends on how the output is subsequently used.

In T45646#4447303, @TTO wrote:
  • Create a config variable $wgExtraHtml which can be set to some custom HTML served at the bottom of every page. This would shift the ability to add analytics scripts from wiki sysops to server admins. The only way to restore this ability to wiki sysops would be by writing a custom hook (or a very simple extension).

We don't need a new variable for this. There are plenty of methods and hooks already for this to be done from LocalSettings.php and/or an extension. Including:

  • BeforePageDisplay hook - call $out->addScript() with raw HTML.
  • SkinAfterBottomScripts hook - append raw HTML to $text.
  • SkinBuildSidebar hook - add a key to $bar with raw HTML.

These can all be done with a one-liner from LocalSettings.php as well.

  • Wikis could face legal implications if they are displaying the wrong copyright message for even a brief period of time. Some kind of migration step might be required in the MediaWiki updater that converts the HTML contents of copyright into wikitext in copyrightfooter. It might convert <a> links into appropriate internal/external links, and strip all other tags.

Rather than converting one into the other, I'd recommend using the current logic (unchanged) during the migration step, but providing a way to disable it for security reasons.

For example, we would introduce the copyrightfooter message, and change the old copyright message to be empty/disabled by default (containing -). If at run-time the old message doesn't exist, display the new one as wikitext (not raw HTML). If the old one does have an override, use it as raw HTML but only if the feature wasn't disabled. E.g. wgAllowCopyrightHtml or some such, which would be true by default for one release cycle, during which it would trigger a deprecation warning for developers. Then, in the next release, the variable and the support for copyright would be removed.

This way:

  • No risk or complication during upgrade for developers (behaviour remains the same, and if your wiki depends on old copyright it will still work, and you get a warning.)
  • After the upgrade, they have until the next release to migrate at-ease from copyright to copyrightfooter and/or a hook-based approach.
  • Developers may disable wgAllowCopyrightHtml to start benefitting the increased security as soon as possible (including WMF).
  • In the upgrade, the variable and support for copyright will be removed.

Why not simply change the usage of the current message as to not output raw HTML? Wikis which don't have the message customized, will continue to work. Wikis that use raw HTML, will display visible HTML code, but that will be also very noticeable for site admins that can adapt the message. This will prevent the problem of displaying a wrong copyright message: it would be correct, but badly formatted, only for the cases where non-allowed HTML tags are used.

The deprecation/warning path for next release and final substitution on a later release can go unnoticed for those that skip versions, or upgrade only from LTS to LTS.

Change 449626 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/core@master] Require editsitecss/editsitejs for editing raw messages

https://gerrit.wikimedia.org/r/449626

@Ciencia_Al_Poder In principle, we aim that site administrator are always able to upgrade without immediate breakage. An administrator should then address any deprecation warnings before the next upgrade in order to keep free of breakage.

In other words: Breakage should not happen unless the removed behaviour was already deprecated/avoidable in the previous release, with the exception of core maintainers agreeing beforehand that deprecation is not needed (e.g. due to the behaviour being unused, rarely used, or infeasible/impossible to deprecate).

Okay, my comment was about the T45646#4447303 idea, which would be a major breakage in legal terms. Gerrit changeset 449626 would be fine, though, if it doesn't follow the path of renaming system messages.

Change 449689 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/core@master] Replace raw HTML copyright footer message with wikitext one

https://gerrit.wikimedia.org/r/449689

Change 449626 merged by jenkins-bot:
[mediawiki/core@master] Require editsitecss/editsitejs for editing raw messages

https://gerrit.wikimedia.org/r/449626

What happened here with https://gerrit.wikimedia.org/r/c/449689 ?

Someone fixed this but then it got conflicted and notfixed?

What happened here with https://gerrit.wikimedia.org/r/c/449689 ?

Someone fixed this but then it got conflicted and notfixed?

Someone proposed a fix, however nobody reviewed it so it never got merged.

Change 449689 abandoned by SBassett:

[mediawiki/core@master] Replace raw HTML copyright footer message with wikitext one

Reason:

This is so out-of-date at this point (the Skin.php code isn't remotely similar these days) that it's likely worth a complete revisit.

https://gerrit.wikimedia.org/r/449689