API generates invalid HTML for format=jsonfm
Closed, Resolved (Public)

Description

To reproduce, go to https://en.wikipedia.org/w/api.php?action=query&titles=%3C&format=jsonfm.

Everything after the < is colored blue, and the HTML is invalid because the <span> opened in front of the escaped < is never closed:

[...]
<pre style='white-space: pre-wrap;'>
{

&quot;query&quot;: {
    &quot;pages&quot;: {
        &quot;-1&quot;: {
            &quot;title&quot;: &quot;<span style="color:blue;">&lt;&quot;,
            &quot;invalid&quot;: &quot;&quot;
        }
    }
}

}
</pre>
[...]
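
One way to confirm the imbalance mechanically is to feed the fragment to a parser and list the elements that are opened but never closed. The following is a minimal sketch using Python's standard `html.parser`; the class name `TagBalance`, the helper `unclosed_tags`, and the shortened fragments are illustrative assumptions, not part of MediaWiki, and void elements such as `<br>` are not special-cased.

```python
from html.parser import HTMLParser


class TagBalance(HTMLParser):
    """Collect elements that are opened but never closed (illustrative helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []      # currently open elements, innermost last
        self.unclosed = []   # elements implicitly closed by an outer end tag
        self.stray_end = []  # end tags with no matching open element

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            self.stray_end.append(tag)
            return
        # Pop back to the matching element; anything above it was left open.
        while True:
            top = self.stack.pop()
            if top == tag:
                break
            self.unclosed.append(top)


def unclosed_tags(fragment):
    """Return the names of elements in `fragment` that never get an end tag."""
    parser = TagBalance()
    parser.feed(fragment)
    parser.close()
    return parser.unclosed + parser.stack


# Shortened version of the output quoted in this report.
REPORTED = """<pre style='white-space: pre-wrap;'>
&quot;title&quot;: &quot;<span style="color:blue;">&lt;&quot;,
</pre>"""

# The same line with the "<" escaped and no highlighting applied.
ESCAPED_ONLY = """<pre style='white-space: pre-wrap;'>
&quot;title&quot;: &quot;&lt;&quot;,
</pre>"""

print(unclosed_tags(REPORTED))
print(unclosed_tags(ESCAPED_ONLY))
```

The reported fragment yields one unclosed `span`, while the escaped-only variant is balanced.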


Version: 1.24rc
Severity: minor
URL: https://en.wikipedia.org/w/api.php?action=query&titles=%3C&format=jsonfm

Details

Reference
bz65403

Event Timeline

bzimport raised the priority of this task to Low. (Nov 22 2014, 3:17 AM)
bzimport set Reference to bz65403.
bzimport added a subscriber: Unknown Object (MLST).

Change 133739 had a related patch set uploaded by PleaseStand:
API: Skip all HTML transformations for non-XML formats

https://gerrit.wikimedia.org/r/133739

Change 161093 had a related patch set uploaded by Anomie:
API: Clean up and internationalize pretty-printed output

https://gerrit.wikimedia.org/r/161093

Change 133739 merged by jenkins-bot:
API: Remove XML tag highlighting from non-XML formats

https://gerrit.wikimedia.org/r/133739
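
The idea named in the change title, applying XML tag highlighting only to XML output, can be sketched roughly as follows. This is not the merged patch or MediaWiki's actual code (which is PHP); `highlight_xml_tags`, `pretty_print`, and the regex are hypothetical stand-ins chosen to show the shape of the fix.

```python
import html
import re

# Hypothetical highlighter: matches an already-escaped tag such as &lt;api /&gt;.
TAG_RE = re.compile(r"&lt;(.*?)&gt;", re.DOTALL)


def highlight_xml_tags(escaped):
    """Wrap each escaped XML tag in a blue span, always closing the span."""
    return TAG_RE.sub(r'<span style="color:blue;">&lt;\1&gt;</span>', escaped)


def pretty_print(text, is_xml):
    """Escape `text` for HTML; add tag highlighting only for XML formats."""
    escaped = html.escape(text, quote=True)
    if is_xml:
        escaped = highlight_xml_tags(escaped)
    return escaped


# JSON containing a bare "<" is escaped but never highlighted.
print(pretty_print('{"title": "<"}', is_xml=False))
# XML output still gets balanced highlighting spans.
print(pretty_print("<api />", is_xml=True))
```

With the format check in place, a stray `<` inside a JSON string can no longer open a span, because the highlighter never runs on non-XML output.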

Well, the HTML is still invalid, though at least the tags should now be balanced.

For example, http://validator.w3.org/check?uri=https%3A%2F%2Fwww.mediawiki.org%2Fw%2Fapi.php%3Faction%3Dquery%26prop%3Drevisions%26rvprop%3Dcontent%26titles%3DAPI%3AMain_page%26format%3Djsonfm&charset=%28detect+automatically%29&doctype=Inline&group=0:


Error Line 34, Column 2302: Bad value http://en.wikipedia.org/w/api.php?format=json&action=query&titles=Main%20Page&prop=revisions&rvprop=content\n for attribute href on element a: Illegal character in query: not a URL code point.

…p;prop=revisions&amp;rvprop=content\n">http://en.wikipedia.org/w/api.php?forma…

Syntax of URL:

Any URL. For example: /hello, #canvas, or http://example.org/. Characters should be represented in NFC and spaces should be escaped as %20.

Error Line 44, Column 8: Stray start tag script.

<script>if(window.mw){


Error Line 44, Column 8: Cannot recover after last error. Any further errors will be ignored.

<script>if(window.mw){
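
The first error above is about a character in the `href` value that is not a URL code point. A simplified, ASCII-only check for such characters might look like the sketch below. The allowed set follows the ASCII part of the WHATWG URL code points, with `%` and `#` additionally permitted so that percent-encoded bytes and fragments pass; the function name `illegal_url_chars` is hypothetical, and the test URL assumes the `\n` shown by the validator is a literal backslash followed by `n` (a raw newline would be flagged as well).

```python
import string

# ASCII subset of the URL code points, plus "%" (percent-encoding) and "#" (fragment).
URL_ASCII_OK = set(string.ascii_letters + string.digits + "!$&'()*+,-./:;=?@_~%#")


def illegal_url_chars(url):
    """Return (index, character) pairs for ASCII characters not allowed in a URL."""
    return [
        (i, ch)
        for i, ch in enumerate(url)
        if ord(ch) < 0x80 and ch not in URL_ASCII_OK
    ]


# The href value from the validator report, written as a raw string so the
# trailing backslash stays literal.
BAD_HREF = (
    r"http://en.wikipedia.org/w/api.php?format=json&action=query"
    r"&titles=Main%20Page&prop=revisions&rvprop=content\n"
)
GOOD_HREF = BAD_HREF[:-2]  # same URL without the trailing "\n"

print(illegal_url_chars(BAD_HREF))
print(illegal_url_chars(GOOD_HREF))
```

A check like this could run before a URL found in help text is turned into a link, so that trailing escape sequences are excluded from the `href`.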


The aforementioned (comment 2) pretty-printing cleanup *should* fix these, though at the time of writing, the current version of the patch still adds the script element after the </html>.
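
The remaining "stray start tag" problem, markup emitted after the closing `</html>`, is easy to test for in a regression check. The sketch below is an assumption-laden illustration: `trailing_after_html_close` is a hypothetical helper, and the sample document is a shortened stand-in for the real API output.

```python
def trailing_after_html_close(document):
    """Return any non-whitespace content after the last </html>, or None if the tag is absent."""
    marker = "</html>"
    idx = document.lower().rfind(marker)
    if idx == -1:
        return None
    return document[idx + len(marker):].strip()


# Shortened stand-in for a page where a script is appended after </html>.
SAMPLE = "<html><body><pre>{}</pre></body></html>\n<script>if(window.mw){}</script>"

print(trailing_after_html_close(SAMPLE))
```

An empty string means nothing follows `</html>`; anything else indicates markup that a validator would report as stray.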

Change 161093 merged by jenkins-bot:
API: Clean up and internationalize pretty-printed output

https://gerrit.wikimedia.org/r/161093

This should be deployed to WMF wikis with 1.25wmf4; see https://www.mediawiki.org/wiki/MediaWiki_1.25/Roadmap for the schedule.