
Large generator results in XML give error instead of truncating
Closed, ResolvedPublic

Description

Generator results that exceed $wgAPIMaxResultSize cause an error in XML output rather than being truncated. This seems to occur when the Page elements themselves push the result size over the maximum.

For example, with $wgAPIMaxResultSize set to some arbitrarily low value for demonstration, like 100,000, a standard generator=allpages query returns this instead of results:

<error code="internal_api_error_MWException" info="Exception Caught: Internal error in ApiFormatXml::recXmlPrint: (pages, ...) has integer keys without _element value. Use ApiResult::setIndexedTagName()." xml:space="preserve">

Interestingly, this occurs regardless of what properties you request, and only occurs with gaplimits over a certain value, regardless of the actual result size (though the gaplimit required to cause the error increases as the max result size does). Both XML and JSON will happily return result sets well above the designated result size (which could be considered another bug), but JSON always seems to return results successfully, where XML fails beyond a certain gaplimit.

Since this becomes nearly impossible to replicate with more normal values of $wgAPIMaxResultSize, I'm marking it as trivial.


Version: 1.22.2
Severity: trivial

Details

Reference
bz68989

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 22 2014, 3:26 AM
bzimport set Reference to bz68989.
bzimport added a subscriber: Unknown Object (MLST).

Whoops, ignore that first paragraph. I'd meant to rewrite it before submitting. The details later are more reflective of the actual behaviour.

The problem isn't that it's hitting the result size limit, it's that something is exiting early and not setting the metadata that the XML formatter requires.

Without information on the specific query you're seeing the problem with, it's difficult to debug this further. I haven't been able to reproduce it locally.

(In reply to Robert Morley from comment #0)
Both XML and JSON will happily return result sets well above the designated result size (which could be considered another bug)

$wgAPIMaxResultSize isn't trying to limit the size of the entire response; it's trying to limit the amount of data being returned, so that someone can't cause issues by doing something like requesting rvprop=content for numerous enormous pages.

As a simple test, with $wgAPIMaxResultSize set to 100,000, I was using:

?action=query&generator=allpages&prop=info&gaplimit=5000

It gives the truncation warning and then errors out. Reducing the gaplimit to 2500, the query worked and gave the expected truncation warning, but no error. I'll e-mail you the actual server, but I don't want to post it publicly just so everybody doesn't go hitting it. It's a development server that's not really meant for everybody and their dog to be launching massive queries. :)
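For anyone wanting to run the same comparison against their own test wiki, the query above can be generated for both formats and both gaplimit values with a small script. This is only a sketch: the endpoint below is a placeholder (the real development server is deliberately not named here), and the helper names are made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute your own test wiki's api.php.
API_URL = "https://wiki.example.org/w/api.php"


def build_query_url(gaplimit, fmt, base=API_URL):
    """Build the generator=allpages query from the report for one format."""
    params = {
        "action": "query",
        "generator": "allpages",
        "prop": "info",
        "gaplimit": gaplimit,
        "format": fmt,
    }
    return f"{base}?{urlencode(params)}"


def reproduction_urls(gaplimits=(2500, 5000), formats=("xml", "json")):
    """Return one URL per (gaplimit, format) combination to compare."""
    return [build_query_url(g, f) for g in gaplimits for f in formats]
```

Fetching the four URLs from reproduction_urls() with $wgAPIMaxResultSize set to 100,000 should be enough to compare the XML and JSON behaviour at each gaplimit.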

Ok, I see what's going on here. The 'pages' array is supposed to be set up by ApiQuery, but your too-small value for $wgAPIMaxResultSize is preventing that. Then prop=info adds some entries to the 'pages' array, assuming it was already set up.

Unless you've set $wgMaxArticleSize to a very low value, your value for $wgAPIMaxResultSize is lower than is even supported. But it would probably be good for the API to actually check it.
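To illustrate the failure described above, here is a toy model in Python. It is not MediaWiki code: ToyResult, run_query, check_xml_formattable and the crude size estimate are all invented for the sketch, and the check_failure flag only shows one plausible shape for such a check, not what the actual patch does.

```python
import json


class ToyResult:
    """Toy stand-in for a size-limited API result (not MediaWiki's ApiResult)."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.size = 0
        self.data = {}

    def add_value(self, path, name, value):
        """Store value under path/name; return False if it would exceed max_size."""
        cost = len(repr(value))  # crude size estimate, for illustration only
        if self.size + cost > self.max_size:
            return False
        node = self.data
        for key in path:
            node = node.setdefault(key, {})  # silently creates missing parents
        node[name] = value
        self.size += cost
        return True


def run_query(result, page_ids, check_failure=False):
    # Step 1 (the "ApiQuery" role): add the page list plus its tag-name metadata.
    pages = {pid: {"pageid": pid, "title": f"Page {pid}"} for pid in page_ids}
    pages["_element"] = "page"
    fitted = result.add_value(["query"], "pages", pages)
    if check_failure and not fitted:
        raise RuntimeError("max result size is too small to hold the page list")
    # Step 2 (the "prop=info" role): assumes 'pages' already exists and adds to it.
    for pid in page_ids:
        result.add_value(["query", "pages", pid], "touched", "2014-01-01")


def check_xml_formattable(node, name="api"):
    """Raise if a dict has integer keys but no '_element' tag name."""
    if isinstance(node, dict):
        if any(isinstance(k, int) for k in node) and "_element" not in node:
            raise ValueError(f"({name}, ...) has integer keys without _element value")
        for key, child in node.items():
            check_xml_formattable(child, str(key))


def to_json(node):
    """The JSON role: integer keys need no tag-name metadata."""
    return json.dumps(node)
```

With ToyResult(max_size=1000) and ten pages, step 1 fits and both to_json and check_xml_formattable are happy. With a hundred pages, step 1 is rejected, step 2 recreates 'pages' without the metadata, to_json still succeeds, and check_xml_formattable raises. Passing check_failure=True turns that into an immediate, more descriptive error.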

Change 151103 had a related patch set uploaded by Anomie:
Check for result size failure in ApiQuery

https://gerrit.wikimedia.org/r/151103

(In reply to Brad Jorsch from comment #4)
Thanks for looking into it. I figured the low value of $wgAPIMaxResultSize was coming into play, as the maximum gaplimit value increased as the max size did. I had noticed that the recommended lower limit was related to $wgMaxArticleSize, but I didn't realize that it would have any significant effect on a query; I assumed that it had more to do with being able to return at least one revision when requested.

This is what I get for trying to test my own generator module...I make everything fail! ;) At least now I know that it's nothing I have to worry about in normal usage. Thanks again!

Change 151103 merged by jenkins-bot:
Check for result size failure in ApiQuery

https://gerrit.wikimedia.org/r/151103

This is now fixed in the current git master.