
TemplateData: page_props limits value length to 65535 bytes (MySQL 'blob' field)
Closed, Resolved, Public

Description

TemplateData for [[pl:Template:Związek chemiczny infobox]] is not shown in VE. No idea why. The prop is visible on [[pl:Special:PagesWithProp]], so this must be a VE bug.


Version: master
Severity: major
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=52582

Details

Reference
bz51740

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 1:54 AM
bzimport added a project: TemplateData.
bzimport set Reference to bz51740.
Bug 51671 has been marked as a duplicate of this bug.

Whoops, wrong bug, please disregard the comment above.

In the background, the server returns the following response:

{"servedby":"mw1205","error":{"code":"templatedata-corrupt","info":"Page #837180 templatedata contains invalid data:Syntax error in JSON."}}

https://pl.wikipedia.org/w/api.php?format=jsonfm&action=templatedata&titles=Template:Zwi%C4%85zek_chemiczny_infobox

The property is registered:

https://pl.wikipedia.org/w/index.php?title=Specjalna:PagesWithProp/templatedata&limit=500&offset=0

This looks like the syntax error:

https://pl.wikipedia.org/w/index.php?title=Szablon:Zwi%C4%85zek_chemiczny_infobox/opis&diff=37123204&oldid=37115636

That diff needs to be approved; then check that the update appears at:

https://pl.wikipedia.org/w/index.php?title=Specjalna:PagesWithProp/templatedata&limit=500&offset=0

(If that doesn't work, maybe '?' is not allowed? *shrug*)

Thanks John; I approved that revision, but it doesn't seem to have solved the issue :(

I notice that the PagesWithProp listing ends at:

je\u015bli podano warto\u015b\u0107 w parametrze \u201e3. napi\u0119cie powierzchniowe\u201d)."},"type"

Maybe the property has a buffer limit.

I have removed many parameters, and that appears to have fixed the problem.

http://www.sadtrombone.com/ …nice catch.

TemplateData is using the page_props table, whose pp_value field is a
'blob', which is limited to 65535 bytes [1]. Of course MySQL, being
MySQL, doesn't complain and just silently truncates the data on
insert.
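
To illustrate why a truncated value surfaces as "Syntax error in JSON", here is a minimal Python sketch. It does not touch MySQL at all: the slice merely simulates a silently truncating insert, and the parameter names and descriptions are made up for the example.

```python
import json

BLOB_MAX_BYTES = 65535  # maximum length of a MySQL BLOB column

# Hypothetical template description, large enough to exceed the limit.
params = {
    f"param{i}": {"description": "zażółć gęślą jaźń " * 5, "type": "string"}
    for i in range(1000)
}
# json.dumps escapes non-ASCII characters as \uXXXX by default,
# which is also how the value looks in the PagesWithProp listing.
raw = json.dumps({"params": params}).encode("utf-8")

# Simulate what a silently truncating insert leaves in the column.
stored = raw[:BLOB_MAX_BYTES]


def try_parse(data: bytes) -> str:
    """Return 'ok' if data is valid JSON, 'syntax error' otherwise."""
    try:
        json.loads(data.decode("utf-8", errors="replace"))
        return "ok"
    except json.JSONDecodeError:
        return "syntax error"


print(len(raw), len(stored))
print("full value:", try_parse(raw))
print("truncated value:", try_parse(stored))
```

The full value parses, while the truncated one cannot, because at the very least the closing braces are missing.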

I see three ways to do something about this:

  • If over 65535 bytes, show a big fat red error to the user instead of silently failing.
  • Store compressed data. JSON compresses well due to the redundancy in object keys and due to being mostly text in this case. This would probably raise the effective maximal length a few times and should be easy to implement.
  • If over 65535 bytes, chunk the data in multiple properties 'templatedata1', 'templatedata2' etc., with some marker stored in regular 'templatedata'. Handle this transparently in the API. Nasty, but would work.
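
The chunking idea in the third option could look roughly like the following Python sketch. Everything here is hypothetical: the "chunked:N" marker format and the function names are invented for illustration, and the properties are modelled as a plain dict rather than rows in page_props.

```python
BLOB_MAX_BYTES = 65535  # limit of a MySQL BLOB column


def split_into_props(data: bytes, limit: int = BLOB_MAX_BYTES) -> dict[str, bytes]:
    """Spread data over 'templatedata1', 'templatedata2', ... if it is too long.

    The regular 'templatedata' property then holds only a marker with the
    number of chunks. The marker format is an assumption of this sketch.
    """
    if len(data) <= limit:
        return {"templatedata": data}
    chunks = [data[i:i + limit] for i in range(0, len(data), limit)]
    props = {"templatedata": f"chunked:{len(chunks)}".encode("ascii")}
    for n, chunk in enumerate(chunks, start=1):
        props[f"templatedata{n}"] = chunk
    return props


def join_props(props: dict[str, bytes]) -> bytes:
    """Reassemble the original value from the properties."""
    head = props["templatedata"]
    # A JSON object starts with '{', so it cannot be mistaken for the marker.
    if not head.startswith(b"chunked:"):
        return head
    count = int(head[len(b"chunked:"):])
    return b"".join(props[f"templatedata{n}"] for n in range(1, count + 1))


# Usage: a dummy payload of about 150 kB ends up in three chunks plus the marker.
payload = b'{"params": {}}' + b" " * 150000
props = split_into_props(payload)
print(sorted(props))
print(join_props(props) == payload)
```

A reader that goes through join_props never sees the chunks, which is the "handle this transparently in the API" part; the cost is that every consumer of the raw property has to know about the marker.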

Adjusting bug summary and component accordingly.

[1] https://www.mediawiki.org/wiki/Manual:Page_props_table

Option 4: store the JSON in a dedicated wikipage, and fiddle with TemplateData so it supports it.

I chatted briefly with Krinkle and I'll try to implement solution #1 and #2.

I have no preference on #4, but it would be a large-ish architectural change now and would probably require migrating all of the templatedatas already added on wikis to the new format (or supporting both ways).

(In reply to comment #7)

> Option 4: store the JSON in a dedicated wikipage, and fiddle with TemplateData so it supports it.

This is what I suggested on bug 50512 (to solve another problem).

Change 75324 had a related patch set uploaded by Matmarex:
Bail when JSON length exceeds database limits

https://gerrit.wikimedia.org/r/75324

Change 75330 had a related patch set uploaded by Matmarex:
Store compressed JSON since size is limited

https://gerrit.wikimedia.org/r/75330

I tested the patches above with [[pl:Template:Związek chemiczny infobox]]: size of raw processed JSON is ~112 kB, size of compressed JSON is ~7.5 kB. Looks good enough for practical use.
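
A comparable measurement can be reproduced with a short Python sketch. This is not the code from the patches: zlib is an assumed stand-in for whatever compression they use, the parameter data is synthetic, and check_fits is a hypothetical helper that mirrors the "bail when too long" idea.

```python
import json
import zlib

BLOB_MAX_BYTES = 65535  # limit of a MySQL BLOB column

# Synthetic parameter list; the keys repeat for every parameter,
# which is the redundancy that makes this kind of JSON compress well.
params = {
    f"param{i}": {
        "label": f"Parameter {i}",
        "description": f"Description of parameter number {i}.",
        "type": "string",
        "required": False,
    }
    for i in range(1000)
}
raw = json.dumps({"params": params}).encode("utf-8")
packed = zlib.compress(raw, 9)


def check_fits(blob: bytes, limit: int = BLOB_MAX_BYTES) -> None:
    """Raise instead of letting the database truncate the value silently."""
    if len(blob) > limit:
        raise ValueError(f"value is {len(blob)} bytes, limit is {limit}")


print("raw bytes:", len(raw))
print("compressed bytes:", len(packed))
check_fits(packed)  # passes; check_fits(raw) would raise ValueError
print("round trip ok:", zlib.decompress(packed) == raw)
```

Synthetic data like this compresses better than real descriptions would, so the ratio printed here should not be read as a prediction for any particular template.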

Change 75324 merged by jenkins-bot:
Bail when JSON length exceeds database limits

https://gerrit.wikimedia.org/r/75324

This problem can also occur with templates that have a large number of numbered fields.
A raw skeleton without descriptions for http://en.wikipedia.org/wiki/Template:Infobox_officeholder
takes 174,407 bytes, because it has very many numbered fields (vicepresident2 ... vicepresident14, etc.). Fixing bug 52582 (allowing autonumbered parameters) might solve the problem in such cases.

Change 75330 merged by jenkins-bot:
Store compressed JSON since size is limited

https://gerrit.wikimedia.org/r/75330