Author: rodrigosprimo
Description:
Patch to fix issues in the MediaWiki export schema (XSD)
Hi,
I'm trying to validate a MediaWiki XML file against its schema (XSD) file, but the validation fails. I'm using PHP's DOMDocument and haven't tried the validation with other tools, so I can't be sure whether the problem is in PHP or on the MediaWiki side, but I guess MediaWiki is the more likely cause.
I'm testing with the attached script (testMediawikiXml.php). When I try to validate the XML from http://en.wikipedia.org/wiki/Special:Export/Train I get the following error:
Element '{http://www.w3.org/2001/XMLSchema}element': The attribute 'name' is required but missing
This error can be fixed by commenting out line 119 of http://www.mediawiki.org/xml/export-0.4.xsd. That line reads:
<element minOccurs="0" maxOccurs="1" type="mw:DiscussionThreadingInfo" />
I guess the best solution is to add the "name" attribute, but I haven't investigated further and I don't know enough about XML Schema to say what the value of the "name" attribute should be.
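For anyone who wants to look for similar declarations without running a full validator, here is a minimal Python sketch. It is not the attached testMediawikiXml.php, and the schema fragment in it is an illustrative stand-in rather than the real export-0.4.xsd; only the second `<element>` line is taken from this report. It relies on the XML Schema rule that an element declaration inside a content model needs either a "name" or a "ref" attribute. The function name find_unnamed_elements is hypothetical.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Illustrative fragment only (an assumption), NOT the real export-0.4.xsd.
# The second <element> mirrors the line quoted in this report.
SAMPLE_XSD = """\
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:mw="http://www.mediawiki.org/xml/export-0.4/">
  <complexType name="Example">
    <sequence>
      <element name="title" type="string" />
      <element minOccurs="0" maxOccurs="1" type="mw:DiscussionThreadingInfo" />
    </sequence>
  </complexType>
</schema>
"""


def find_unnamed_elements(xsd_text):
    """Return the attributes of every xs:element that has neither name nor ref."""
    root = ET.fromstring(xsd_text)
    tag = "{%s}element" % XSD_NS
    return [
        dict(el.attrib)
        for el in root.iter(tag)
        if "name" not in el.attrib and "ref" not in el.attrib
    ]


unnamed = find_unnamed_elements(SAMPLE_XSD)
for attrs in unnamed:
    print("element without name or ref:", attrs)
```

To check the real schema, the text of a downloaded copy of export-0.4.xsd could be passed to the function instead of SAMPLE_XSD.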
If I run the script again, another error occurs:
Element '{http://www.mediawiki.org/xml/export-0.4/}namespace', attribute 'case': The attribute 'case' is not allowed.
To fix this one, I added the following line after line 92:
<attribute name="case" type="string" />
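A quick way to see which attributes the schema has to allow on `<namespace>` is to collect them from an export. The Python sketch below does that on a hypothetical, trimmed-down export. The sample document, including the "key" attribute and the "first-letter" value, is an assumption for illustration; only the namespace URI and the "case" attribute come from the error message above. The function name namespace_attributes is hypothetical.

```python
import xml.etree.ElementTree as ET

MW_NS = "http://www.mediawiki.org/xml/export-0.4/"

# Hypothetical, trimmed-down export (an assumption); a real file from
# Special:Export is much larger.
SAMPLE_EXPORT = """\
<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.4/">
  <siteinfo>
    <namespaces>
      <namespace key="0" case="first-letter" />
      <namespace key="1" case="first-letter">Talk</namespace>
    </namespaces>
  </siteinfo>
</mediawiki>
"""


def namespace_attributes(export_text):
    """Return the sorted set of attribute names used on <namespace> elements."""
    root = ET.fromstring(export_text)
    names = set()
    for el in root.iter("{%s}namespace" % MW_NS):
        names.update(el.attrib)
    return sorted(names)


attrs = namespace_attributes(SAMPLE_EXPORT)
print(attrs)
```

Any attribute name this reports that the schema does not declare for the namespace type would trigger the same "attribute is not allowed" error.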
After those two changes to the schema file, I'm able to validate the XML file. I'm attaching the script I'm using to test and a patch with the changes I made to the schema file. I think the second change is OK, but the first issue needs a proper fix rather than just commenting out the line.
Thanks, Rodrigo.
Version: unspecified
Severity: normal
Attached: