This is important because the extracts are currently generated with the old API, which doesn't guarantee valid XHTML5. That prevents third-party developers who work in less forgiving markup environments from fetching the extracts and building tools on top of them; see e.g. https://github.com/waldyrious/primerpedia/issues/20
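To illustrate what "less forgiving" means here, below is a minimal sketch of how a strict XML parser reacts to markup that a browser would accept. The two sample fragments and the helper name `is_well_formed` are made up for illustration; they are not actual output of the extracts API.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML once wrapped in a single root element."""
    try:
        # Wrap in a root element so a fragment with several top-level nodes can be parsed.
        ET.fromstring(f"<root>{fragment}</root>")
        return True
    except ET.ParseError:
        return False


# Hypothetical sample fragments (assumptions, not real extract output).
lenient_html = "<p>First line<br>second line</p>"   # unclosed <br>, fine for HTML parsers
strict_xhtml = "<p>First line<br/>second line</p>"  # self-closed <br/>, well-formed XML

print(is_well_formed(lenient_html))
print(is_well_formed(strict_xhtml))
```

A client that feeds the extract to a parser like this fails on the first fragment, which is the kind of breakage described in the linked issue.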
I'd work on this myself, but I don't have the background to understand the code in ApiQueryExtracts.php on my own, even after looking at VisualEditor's ApiVisualEditor.php for an example implementation of a request to Parsoid, as suggested by Mark Traceur.
Version: unspecified
Severity: enhancement