Page MenuHomePhabricator

content-holding <div> should only contain the page text
Closed, ResolvedPublic8 Estimated Story Points

Description

Author: a.d.bergi

Description:
The new bidirectionality-improving div should only contain the parsed page content, nothing else. This is perfect when viewing articles, and helps scripts a lot when extracting that content.

But I think it is only needed for article views. For example in diff mode, it contains

<form id="mw-fr-reviewform">
<div id="mw-fr-diff-headeritems">
<table class="diff">
<hr class="diff-hr"/>
<h2 class="diff-currentversion-title">

...and then the real content

Also, in edit mode the <div id="mw-content-text"> contains

<div id="editnotice-ns-0"/>
<div id="mw-edit-longpage-hint">
<div id="wikiPreview" class="ontop" style="display: none;"/>
<form id="editform">

None of them is really content-text, I suspect. I think it would be better to have the div only appear (maybe empty) in the #wikiPreview.

Also, it for example broke my script which tried to

document.getElementById("bodyContent").insertBefore(newElement, document.getElementById("editform"));

Related Objects

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 22 2014, 12:10 AM
bzimport set Reference to bz35247.
bzimport added a subscriber: Unknown Object (MLST).

(In reply to comment #0)

Also, it for example broke my script which tried to
document.getElementById("bodyContent").insertBefore(newElement,document.getElementById("editform"));

Please elaborate what you mean here.

a.d.bergi wrote:

Also, it for example broke my script which tried to
document.getElementById("bodyContent").insertBefore(newElement,
document.getElementById("editform"));

Please elaborate what you mean here.

I have a script that inserted some edit tools into #bodyContent, right before #editform. Since r111647 that doesn't work, as #editform now is a child of #mw-content-text.

OK, it was a minor issue and very easy to fix. I only wanted to mention that some tools may be broken that make assumptions about the child nodes of mw.util.$content, which might be quite common.

Anomie subscribed.

I've become interested in this thanks to TemplateStyles. Currently, TemplateStyles scopes all styles to the id="mw-content-text" element to prevent styles from accidentally messing with site UI. I just now loaded a random page in VE and noticed that its content being edited is not within that element, so the template styles won't be applied in the editor. This isn't good.

At the same time, we have this bug complaining that the id="mw-content-text" element sometimes contains more than just the page output, for example it contains the diff tables, the edit form, category text, image metadata, and so on. The mw-content-ltr and mw-content-rtl classes mentioned here do slightly better, they don't hold the diff tables but do hold parts of category lists, the entire Flow page, and so on. And having to target multiple classes would bloat TemplateStyles's output.

It looks like the most straightforward option to getting just the parser output to be wrapped in some sort of container is to do just that: stick some code like this in near this line in the Parser:

# Wrap non-interface ParserOutput in a class
if ( !$this->mOptions->getInterfaceMessage() ) {
    $text = Html::rawElement( 'div', [ 'class' => 'mw-parser-output' ], $text );
}

I don't know whether VisualEditor and StructuredDiscussions will get fixed up by that, though, or if they'll need some other work. I tried to set them up in mediawiki-vagrant, but neither actually functioned.

If we don't do that, we may be able to get by with a few more changes where ParserOutput is used.

  • OutputPage::addParserOutputText(), either before or after the 'OutputPageBeforeHTML' hook call, will catch most cases.
    • We'll probably need to copy the interface flag into the ParserOutput so that can only wrap the non-interface ParserOutputs.
  • EditPage::doPreviewParse() needs to wrap its HTML too.
  • Possibly the "Empty content container for LivePreview" in EditPage::displayPreviewArea() should have the class added.
  • TemplateSandbox::templateSandboxPreview() where it basically duplicates EditPage::getPreviewText()
  • Translate::getDocumentationBox() where it displays the qqq message
  • Wherever in Flow that it generates the flow-post-content div.
  • VisualEditor would probably need to add the class to its wrapper div, probably the same one that already has the 'mw-content-ltr' class.

Maybe you could have a ParserOption for it, so that the parser tests can still pass without too much trouble? Have the wrapper be on by default, but turn it off for parser tests. Other than that, I think it should be fine to add it to Parser::parse(). Write the patch and we'll see what happens.

Change 350634 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@master] Wrap parser output in <div class="mw-parser-output">

https://gerrit.wikimedia.org/r/350634

Change 350635 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/extensions/MobileFrontend@master] Update tests for core change If4eb5bf7

https://gerrit.wikimedia.org/r/350635

Change 350636 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/extensions/Wikibase@master] Update tests for core change If4eb5bf7

https://gerrit.wikimedia.org/r/350636

Adding @bearND, @Mholloway, @JoeWalsh, @Jdlrobson, @MarkTraceur for visibility - you most likely want to double check that CSS selector expressions keep working in reading and, to the extent they're employed in the experiences you engineer, editing contexts. There may be no problem, but I would be remiss to not note it.

CC @Fjalapeno

Change 350636 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Update tests for core change If4eb5bf7

https://gerrit.wikimedia.org/r/350636

Does it have to be a class (instead of an id) because of Flow?

Flow or anything else that might want to put two "pages" of content in one HTML document. For example, a side-by-side comparison of the rendered text might make sense somewhere, maybe for translation or something.

you most likely want to double check that CSS selector expressions keep working in reading and, to the extent they're employed in the experiences you engineer, editing contexts. There may be no problem, but I would be remiss to not note it.

As far as selectors, you're probably good unless you're using the direct-decendent > selector. Potentially more interesting would be things like MobileFrontend that try to chop up the parser output (although for MobileFrontend specifically I already submitted this patch).

Change 350635 merged by jenkins-bot:
[mediawiki/extensions/MobileFrontend@master] Update tests for core change If4eb5bf7

https://gerrit.wikimedia.org/r/350635

It's not clear to me why a separate class is needed here. How is mw-parser-output semantically different from mw-body-content?

I assume that mw-parser-output would solely be applied to output that literally comes from the parser, no exception. However while that is a clear use case, the use case for styling that output seems slightly wider. Because the purpose of mw-body-content right now is for anything that is meant to be styled the same way as page content.

While these are different technical purposes, I'd like to make sure we avoid cases where one set mw-parser-output to mimic certain styles, or alternatively where one would have to target both in CSS to make sure it always applies.

A few questions to help narrow it down:

  1. Is it valid for MediaWiki core interface styles and skin styles to ever target mw-parser-output? I propose no, they must use mw-body-content instead.
  2. Do we support outputting mw-parser-output on a MediaWiki-powered HTML response without being a descendant of mw-body-content? I propose no, an element with mw-parser-output must either also have mw-body-content or be a descendant of an element with that class.

This leaves the question: What is the risk of allowing TemplateStyles to target mw-body-content? The answer, as I understand it, is that we have some interface components (like page heading, categories, notices/banners, Translate extension controls, FlaggedRevs extension controls) that should be styled the same way as page content (e.g. their headings and links etc.) but don't always come straight out of the parser. And presumably we want to avoid these elements from being styled by a template on the current page.

In that case I have a few more questions:

  • Would it be appropiate to instead have these common styles use mw-body instead of mw-body-content in their CSS? E.g. for skin headings, links etc. Thus turning mw-body-content into the proposed mw-parser-output. One problem with this is that Parsoid prefers to only use mw-body-content, and not mw-body, and still load Vector styles and have a page that looks like it was cut out from a Vector-skin page view but without the skin wrapping. Applying mw-body has the side effect of needlessly reserving space for the sidebar and such. Although it would be trivial to fix Vector to move these styles to a vector-specific class name on the same element since strictly speaking this should not have been applied to mw-body in the first place.
  • Given that firstHeading is partially provided by Parser. And banners are entirely parsed by the parser. Will the class name apply there? If yes, then the current page has the ability to affect the notices, which are supposed to be restricted. If not, then notices cannot use templates properly. Perhaps it makes sense to have mw-parser-output be a dynamically generated class instead of a fixed name (e.g. mw-parser-output-123), so that TemplateStyles can be restricted to the correct context only?
  • Besides elements provided by the parser (first heading, and site notices), is there a problem with moving the Translate and FlaggedRevs extension controls out of mw-body-content? Or more specifically, is there a problem with removing mw-body-content from #bodyContent in Vector and instead applying it to the relevant child nodes only?

It's not clear to me why a separate class is needed here. How is mw-parser-output semantically different from mw-body-content?

.mw-body-content includes a lot more than just the parser output, for example the edit form, the diff tables, and so on, and so wouldn't fix the complaints about those being affected by TemplateStyles.

Also .mw-body-content is added by the skin. Of the 42 skins in extensions/skins, it looks like only 12 have any mention of it.[1] Of the skins on enwiki, Monobook and Vector have it while CologneBlue and Modern don't.

[1]: Based on a grep for mw-body-content

I assume that mw-parser-output would solely be applied to output that literally comes from the parser, no exception.

That's the plan.

  1. Is it valid for MediaWiki core interface styles and skin styles to ever target mw-parser-output? I propose no, they must use mw-body-content instead.

I doubt there'd be reason for them to.

  1. Do we support outputting mw-parser-output on a MediaWiki-powered HTML response without being a descendant of mw-body-content? I propose no, an element with mw-parser-output must either also have mw-body-content or be a descendant of an element with that class.

Since 70% of skins in Gerrit don't even wrap the main content of the page in mw-body-content, yes, we support that. I have no objection to someone deciding to fix that somehow, but let's not let it block this.

This leaves the question: What is the risk of allowing TemplateStyles to target mw-body-content?

For affecting things it shouldn't affect, largely the same as it targeting #mw-content-text since in Monobook and Vector #mw-content-text is a child of .mw-body-content.

For not affecting things is should affect, pretty large since 70% of skins in Gerrit don't wrap the main page content in that class.

The answer, as I understand it, is that we have some interface components (like page heading, categories, notices/banners, Translate extension controls, FlaggedRevs extension controls) that should be styled the same way as page content (e.g. their headings and links etc.) but don't always come straight out of the parser. And presumably we want to avoid these elements from being styled by a template on the current page.

Yes, that seems accurate.

  • Would it be appropiate to instead have these common styles use mw-body instead of mw-body-content in their CSS? E.g. for skin headings, links etc. Thus turning mw-body-content into the proposed mw-parser-output.

Only if we kill mw-body-content from all skins and start adding it in core in the way the current patch adds mw-parser-output. I'd rather pick a new name than try to make major changes to an old one.

Again, mw-body is added by skins. It looks like 15 of 42 skins in Gerrit don't add it, but all four of the skins on enwiki do.[2]

[2]: Based on a grep for mw-body($|[^a-z-])

  • Given that firstHeading is partially provided by Parser. And banners are entirely parsed by the parser. Will the class name apply there?

Currently, it is not applied to firstHeading. I don't have banners on my local wiki to test, but if they use Parser::parse() rather than something like Message::parse() they would. Or whatever generates the banner could always add the wrapper itself if necessary.

If yes, then the current page has the ability to affect the notices, which are supposed to be restricted. If not, then notices cannot use templates properly. Perhaps it makes sense to have mw-parser-output be a dynamically generated class instead of a fixed name (e.g. mw-parser-output-123), so that TemplateStyles can be restricted to the correct context only?

Hmm, that's not a bad idea. OTOH, callers that didn't want the default mw-parser-output could always use ParserOptions->setWrapOutputClass() to set something else.

  • Besides elements provided by the parser (first heading, and site notices), is there a problem with moving the Translate and FlaggedRevs extension controls out of mw-body-content? Or more specifically, is there a problem with removing mw-body-content from #bodyContent in Vector and instead applying it to the relevant child nodes only?

Just that it would be a major redefinition of what exactly .mw-body-content applies to.

Change 350634 merged by jenkins-bot:
[mediawiki/core@master] Wrap parser output in <div class="mw-parser-output">

https://gerrit.wikimedia.org/r/350634

Anomie claimed this task.

Note this won't be included in 1.30.0-wmf.1, but should be in 1.30.0-wmf.2.

I don't know whether VisualEditor and StructuredDiscussions will get fixed up by that, though, or if they'll need some other work. I tried to set them up in mediawiki-vagrant, but neither actually functioned.

Now that the patch is merged and on Beta Labs, I was able to confirm that neither VE's editing area nor Flow's comments are getting wrapped in the new class. Filed T164790 and T164791 for those extensions.

Here's a late annoying question – should contents of page status indicators (<indicator name="…"></indicator> in wikitext; https://www.mediawiki.org/wiki/Help:Page_status_indicators) be wrapped in this new wrapper?

Good question. I'd say yes, since indicators are user-generated content. It'd probably be a good idea to add the class to the indicators' wrapper element.

Change 352711 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/services/parsoid@master] Add mw-parser-output class to Parsoid's output

https://gerrit.wikimedia.org/r/352711

At svwiki we have

to convey information about protected pages, groups and blocked users in a reliable way, unlike templates that anyone can add or remove. This is not user-generated content, so if you're going to do this, I would prefer if you can just wrap the individual indicators that come from the wikitext, not the container for all indicators.

Change 352835 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@master] Add mw-parser-output to indicators container

https://gerrit.wikimedia.org/r/352835

Change 352711 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Add mw-parser-output class to Parsoid's output

https://gerrit.wikimedia.org/r/352711

Change 352835 abandoned by Anomie:
Add mw-parser-output to indicators container

Reason:
I'll let someone else figure out those details.

https://gerrit.wikimedia.org/r/352835

Can someone clarify for me if this is why ParserOptions::setWrapOutputClass() no longer takes 'false' as an option (this being the only sane way I could find to disable the entire wrapper when running parser->parse on a thing)?

I ask because I'm calling parser->parse on several things individually and we're just wrapping the final result in the relevant classes, and having each bit wrapped is causing other problems, but I can't figure out how to properly disable that now...

And if this is why, this is the relevant patch that this is causing problems for: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CollaborationKit/+/502902 Is there some other way I should be doing this?
Note that this isn't wikitext; merely contains arbitrary amounts of it.

Or if this is the wrong thing, can someone point me to the correct task(s)?

Can someone clarify for me if this is why ParserOptions::setWrapOutputClass() no longer takes 'false' as an option (this being the only sane way I could find to disable the entire wrapper when running parser->parse on a thing)?

"This" being what? This task is the reason that method exists in the first place. You can disable wrapping by using the clearWrapperDivClass() method or the wrapperDivClass option of the getText() method in ParserOutput.

The reason false was deprecated is T181846: Use post-cache transforms to remove `wrapclass` from the parser cache key (git blame is your friend...).

Thanks, that helps a lot. I wasn't even clear on what I should be looking for in the first place . and just found this task because it was mentioned in a comment somewhere possibly relevant...

Od1n updated the task description. (Show Details)