Page MenuHomePhabricator

Edit form eats leading newlines on save/preview
Closed, ResolvedPublic

Description

Author: thomasV1

Description:
procedure : create an article with a few heading CRs.

Each time the page is edited, MediaWiki edit eats one CR.
The same behaviour is observed with 'preview'.

this behaviour is also reported here :
http://meta.wikimedia.org/wiki/Help:Newlines_and_spaces


Version: 1.11.x
Severity: normal

Details

Reference
bz12130

Event Timeline

bzimport raised the priority of this task from to Unbreak Now!.Nov 21 2014, 9:56 PM
bzimport set Reference to bz12130.

What is a "heading CR"? Do you mean a blank line at the start of the article?

(In reply to comment #1)

What is a "heading CR"? Do you mean a blank line at the start of the article?

Yes. One of the blank lines at the start of an article will disappear every time Save or Preview is clicked.

On a quick examination, all line-breaks seem to be making it into the HTML output, but at least some browsers are stripping the first line break.

(Confirmed Firefox 2.0.0.11 and Safari 3.0.4 on Mac OS X 10.4.11.)

I did a quick test replacing the raw \n chars with escaped 
 values, but found no change in behavior with these browsers. Kind of... weeeeeeird. :(

Created attachment 4414
Test patch to EditPage.php slapping in some escaped chars

Attached:

thomasV1 wrote:

I just tried your patch with firefox, but I did not notice any change...

We already have START, why don't we just strip all empty lines from the start of the page on our own and have people use START when they need whitespace at the beginning.

That should even it out so nothing breaks.

Created attachment 5105
testcase: textarea with two leading blank lines

This test case has three CR+LF ("0d 0a 0d 0a 0d 0a") resulting in two leading blank lines before the text. Firefox 3 is encoding the three CR+LF as "%0A%0A"

Attached:

Updated summary to make the bug easier to search for.

thomasV1 wrote:

*** Bug 26028 has been marked as a duplicate of this bug. ***

  • Bug 26028 has been marked as a duplicate of this bug. ***

*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*

(In reply to comment #11)

  • Bug 26028 has been marked as a duplicate of this bug. ***

FIY: For an example of how this affects Wikisources, making it necessary to use a hack[1] on various pages, see
[[Wikisource:Scriptorium/Archives/2010-12#Why_line_breaks_are_removed.3F]]
(and bug 26028 comment 0).

[1]https://secure.wikimedia.org/wikipedia/sources/wiki/Wikisource_talk:ProofreadPage#ProofreadPage_eats_.5Cn

As a confirmation:

This bug is still happening on MediaWiki 1.18. Steps to reproduce:

  • Edit or create a wiki page and insert 10 lines on top of the page.
  • Save the page
  • Any of these:

*1) Go to edit mode, save the page.
*2) Go to edit mode, change a word, save the page
*3) Go to edit mode, change word, preview, preview, preview, save the page

Expected result:

  1. nothing happends, null edit, no revision
  2. One word changed
  3. One word changed

Actual result:

  1. An edit removing the first line is saved
  2. An edit removing the first line and changing a word is saved
  3. An edit changing a word and removing the first 4 lines (every hit on preview chopped one more line)

Proposed change:
Trim leading whitespace in the parser before saving, just like with trailing whitespace. Never to be saved to the database. This means on existing pages it will be trimmed in 1 edit alltogether.

Alternative solution (if there's a usecase for having leading whitespace. Not sure if there is, and if there isn't, might as well strip it. However templates might want leading whitespace in certain circumstances maybe - not sure):

I'm pretty sure this behaviour comes about because if someone writes in html something like
<textarea>
foo
</textarea>
They'd assume the first newline is not part of the textarea (because its prettier to write tags on their own line). Solution would be prefix with an extra newline if the textarea's value starts with a newline.

(Changing title since CR makes me thing code review instead of carriage return. Also the cool OS's don't use a CR in a newline anyhow ;)

(In reply to comment #15)

.. (if there's a usecase for having leading whitespace. Not
sure if there is, and if there isn't, might as well strip it. However templates
might want leading whitespace in certain circumstances maybe - not sure)..

Wikisource proofreading currently depends on leading whitespace. Text from two pages are merged into one paragraph if there is no blank line separating them.


case 1:

page 1:
... This
is a sentence in the mid-

page 2: dle of a parapgraph and it must
be merged together.

should result in output of

<p>... This is a sentence in the middle of a parapgraph and it must be merged together. ... </p>


case 2:

page 1:
.. This is the final sentence
in a paragraph.

page 2:

This is the first sentence in a
new paragraph. ...

should result in

<p>... This is the final sentence in a paragraph.</p>

<p>This is the first sentence in a new paragraph. ...</p>

Adding need-unittest keyword and assigning myself; I think we can actually use a JS qunit test to confirm the browser behavior; if it's consistent on all platforms then we can tweak how we output textareas to force an initial (always ignored) newline, leaving the other newlines to fend for themselves.

r98576 adds test cases trying several different numbers of initial newlines as set both via HTML parsing (where it does strip) and direct DOM manipulation (where it doesn't).

I have the vague recollection that in XHTML mode things usually don't strip... but we don't use that actively as it tends to break in so many browsers. :)

Should be fixed by r98578 on trunk: prepends an extra newline to output in Html::textarea() if the $value has an initial newline.

However...... ProofreadPage stuff looks like it might be a totally separate thing (bug 26028) since the JS does a lot of manipulation.

Merged to REL1_18 in r98584; still needs merging to 1.18wmf1 for deployment to wikis that have been upgraded to 1.18. 1.17 wikis will be upgraded soon enough. :)

We've found this behavior is in fact spec'd in HTML 5 draft spec (and wouldn't surprise me if it's hiding in the HTML 4 as well somewhere)

http://www.whatwg.org/specs/web-apps/current-work/multipage/syntax.html#element-restrictions