Echo notification emails are sanitized strangely
Open, Medium, Public

Description

Things that need escaping seem to be getting removed entirely in talk page notification emails.

For instance, the email for https://www.mediawiki.org/w/index.php?title=User_talk:Emufarmers&diff=1037143&oldid=1037136 consisted of:
Jack Phoenix left a message on your talk page in "|".
So I heard that orange is your favorite color, especially when wrapped in a nice little that has class="usermessage"... --


Version: unspecified
Severity: normal

Details

Reference
bz66630

Event Timeline

bzimport raised the priority of this task from to Needs Triage. Nov 22 2014, 3:12 AM
bzimport added a project: Notifications.
bzimport set Reference to bz66630.
bzimport added a subscriber: Unknown Object (MLST).

Can you upload or forward me the raw email?

Notification email

Attached:

I believe Echo has its own parser. It looks like there are two distinct issues here:

  • in the message body text, "<code>&lt;div&gt;</code>" is getting stripped for some reason
  • the parser is eating the ":" in the ":|" subject line

It probably makes sense to add parser tests for these issues and then fix them. Tentatively adding good first task.

MZMcBride renamed this task from Notification emails are sanitized strangely to Echo notification emails are sanitized strangely. Dec 8 2014, 5:26 AM
MZMcBride set Security to None.
MZMcBride added a subscriber: Emufarmers.

Checked in betalabs with the original text.

the parser is eating the ":" in the ":|" subject line

It's still true.

Apart from not displaying ":" in the title, the Echo notification and email look OK to me. @SBisson, can you confirm?

So, the following text

== :| ==

Test! —[[User:Emufarmers|Emufarmers]]<sup>([[User talk:Emufarmers|T]]|
[[Special:Contributions/Emufarmers|C]])</sup> 23:39, 14 June 2014 (UTC)

:So I heard that orange is your favorite color, especially when wrapped in a 
nice little <code>&lt;div&gt;</code> that has <code>class="usermessage"
</code>... --[[User:Jack Phoenix|Jack Phoenix]] <sub>([[User talk:Jack 
Phoenix|Contact]])</sub> 23:54, 14 June 2014 (UTC)

will result in

Screen Shot 2016-05-20 at 4.21.20 PM.png (357×1 px, 73 KB)

The email received (user names were modified):

Screen Shot 2016-05-20 at 4.19.59 PM.png (385×779 px, 53 KB)

The Echo notification came out as:

Screen Shot 2016-05-20 at 4.20.27 PM.png (180×575 px, 34 KB)

I think :| is the text of the header. The colon shouldn't be interpreted as indentation and removed.

The problem is probably with EchoDiscussionParser.

I think it's because we parse the section title in isolation, and some syntax like : only works at the start of a line.

jmatazzoni claimed this task.
jmatazzoni subscribed.

Are we OK with the anomalies described? They all sound pretty edge-case. Can I leave this closed?

Yeah, it's probably fine.

Huh? Unless this issue has been fixed, please leave it open.

Sorry, I thought this was part of a larger bug, but instead this task is just about == :| ==.

This bug happens because the section title is extracted and parsed on its own. This means the : is now at the beginning of the line, which gives it special meaning (indent). You'll probably see similar issues with == *foo == because * at the beginning of a line starts a bullet list.
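
To illustrate the mechanism, here is a toy model in Python. It is not EchoDiscussionParser or the MediaWiki parser; the function names and the set of line-start characters are assumptions made for this sketch. It only shows how a heading's text changes meaning once it is extracted and treated as a line of its own.

```python
import re

# Toy model only: characters assumed to act as block markup at line start.
LINE_START_CHARS = ":*#;"


def extract_heading_text(line: str) -> str:
    """Return the text between the == markers of a level-2 heading line."""
    match = re.fullmatch(r"==\s*(.*?)\s*==", line.strip())
    if match is None:
        raise ValueError(f"not a level-2 heading: {line!r}")
    return match.group(1)


def render_line_in_isolation(text: str) -> str:
    """Render text as if it were a document of its own.

    Leading indent/list characters are consumed as block markup,
    so they never reach the output.
    """
    return text.lstrip(LINE_START_CHARS).strip()


def heading_as_notification_title(line: str) -> str:
    """Extract the heading text, then render it in isolation (the buggy path)."""
    return render_line_in_isolation(extract_heading_text(line))
```

In this model, `heading_as_notification_title("== :| ==")` gives `|`, which matches the subject seen in the email, and `== *foo ==` gives `foo`.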

I don't know what in MW core we could possibly use to work around this; perhaps we could call the code that generates section links in edit summaries?
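
One possible shape for a workaround, sketched in Python as a toy model and not using anything from MW core: before the extracted title is parsed, replace a leading line-start character with its numeric character reference, so it no longer sits at the start of the line as markup. The helper names and the toy renderer below are hypothetical, and whether Echo could do this in practice is an assumption.

```python
import html

# Toy model only: characters assumed to act as block markup at line start.
LINE_START_CHARS = ":*#;"


def neutralize_line_start(title: str) -> str:
    """Replace a leading markup character with a numeric character reference."""
    if title and title[0] in LINE_START_CHARS:
        return f"&#{ord(title[0])};{title[1:]}"
    return title


def toy_render(text: str) -> str:
    """Stand-in renderer: eats leading block markup, then decodes entities."""
    without_block_markup = text.lstrip(LINE_START_CHARS)
    return html.unescape(without_block_markup).strip()
```

Here `toy_render(":|")` loses the colon, while `toy_render(neutralize_line_start(":|"))` keeps the title as `:|`. Titles that do not start with one of these characters pass through unchanged.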