Plaintext extracts could append '*' and '#' for lists
Open, Low, Public, Feature

Description

Currently, list items are simply converted to plain lines in plain-text extracts, so the list markers are lost. Bullet lists should use *, while numbered lists should use a localisable format (by default "%d.").

Source page wikitext:

This is a bullet list.
* One
* Two

This is a numbered list.
# One
# Two
# Three

Current output for plaintext extracts:

This is a bullet list.
One
Two

This is a numbered list.
One
Two
Three

Desired output:

This is a bullet list.
* One
* Two

This is a numbered list.
1. One
2. Two
3. Three
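
The conversion described above can be sketched in Python. This is only an illustration of the requested behaviour, not TextExtracts code: the function name render_lists and its numbered_format parameter are hypothetical, it works on raw wikitext lines rather than parsed HTML, and it handles top-level list items only (nested lists such as "**" or "##" are out of scope).

```python
def render_lists(wikitext: str, numbered_format: str = "%d.") -> str:
    """Render top-level '*' and '#' list items as plain text.

    numbered_format stands in for the localisable format from the
    description (default "%d."); the name is an assumption.
    """
    out = []
    counter = 0  # position within the current numbered list
    for line in wikitext.splitlines():
        if line.startswith("#"):
            counter += 1
            marker = numbered_format % counter
            out.append(f"{marker} {line[1:].strip()}")
        else:
            counter = 0  # any non-numbered line ends the numbered list
            if line.startswith("*"):
                out.append(f"* {line[1:].strip()}")
            else:
                out.append(line)
    return "\n".join(out)


source = (
    "This is a bullet list.\n* One\n* Two\n\n"
    "This is a numbered list.\n# One\n# Two\n# Three"
)
print(render_lists(source))
```

Run on the source wikitext above, this prints the desired output. Passing a different numbered_format, for example "(%d)", shows how a localised format would plug in.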

Version: master
Severity: enhancement

Details

Reference
bz57850

Event Timeline

bzimport raised the priority of this task from to Needs Triage. Nov 22 2014, 2:37 AM
bzimport added a project: TextExtracts.
bzimport set Reference to bz57850.
bzimport added a subscriber: Unknown Object (MLST).

bingle-admin wrote:

Prioritization and scheduling of this bug is tracked on Mingle card https://wikimedia.mingle.thoughtworks.com/projects/mobile/cards/1472

Can we have a sample API request and sample text to make this task easier to follow?

Source page wikitext:

This is a bullet list.
* One
* Two

This is a numbered list.
# One
# Two
# Three

Current output for plaintext extracts:

This is a bullet list.
One
Two

This is a numbered list.
One
Two
Three

Desired output:

This is a bullet list.
* One
* Two

This is a numbered list.
1. One
2. Two
3. Three
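
As for the sample API request asked for above, a plain-text extract is normally requested with prop=extracts and explaintext on the MediaWiki action API. A minimal sketch that only builds the request URL follows; the helper name build_extract_url, the example.org endpoint, and the page title "List test" are placeholders, not values from this task.

```python
from urllib.parse import urlencode


def build_extract_url(api_endpoint: str, title: str) -> str:
    """Build a TextExtracts query URL for a plain-text extract."""
    params = {
        "action": "query",
        "prop": "extracts",   # TextExtracts query module
        "explaintext": 1,     # plain text instead of limited HTML
        "titles": title,
        "format": "json",
    }
    return f"{api_endpoint}?{urlencode(params)}"


# Hypothetical wiki endpoint and page title, for illustration only.
print(build_extract_url("https://example.org/w/api.php", "List test"))
```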
Jdlrobson set Security to None.

Thanks MaxSem. Are there any applications that surface this? For things like Hovercards I'm not sure lists should even show up in extracts.

Hovercards want so much different stuff that it feels like they should get their own parser. However, TE is a generic text extraction extension heavily used by third parties, so it should extract everything in a way that makes more sense for a general audience.

Agreed. Given all the bugs I'm seeing I'm not sure if TextExtracts is a good fit for Hovercards...

^ This bug doesn't strictly impact plain text previews as, like @Jdlrobson said above, lists shouldn't show up.

Jdlrobson renamed this task from Extracts should handle lists properly to Plaintext extracts could append '*' and '#' for lists. Jul 13 2017, 6:58 PM
Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:13 AM
Aklapper removed subscribers: Tfinc, wikibugs-l-list.