Images: parse caption separately all the way to DOM and use DOMFragment encapsulation
Closed, Resolved (Public)

Assigned To

Authored By

	Inez
	May 30 2013, 12:40 AM

Description

The image tag produced from this wikitext: [[File:Wiki.png|caption<b>123</b>]] should have an alt attribute set to the value "caption123".

It would be great for the VE team if we could get the actual HTML DOM of the caption (instead of text with the HTML tags stripped out), because then we could provide a nice experience when converting from an inline image to a block image and the other way round.

Version: unspecified
Severity: normal

Details

Reference: bz48958

Related Objects

Status	Assigned	Task
Resolved	cscott	T56844 Image / media handling (tracking)
Open	None	T68808 Making an image Frameless with out any wrap option deletes the content of the caption
Resolved	cscott	T50958 Images: parse caption separately all the way to DOM and use DOMFragment encapsulation

Event Timeline

• bzimport raised the priority of this task from to Medium. Nov 22 2014, 1:41 AM

• bzimport added a project: Parsoid.

• bzimport set Reference to bz48958.

Inez created this task. May 30 2013, 12:40 AM

Our plan so far has been to only set the alt if it was explicitly specified using the alt=foo option. The caption HTML DOM will be in the data-mw.caption member. This lets VE set the alt and caption separately, while client-side or server-side postprocessing can set the default alt to the textValue of the caption when rendering for viewing.
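As a rough illustration of the defaulting rule described above, here is a minimal Python sketch (not Parsoid code). The function names `caption_text` and `effective_alt` are made up for this example, and the caption is assumed to be available as an HTML string. An explicit alt always wins; otherwise the text content of the caption is used.

```python
from html.parser import HTMLParser
from typing import Optional


class _TextCollector(HTMLParser):
    """Collects only the character data of an HTML fragment, ignoring tags."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def caption_text(caption_html: str) -> str:
    """Return the text content of a caption, with all markup dropped."""
    collector = _TextCollector()
    collector.feed(caption_html)
    collector.close()  # flush any buffered trailing text
    return "".join(collector.parts)


def effective_alt(explicit_alt: Optional[str],
                  caption_html: Optional[str]) -> Optional[str]:
    """Hypothetical postprocessing step: pick the alt used for viewing.

    An alt given via the alt=foo option is kept as-is (even if empty);
    otherwise the alt defaults to the text of the caption, if there is one.
    """
    if explicit_alt is not None:
        return explicit_alt
    if caption_html:
        return caption_text(caption_html)
    return None
```

With the caption from the description, `effective_alt(None, "caption<b>123</b>")` gives `"caption123"`, while `effective_alt("logo", "caption<b>123</b>")` gives `"logo"`. Keeping the two inputs separate is what lets an editor change one without touching the other.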

In the current implementation, parsing the caption to DOM is completely missing for inline images, and is only implemented implicitly for block-level images (by returning tokens for the figcaption content inline).

We should use a full pipeline to parse captions all the way to DOM, and then use the internal DOMFragment mechanism to preserve these fragments through token transformations. They are then unpacked at the end of DOMPostProcessor processing. As a side effect this will also properly enforce nesting of captions without the hackish closeUnclosedBlockTags helper.
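To make the encapsulation idea concrete, here is a toy Python model of it. This is only a sketch of the general pattern, not Parsoid's DOMFragment implementation: `FragmentStore`, `FragmentToken`, and `drop_bold_tokens` are hypothetical names, tokens are plain strings, and the already-parsed caption is stored as an HTML string rather than a real DOM.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class FragmentToken:
    """Opaque placeholder that stands in for an already-parsed fragment."""
    fragment_id: str


Token = Union[str, FragmentToken]


class FragmentStore:
    """Holds parsed fragments while their placeholders travel the token stream."""

    def __init__(self) -> None:
        self._fragments: Dict[str, str] = {}
        self._counter = itertools.count()

    def encapsulate(self, parsed_html: str) -> FragmentToken:
        """Store a parsed fragment and return the placeholder token for it."""
        fragment_id = f"fragment-{next(self._counter)}"
        self._fragments[fragment_id] = parsed_html
        return FragmentToken(fragment_id)

    def unpack(self, tokens: List[Token]) -> str:
        """Final step: swap every placeholder for the fragment it stands for."""
        parts = []
        for token in tokens:
            if isinstance(token, FragmentToken):
                parts.append(self._fragments[token.fragment_id])
            else:
                parts.append(token)
        return "".join(parts)


def drop_bold_tokens(tokens: List[Token]) -> List[Token]:
    """Stand-in token transformation: removes bold tags from the outer stream."""
    return [t for t in tokens if t not in ("<b>", "</b>")]


store = FragmentStore()
caption = store.encapsulate("caption<b>123</b>")
stream = ["<b>", "see ", caption, "</b>"]
print(store.unpack(drop_bold_tokens(stream)))  # prints: see caption<b>123</b>
```

The transformation strips the bold tags of the surrounding stream, but the caption's own `<b>` is untouched because the transformation only ever sees the placeholder. Had the caption tokens been emitted inline, they would have been rewritten along with everything else.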

Sounds reasonable to me. Now that we have implemented the properly-nested-DOM requirement for some wikitext constructs, we should be able to use it for image captions as well.

[Parsoid component reorg by merging JS/General and General. See bug 50685 for more information. Filter bugmail on this comment. parsoidreorg20130704]

On a related note, we should *not* set an alt attribute for read-only viewing that just contains an image's file name. The alt attribute should only contain proper alternate descriptions of the image that would be useful for users with screen readers.

I think this is a dup of bug 52567, and should probably be resolved as such.

Bug 52567 has been marked as a duplicate of this bug.

Some notes re accessibility from a recent meeting with Gerardo Capiel:

Long-term we should try to store the alt text or long description along with the image itself, and use that to populate the alt attribute if none was provided explicitly. This requires significant work in core to store metadata along with image pages.

For screenreaders it would be good to also provide a longdesc attribute linking to the textual description on the image page.

Open a separate bug for the longdesc issue?

Focusing this bug on the "parse caption to DOM" issue -- test case is:

[[File:Wiki.png|caption<b>123]]

which should really have the </b> in the data-mw attribute.
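To show what "having the </b>" means for this test case, here is a small Python sketch that serializes a caption with every unclosed tag closed at the end. It is a deliberately naive stand-in for a real HTML tree builder, which does considerably more. `balance_caption` and `VOID_ELEMENTS` are names invented for this example, and the void-element list is intentionally incomplete.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag (abbreviated list for this sketch).
VOID_ELEMENTS = {"br", "hr", "img", "wbr"}


class _Balancer(HTMLParser):
    """Re-serializes a fragment while tracking which tags are still open."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())
        if tag not in VOID_ELEMENTS:
            self.open_tags.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: emit as-is, nothing to track.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray closing tag: drop it
        # Close everything opened since the matching start tag.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def balance_caption(caption_html: str) -> str:
    """Return the caption with all still-open tags closed, innermost first."""
    balancer = _Balancer()
    balancer.feed(caption_html)
    balancer.close()
    while balancer.open_tags:
        balancer.out.append(f"</{balancer.open_tags.pop()}>")
    return "".join(balancer.out)
```

For the test case above, `balance_caption("caption<b>123")` returns `"caption<b>123</b>"`, which is the shape the stored caption should have once it is parsed all the way to DOM.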

https://bugzilla.wikimedia.org/show_bug.cgi?id=61566 is the longdesc bug

This appears to have regressed. The data-mw.caption value is wikitext again. It should be Parsoid DOM.

Change 145029 had a related patch set uploaded by Cscott:
WIP: Parse caption to DOM using recursive wikitext parse.

https://gerrit.wikimedia.org/r/145029

ssastry moved this task from Needs Triage to Needs Discussion on the Parsoid board. Dec 9 2014, 5:16 PM

ssastry moved this task from Needs Discussion to In Progress on the Parsoid board. Feb 2 2015, 3:22 PM

cscott mentioned this in T75575: Images: Supply the (undisplayed) captions of inline images in the DOM, not just as wikitext. Apr 2 2015, 5:13 PM

ssastry merged a task: T75575: Images: Supply the (undisplayed) captions of inline images in the DOM, not just as wikitext. May 29 2015, 9:22 PM

ssastry added subscribers: Jdforrester-WMF, • Elitre, Arlolra.

ssastry mentioned this in T93580: Reference inside file caption isn't detected. May 29 2015, 9:24 PM

Ricordisamoa subscribed. Jul 24 2015, 5:37 AM

Restricted Application added a subscriber: Aklapper. Jul 24 2015, 5:37 AM

cscott added a parent task: T68808: Making an image Frameless with out any wrap option deletes the content of the caption. Jul 30 2015, 8:11 PM

cscott mentioned this in T68808: Making an image Frameless with out any wrap option deletes the content of the caption.

cscott mentioned this in rGPAR66242606a642: Parse non-block image caption all the way to Parsoid DOM. Aug 4 2015, 10:45 PM

ssastry closed this task as Resolved. Aug 10 2015, 4:48 PM

ssastry removed a project: Patch-For-Review.

ssastry set Security to None.

Arlolra mentioned this in T108380: Images used in flow without the |thumb parameter don't show captions as tooltips. Mar 30 2017, 6:54 PM

Arlolra mentioned this in T162360: Differences in Parsoid HTML compared to PHP parser output breaks javascript that is tailored ot the PHP parser output. Apr 13 2017, 12:40 PM

Kelson subscribed. Apr 13 2017, 12:58 PM

Arlolra mentioned this in T297443: Add title/alt attributes from caption to media where caption isn't visible. Dec 9 2021, 10:49 PM

Ciencia_Al_Poder mentioned this in T63566: Images: Record alt/longdesc attributes along with the image. Jan 17 2022, 5:16 PM
