[DO NOT USE] HTML Tidy issues (tracking) [superseded by the #Tidy tag]
Closed, Invalid
Public

Description

Tidy won't just leave line endings alone; if you disable wrapping it just stacks
lines as long as they can go.

That's REALLY annoying, and it breaks playing with the white-space setting.


Version: unspecified
Severity: normal
See Also:
T68516: HTML Tags are stripped from extension output

Details

Reference
bz2542

Event Timeline

bzimport raised the priority of this task from to Low. Nov 21 2014, 8:36 PM
bzimport set Reference to bz2542.
bzimport added a subscriber: Unknown Object (MLST).

plugwash wrote:

I can't help but think that using Tidy is a dirty hack, and really MediaWiki
should be able to generate valid XHTML without it.

Yes, it is a dirty hack.

MediaWiki _mostly_ produces valid XHTML output by itself, but its fixing of badly nested
tags etc. is fairly sucky, and we're letting Tidy do it for now until that gets rewritten to
work fully correctly.

Product: From extensions to core (enabled through $wg*Tidy*).

happy.melon.wiki wrote:

Stealing this for a Tidy tracking bug... there are certainly plenty of issues to track!! :D

Danny_B renamed this task from HTML Tidy issues (tracking) to [DO NOT USE] HTML Tidy issues (tracking) [superseded by the #Tidy tag]. May 28 2016, 10:42 AM
Danny_B closed this task as Invalid.
Danny_B lowered the priority of this task from Low to Lowest.
Danny_B edited projects, added Tidy; removed MediaWiki-Parser.
Danny_B removed a subscriber: wikibugs-l-list.
Danny_B removed subtasks:
T87414: Graph placeholder is stripped from page by Tidy
T122787: Incorrect end tags for <source> and <track>
T33871: Fix usage of tidy to work cleanly with html5
T59196: Unclosed strikethrough tag makes strikethrough formatting continue throughout page
T89331: Replace HTML4 Tidy in MW parser with an equivalent HTML5 based tool
T69007: <table> in fosterable position in another <table> - tidy (html4) vs html5 treebuilding differences
T32597: <section> tag name collides with HTML5 <section> tag
T85801: Blank HTML body: "Tidy found serious XHTML errors"
T76100: Thumbnails in table caption get hoisted outside the table
T74833: MWTidy: Old versions of Tidy break flow content out of <caption> and <table>
T67747: Sanitizer::removeHTMLtags doesn't close tags correctly when $wgUseTidy is enabled
T73962: When Tidy is enabled, wikitext like [[Link|<div>Text</div>]] will not actually produce a clickable link
T68516: HTML Tags are stripped from extension output
T56617: Replace Tidy with a library that doesn't suck
T56616: <center> embedded into headings results in two heading tags
T49673: Empty list items goes missing in the PHP parser when tidy is enabled
T50079: Tidy: Numbered list markup with <pre> inbetween creates empty first bullet after <pre> section
T41525: Update tidy to a version that knows the latest HTML5 tricks
T41339: Space between words not shown in an article
T31252: Consecutive DIV tags insert unwanted newline
T40800: Tidy should not create (additional) white space between elements
T30544: HTML end tag displacement in list items with external links.
T25467: sig preview not accurate, disregards html-tidy effects
T32052: Imbricated tags and characters put right next
T22050: Reference with numbered list triggers Tidy to corrupt reference list
T28516: &nbsp; causes HTML tags to close unexpectedly (when using tidy)
T29889: Tidy fails silently if missing, documentation out of date
T28212: Nested tags without space before them don't work
T29786: let empty metadata-elements pass through tidy
T22829: Unordered list after <center>ed heading contains additional <li>s
T27939: <references/> in ordered list or unordered list closes surrounding element
T22714: HTML tidy doesn't work for 404 pages like it does for existing pages (doesn't add missing closing tags anymore e.g.)
T20994: When template evaluated, extraneous line break introduced
T14221: Without tidy: wrongly closed table cell messes up display completely
T20851: HTMLtidy moves whitespace around
T10948: Parser: HTML table syntax (e.g. <td>) should not be parsed when inside <pre>
T11996: Multiline tags in lists should be output more intelligently
T5276: Give image <gallery>s fluid width
T8265: Sub tag not closed if it contains an unmatched ''
T11737: Tidy sometimes goes ballistic when a block-level element is wrapped in an inline element
T11661: Tidy shouldn't reopen inline tags after the end of a block
T11538: Unclosed tags go across image thumbs
T15073: ref might add a newline
T7859: Rendering problem with empty list item and next section
T3955: There is a newline within <a> tag attribute values.
May 28 2016, 10:50 AM
Danny_B moved this task from Tag to Transition completed / Archived on the Tracking-Neverending board.

I hope I'm not in the wrong room or asking a question that nobody knows the answer to... DOES ANYBODY KNOW HOW TO CLEAN THIS unTIDY header tag out of my HTML? I would prefer a CLI script to do it recursively, since I accidentally trusted the stupid untidy program under the assumption that it would simply double-check my work. Instead it crapped on it and then had the audacity to inject a link in the header back to its site, with a snippet of strange code I'd rather not have.

There are over 5 T H O U S A N D pages I need to remove this bad unethical injection from (in fact it's way more if you are counting the online bible I host).

This is a shame... Back up, I tell my customers that all the time: back up before you mess with corporate crap... uggghhh... dammit man.

I am afraid this is the wrong room. I have no idea what this "untidy" program that you mention is. Try asking at https://www.mediawiki.org/wiki/Support for help with maintaining MediaWiki wikis.
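
For the recursive cleanup asked about above, here is a minimal sketch in Python. It rests on an assumption the thread does not confirm: that the injected element is a `<meta name="generator" ...>` tag whose content mentions "tidy". The names `strip_generator_meta` and `clean_tree` are hypothetical helpers made up for this illustration, and the pattern would need adjusting to whatever was actually inserted into the files. It defaults to a dry run, so nothing is written until the listed files have been checked (and backed up).

```python
import re
import sys
from pathlib import Path

# ASSUMPTION: the unwanted element is a <meta name="generator" ...> tag whose
# attributes mention "tidy". Adjust this pattern to match your actual files.
GENERATOR_META = re.compile(
    r'[ \t]*<meta\b(?=[^>]*\bname\s*=\s*["\']?generator\b)(?=[^>]*tidy)[^>]*>'
    r'[ \t]*\r?\n?',
    re.IGNORECASE,
)


def strip_generator_meta(html: str) -> str:
    """Return html with every matching generator <meta> tag removed."""
    return GENERATOR_META.sub("", html)


def clean_tree(root: str, dry_run: bool = True) -> list:
    """Walk root recursively and clean every *.html file that needs it.

    Returns the list of paths that contain (or contained) the tag.
    With dry_run=True the files are only reported, never modified.
    """
    changed = []
    for path in sorted(Path(root).rglob("*.html")):
        # Work on bytes so line endings and odd encodings survive untouched.
        original = path.read_bytes().decode("utf-8", errors="surrogateescape")
        cleaned = strip_generator_meta(original)
        if cleaned != original:
            changed.append(path)
            if not dry_run:
                path.write_bytes(
                    cleaned.encode("utf-8", errors="surrogateescape")
                )
    return changed


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python clean_meta.py SITE_DIR [--write]
    write = "--write" in sys.argv[2:]
    for hit in clean_tree(sys.argv[1], dry_run=not write):
        print(("cleaned: " if write else "would clean: ") + str(hit))
```

Run it once without `--write` to see which files would change, then again with `--write` after taking a backup. Only files ending in `.html` are visited; other extensions would need their own glob pattern.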