
Flow: external links garble wiki text
Closed, Resolved (Public)

Description

I was trying to add some links to the header of Talk:Flow on ee-flow labs instance, and the wiki text kept getting garbled -- letters would appear in the wrong place, links would get lost. You can see this if you click the pencil to edit it (Don't save, we need that text!).

I pasted similar text into a post and that appeared garbled, see http://ee-flow.wmflabs.org/w/index.php?title=User_talk:Spage&workflow=0508e9b63c8f34dffb6efa163e68c4ac. It has the wiki text

On edit this turns into

report bug in [" test wiki for the [[mw:Flow_Portal|Flow pro MediaWiki's Bugzilla]

This is not only garbling text, it's also introducing random bits of text that resemble my other recent edits. Maybe a caching issue?


Version: unspecified
Severity: major
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=54892

Details

Reference
bz56820

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 2:23 AM
bzimport added a project: Parsoid-Web-API.
bzimport set Reference to bz56820.

The WMF core features team tracks this bug on Mingle card https://mingle.corp.wikimedia.org/projects/flow/cards/444, but people from the community are welcome to contribute here and in Gerrit.

Confirmed:

Input: [http://en.wikipedia.org hello]

When attempting to re-edit this same input: [his is a "labs" test wi hello]

Yowza.

Reproducible (see below) using ParsoidUtils, so I think it's safe to say that this is a problem either with Parsoid or the way we're calling it.

At first blush, it looks like Parsoid is retrieving old revisions and mixing their text into the wikitext it produces when it's given HTML and asked to convert it back.

Minimal test case on my local box:

$wikitext = '[http://en.wikipedia.org/ An external link]';

print $html = Flow\ParsoidUtils::convert( 'wikitext', 'html', $wikitext )

<p data-parsoid='{"dsr":[0,57,0,0]}'><a rel="mw:ExtLink" href="http://en.wikipedia.org/" data-parsoid="{&quot;targetOff&quot;:26,&quot;contentOffsets&quot;:[26,42],&quot;a&quot;:{&quot;href&quot;:&quot;http://en.wikipedia.org/&quot;},&quot;sa&quot;:{&quot;href&quot;:&quot;his is Andrew's test wik&quot;},&quot;dsr&quot;:[0,57,26,1]}">An external link</a></p>

print Flow\ParsoidUtils::convert( 'html', 'wikitext', $html )

[his is Andrew's test wik An external link]
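For anyone wanting to check their own output for the same symptom, here is a minimal sketch (not part of Flow or Parsoid) that pulls the data-parsoid attribute out of returned HTML and compares the "a" and "sa" href entries. It assumes, from the output above, that "a" holds the parsed attribute value and "sa" holds the source text it was read from; the class and function names are made up for illustration. A mismatch is not always a bug, but one where "sa" is not a URL at all, as here, is a strong hint.

```python
import json
from html.parser import HTMLParser


class DataParsoidCollector(HTMLParser):
    """Collects the decoded data-parsoid JSON of every start tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "data-parsoid" in attrs:
            # HTMLParser has already unescaped &quot; in the attribute value.
            self.records.append((tag, json.loads(attrs["data-parsoid"])))


def mismatched_hrefs(html):
    """Return (a.href, sa.href) pairs where the two values differ."""
    collector = DataParsoidCollector()
    collector.feed(html)
    pairs = []
    for _tag, dp in collector.records:
        a = dp.get("a", {}).get("href")
        sa = dp.get("sa", {}).get("href")
        if a is not None and sa is not None and a != sa:
            pairs.append((a, sa))
    return pairs


# The HTML printed by the local test case above.
html = """<p data-parsoid='{"dsr":[0,57,0,0]}'><a rel="mw:ExtLink" href="http://en.wikipedia.org/" data-parsoid="{&quot;targetOff&quot;:26,&quot;contentOffsets&quot;:[26,42],&quot;a&quot;:{&quot;href&quot;:&quot;http://en.wikipedia.org/&quot;},&quot;sa&quot;:{&quot;href&quot;:&quot;his is Andrew's test wik&quot;},&quot;dsr&quot;:[0,57,26,1]}">An external link</a></p>"""

for a, sa in mismatched_hrefs(html):
    print("parsed href:", a)
    print("source href:", sa)
```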

And on ee-flow:

$wikitext = '[http://en.wikipedia.org/ An external link]';

print Flow\ParsoidUtils::convert( 'wikitext', 'html', $wikitext )

<p data-parsoid='{"dsr":[0,318,0,0]}'><a rel="mw:ExtLink" href="http://en.wikipedia.org/" data-parsoid='{"targetOff":26,"contentOffsets":[26,42],"a":{"href":"http://en.wikipedia.org/"},"sa":{"href":"his is a \"labs\" test wik"},"dsr":[0,318,26,1]}'>An external link</a></p>

print $html = Flow\ParsoidUtils::convert( 'wikitext', 'html', $wikitext )

<p data-parsoid='{"dsr":[0,318,0,0]}'><a rel="mw:ExtLink" href="http://en.wikipedia.org/" data-parsoid='{"targetOff":26,"contentOffsets":[26,42],"a":{"href":"http://en.wikipedia.org/"},"sa":{"href":"his is a \"labs\" test wik"},"dsr":[0,318,26,1]}'>An external link</a></p>

print Flow\ParsoidUtils::convert( 'html', 'wikitext', $html )

[his is a "labs" test wik An external link]

Here's the debug log from that, if it helps:

starting parsing of localhost:Main_Page
{

"0": "AsyncTokenTransformManager.setFrame",
"1": null,
"2": []

}
T:html: {"type":"TagTk","name":"body","attribs":[],"dataAttribs":{}}
inserting shadow meta for body
{

"0": "Retrieved Main_Page",
"1": {
  "title": "Main Page",
  "id": 1,
  "ns": 0,
  "revision": {
    "revid": 130,
    "parentid": 107,
    "user": "Andrew",
    "userid": 1,
    "timestamp": "2013-08-06T18:31:26Z",
    "size": 57,
    "sha1": "4b1e4754b61a7d522b4991a4b6f50b64ec0f5ff6",
    "contentmodel": "wikitext",
    "comment": "Added content",
    "contentformat": "text/x-wiki",
    "*": "This is Andrew's test wiki, used for going with the Flow."
  }
}

}
T:html: {"type":"TagTk","name":"body","attribs":[],"dataAttribs":{}}
inserting shadow meta for body
TOKS: [{"type":"SelfclosingTagTk","name":"extlink","attribs":[{"k":"href","v":"http://en.wikipedia.org/"},{"k":"mw:content","v":["An external link"]},{"k":"spaces","v":" "}],"dataAttribs":{"targetOff":26,"tsr":[0,43],"contentOffsets":[26,42]}}]
{

"0": "SyncTokenTransformManager.onChunk, input: ",
"1": [
  {
    "type": "SelfclosingTagTk",
    "name": "extlink",
    "attribs": [
      {
        "k": "href",
        "v": "http://en.wikipedia.org/"
      },
      {
        "k": "mw:content",
        "v": [
          "An external link"
        ]
      },
      {
        "k": "spaces",
        "v": " "
      }
    ],
    "dataAttribs": {
      "targetOff": 26,
      "tsr": [
        0,
        43
      ],
      "contentOffsets": [
        26,
        42
      ]
    }
  }
]

}
{

"0": "SyncTokenTransformManager.onChunk: emitting ",
"1": [
  {
    "type": "SelfclosingTagTk",
    "name": "extlink",
    "attribs": [
      {
        "k": "href",
        "v": "http://en.wikipedia.org/"
      },
      {
        "k": "mw:content",
        "v": [
          "An external link"
        ]
      },
      {
        "k": "spaces",
        "v": " "
      }
    ],
    "dataAttribs": {
      "targetOff": 26,
      "tsr": [
        0,
        43
      ],
      "contentOffsets": [
        26,
        42
      ]
    }
  }
]

}
A2-61: {"type":"SelfclosingTagTk","name":"extlink","attribs":[{"k":"href","v":"http://en.wikipedia.org/"},{"k":"mw:content","v":["An external link"]},{"k":"spaces","v":" "}],"dataAttribs":{"targetOff":26,"tsr":[0,43],"contentOffsets":[26,42]}}
{

"0": "maybeSyncReturn transforming",
"1": "c-172",
"2": {
  "async": true,
  "tokens": []
}

}
{

"0": "AttributeExpander._returnAttributes: ",
"1": [
  {
    "k": "href",
    "v": "http://en.wikipedia.org/"
  },
  {
    "k": "mw:content",
    "v": [
      "An external link"
    ]
  },
  {
    "k": "spaces",
    "v": " "
  }
]

}
{

"0": "maybeSyncReturn transforming",
"1": "c-172",
"2": {
  "tokens": [
    {
      "type": "SelfclosingTagTk",
      "name": "extlink",
      "attribs": [
        {
          "k": "href",
          "v": "http://en.wikipedia.org/"
        },
        {
          "k": "mw:content",
          "v": [
            "An external link"
          ]
        },
        {
          "k": "spaces",
          "v": " "
        }
      ],
      "dataAttribs": {
        "targetOff": 26,
        "tsr": [
          0,
          43
        ],
        "contentOffsets": [
          26,
          42
        ]
      }
    }
  ]
}

}
{

"0": "workStack",
"1": "c-172",
"2": 1.12,
"3": [
  [],
  [
    {
      "type": "SelfclosingTagTk",
      "name": "extlink",
      "attribs": [
        {
          "k": "href",
          "v": "http://en.wikipedia.org/"
        },
        {
          "k": "mw:content",
          "v": [
            "An external link"
          ]
        },
        {
          "k": "spaces",
          "v": " "
        }
      ],
      "dataAttribs": {
        "targetOff": 26,
        "tsr": [
          0,
          43
        ],
        "contentOffsets": [
          26,
          42
        ]
      }
    }
  ]
]

}
A2-61: {"type":"SelfclosingTagTk","name":"extlink","attribs":[{"k":"href","v":"http://en.wikipedia.org/"},{"k":"mw:content","v":["An external link"]},{"k":"spaces","v":" "}],"dataAttribs":{"targetOff":26,"tsr":[0,43],"contentOffsets":[26,42]}}
{

"0": "maybeSyncReturn transforming",
"1": "c-172",
"2": {
  "tokens": [
    {
      "type": "TagTk",
      "name": "a",
      "attribs": [
        {
          "k": "rel",
          "v": "mw:ExtLink"
        },
        {
          "k": "href",
          "v": "http://en.wikipedia.org/"
        }
      ],
      "dataAttribs": {
        "targetOff": 26,
        "tsr": [
          0,
          43
        ],
        "contentOffsets": [
          26,
          42
        ],
        "a": {
          "href": "http://en.wikipedia.org/"
        },
        "sa": {
          "href": "his is Andrew's test wik"
        }
      }
    },
    {
      "type": "SelfclosingTagTk",
      "name": "mw:dom-fragment-token",
      "attribs": [
        {
          "k": "contextTok",
          "v": {
            "type": "SelfclosingTagTk",
            "name": "extlink",
            "attribs": [
              {
                "k": "href",
                "v": "http://en.wikipedia.org/"
              },
              {
                "k": "mw:content",
                "v": [
                  "An external link"
                ]
              },
              {
                "k": "spaces",
                "v": " "
              }
            ],
            "dataAttribs": {
              "targetOff": 26,
              "tsr": [
                0,
                43
              ],
              "contentOffsets": [
                26,
                42
              ]
            }
          }
        },
        {
          "k": "content",
          "v": [
            "An external link"
          ]
        },
        {
          "k": "noPre",
          "v": true
        },
        {
          "k": "srcOffsets",
          "v": [
            26,
            42
          ]
        }
      ],
      "dataAttribs": {}
    },
    {
      "type": "EndTagTk",
      "name": "a",
      "attribs": [],
      "dataAttribs": {}
    }
  ]
}

}
{

"0": "workStack",
"1": "c-172",
"2": 1.149,
"3": [
  [],
  [],
  [
    {
      "type": "TagTk",
      "name": "a",
      "attribs": [
        {
          "k": "rel",
          "v": "mw:ExtLink"
        },
        {
          "k": "href",
          "v": "http://en.wikipedia.org/"
        }
      ],
      "dataAttribs": {
        "targetOff": 26,
        "tsr": [
          0,
          43
        ],
        "contentOffsets": [
          26,
          42
        ],
        "a": {
          "href": "http://en.wikipedia.org/"
        },
        "sa": {
          "href": "his is Andrew's test wik"
        }
      }
    },
    {
      "type": "SelfclosingTagTk",
      "name": "mw:dom-fragment-token",
      "attribs": [
        {
          "k": "contextTok",
          "v": {
            "type": "SelfclosingTagTk",
            "name": "extlink",
            "attribs": [
              {
                "k": "href",
                "v": "http://en.wikipedia.org/"
              },
              {
                "k": "mw:content",
                "v": [
                  "An external link"
                ]
              },
              {
                "k": "spaces",
                "v": " "
              }
            ],
            "dataAttribs": {
              "targetOff": 26,
              "tsr": [
                0,
                43
              ],
              "contentOffsets": [
                26,
                42
              ]
            }
          }
        },
        {
          "k": "content",
          "v": [
            "An external link"
          ]
        },
        {
          "k": "noPre",
          "v": true
        },
        {
          "k": "srcOffsets",
          "v": [
            26,
            42
          ]
        }
      ],
      "dataAttribs": {}
    },
    {
      "type": "EndTagTk",
      "name": "a",
      "attribs": [],
      "dataAttribs": {}
    }
  ]
]

}
A2-61: {"type":"TagTk","name":"a","attribs":[{"k":"rel","v":"mw:ExtLink"},{"k":"href","v":"http://en.wikipedia.org/"}],"dataAttribs":{"targetOff":26,"tsr":[0,43],"contentOffsets":[26,42],"a":{"href":"http://en.wikipedia.org/"},"sa":{"href":"his is Andrew's test wik"}}}
A2-61: {"type":"SelfclosingTagTk","name":"mw:dom-fragment-token","attribs":[{"k":"contextTok","v":{"type":"SelfclosingTagTk","name":"extlink","attribs":[{"k":"href","v":"http://en.wikipedia.org/"},{"k":"mw:content","v":["An external link"]},{"k":"spaces","v":" "}],"dataAttribs":{"targetOff":26,"tsr":[0,43],"contentOffsets":[26,42]}}},{"k":"content","v":["An external link"]},{"k":"noPre","v":true},{"k":"srcOffsets","v":[26,42]}],"dataAttribs":{}}
{

"0": "maybeSyncReturn transforming",
"1": "c-172",
"2": {
  "tokens": [
    "An external link"
  ],
  "async": false
}

}
A2-61: {"type":"EndTagTk","name":"a","attribs":[],"dataAttribs":{}}
{

"0": "firstAccum",
"1": "sync",
"2": "c-172",
"3": [
  {
    "type": "TagTk",
    "name": "a",
    "attribs": [
      {
        "k": "rel",
        "v": "mw:ExtLink"
      },
      {
        "k": "href",
        "v": "http://en.wikipedia.org/"
      }
    ],
    "dataAttribs": {
      "targetOff": 26,
      "tsr": [
        0,
        43
      ],
      "contentOffsets": [
        26,
        42
      ],
      "a": {
        "href": "http://en.wikipedia.org/"
      },
      "sa": {
        "href": "his is Andrew's test wik"
      }
    }
  },
  "An external link",
  {
    "type": "EndTagTk",
    "name": "a",
    "attribs": [],
    "dataAttribs": {}
  }
]

}
{

"0": "AsyncTokenTransformManager onChunk",
"1": "sync",
"2": [
  {
    "type": "TagTk",
    "name": "a",
    "attribs": [
      {
        "k": "rel",
        "v": "mw:ExtLink"
      },
      {
        "k": "href",
        "v": "http://en.wikipedia.org/"
      }
    ],
    "dataAttribs": {
      "targetOff": 26,
      "tsr": [
        0,
        43
      ],
      "contentOffsets": [
        26,
        42
      ],
      "a": {
        "href": "http://en.wikipedia.org/"
      },
      "sa": {
        "href": "his is Andrew's test wik"
      }
    }
  },
  "An external link",
  {
    "type": "EndTagTk",
    "name": "a",
    "attribs": [],
    "dataAttribs": {}
  }
]

}
emitting
{

"0": "SyncTokenTransformManager.onChunk, input: ",
"1": [
  {
    "type": "TagTk",
    "name": "a",
    "attribs": [
      {
        "k": "rel",
        "v": "mw:ExtLink"
      },
      {
        "k": "href",
        "v": "http://en.wikipedia.org/"
      }
    ],
    "dataAttribs": {
      "targetOff": 26,
      "tsr": [
        0,
        43
      ],
      "contentOffsets": [
        26,
        42
      ],
      "a": {
        "href": "http://en.wikipedia.org/"
      },
      "sa": {
        "href": "his is Andrew's test wik"
      }
    }
  },
  "An external link",
  {
    "type": "EndTagTk",
    "name": "a",
    "attribs": [],
    "dataAttribs": {}
  }
]

}

T:pre:any: sol : {"type":"TagTk","name":"a","attribs":[{"k":"rel","v":"mw:ExtLink"},{"k":"href","v":"http://en.wikipedia.org/"}],"dataAttribs":{"targetOff":26,"tsr":[0,43],"contentOffsets":[26,42],"a":{"href":"http://en.wikipedia.org/"},"sa":{"href":"his is Andrew's test wik"}}}
saved: []
ret : [{"type":"TagTk","name":"a","attribs":[{"k":"rel","v":"mw:ExtLink"},{"k":"href","v":"http://en.wikipedia.org/"}],"dataAttribs":{"targetOff":26,"tsr":[0,43],"contentOffsets":[26,42],"a":{"href":"http://en.wikipedia.org/"},"sa":{"href":"his is Andrew's test wik"}}}]
T:p-wrap:any: {"type":"TagTk","name":"a","attribs":[{"k":"rel","v":"mw:ExtLink"},{"k":"href","v":"http://en.wikipedia.org/"}],"dataAttribs":{"targetOff":26,"tsr":[0,43],"contentOffsets":[26,42],"a":{"href":"http://en.wikipedia.org/"},"sa":{"href":"his is Andrew's test wik"}}}

p-wrap:NL-count: 0
p-wrap:RET: []

T:p-wrap:any: "An external link"

p-wrap:NL-count: 0
p-wrap:RET: []

T:p-wrap:any: {"type":"EndTagTk","name":"a","attribs":[],"dataAttribs":{}}

p-wrap:NL-count: 0
p-wrap:RET: []

{

"0": "SyncTokenTransformManager.onChunk: emitting ",
"1": []

}
TOKS: [{"type":"EOFTk"}]
{

"0": "SyncTokenTransformManager.onChunk, input: ",
"1": [
  {
    "type": "EOFTk"
  }
]

}
{

"0": "SyncTokenTransformManager.onChunk: emitting ",
"1": [
  {
    "type": "EOFTk"
  }
]

}
A2-61: {"type":"EOFTk"}
{

"0": "maybeSyncReturn transforming",
"1": "c-173",
"2": {
  "tokens": [
    {
      "type": "EOFTk"
    }
  ]
}

}
{

"0": "maybeSyncReturn transforming",
"1": "c-173",
"2": {
  "tokens": [
    {
      "type": "EOFTk"
    }
  ]
}

}
{

"0": "firstAccum",
"1": "sync",
"2": "c-173",
"3": [
  {
    "type": "EOFTk"
  }
]

}
{

"0": "AsyncTokenTransformManager onChunk",
"1": "sync",
"2": [
  {
    "type": "EOFTk"
  }
]

}
emitting
{

"0": "SyncTokenTransformManager.onChunk, input: ",
"1": [
  {
    "type": "EOFTk"
  }
]

}
T:list:end {"type":"EOFTk"}
----closing all lists----
list:RET: [{"type":"EOFTk"}]
T:p-wrap:any: {"type":"EOFTk"}
T:p-wrap:NL: {"type":"EOFTk"}

p-wrap:NL-count: 0

{

"0": "SyncTokenTransformManager.onChunk: emitting ",
"1": [
  {
    "type": "TagTk",
    "name": "p",
    "attribs": [],
    "dataAttribs": {}
  },
  {
    "type": "TagTk",
    "name": "a",
    "attribs": [
      {
        "k": "rel",
        "v": "mw:ExtLink"
      },
      {
        "k": "href",
        "v": "http://en.wikipedia.org/"
      }
    ],
    "dataAttribs": {
      "targetOff": 26,
      "tsr": [
        0,
        43
      ],
      "contentOffsets": [
        26,
        42
      ],
      "a": {
        "href": "http://en.wikipedia.org/"
      },
      "sa": {
        "href": "his is Andrew's test wik"
      }
    }
  },
  "An external link",
  {
    "type": "EndTagTk",
    "name": "a",
    "attribs": [],
    "dataAttribs": {}
  },
  {
    "type": "EndTagTk",
    "name": "p",
    "attribs": [],
    "dataAttribs": {}
  },
  {
    "type": "EOFTk"
  }
]

}

  • HTML-61:<chunk> ----

chunk: [

{
  "type": "TagTk",
  "name": "p",
  "attribs": [],
  "dataAttribs": {}
},
{
  "type": "TagTk",
  "name": "a",
  "attribs": [
    {
      "k": "rel",
      "v": "mw:ExtLink"
    },
    {
      "k": "href",
      "v": "http://en.wikipedia.org/"
    }
  ],
  "dataAttribs": {
    "targetOff": 26,
    "tsr": [
      0,
      43
    ],
    "contentOffsets": [
      26,
      42
    ],
    "a": {
      "href": "http://en.wikipedia.org/"
    },
    "sa": {
      "href": "his is Andrew's test wik"
    }
  }
},
"An external link",
{
  "type": "EndTagTk",
  "name": "a",
  "attribs": [],
  "dataAttribs": {}
},
{
  "type": "EndTagTk",
  "name": "p",
  "attribs": [],
  "dataAttribs": {}
},
{
  "type": "EOFTk"
}

]
T:html: {"type":"TagTk","name":"p","attribs":[],"dataAttribs":{"tagId":1}}
inserting shadow meta for p
T:html: {"type":"TagTk","name":"a","attribs":[{"k":"rel","v":"mw:ExtLink"},{"k":"href","v":"http://en.wikipedia.org/"}],"dataAttribs":{"targetOff":26,"tsr":[0,43],"contentOffsets":[26,42],"a":{"href":"http://en.wikipedia.org/"},"sa":{"href":"his is Andrew's test wik"},"tagId":2}}
inserting shadow meta for a
T:html: "An external link"
T:html: {"type":"EndTagTk","name":"a","attribs":[],"dataAttribs":{}}
inserting shadow meta for a
T:html: {"type":"EndTagTk","name":"p","attribs":[],"dataAttribs":{}}
inserting shadow meta for p
T:html: {"type":"EOFTk"}

  • HTML-61:</chunk> ----

{

"0": "AsyncTokenTransformManager.onEndEvent: synchronous done",
"1": null

}

  • DOM: after tree builder ----

<html><head></head><body data-parsoid="{}"><!--{"@type":"mw:shadow","attrs":[{"name":"typeof","value":"mw:StartTag"},{"name":"data-stag","value":"body:undefined"}]}--><p data-parsoid="{&quot;tagId&quot;:1}"><!--{"@type":"mw:shadow","attrs":[{"name":"typeof","value":"mw:StartTag"},{"name":"data-stag","value":"p:1"}]}--><a rel="mw:ExtLink" href="http://en.wikipedia.org/" data-parsoid="{&quot;targetOff&quot;:26,&quot;tsr&quot;:[0,43],&quot;contentOffsets&quot;:[26,42],&quot;a&quot;:{&quot;href&quot;:&quot;http://en.wikipedia.org/&quot;},&quot;sa&quot;:{&quot;href&quot;:&quot;his is Andrew's test wik&quot;},&quot;tagId&quot;:2}"><!--{"@type":"mw:shadow","attrs":[{"name":"typeof","value":"mw:StartTag"},{"name":"data-stag","value":"a:2:0,43"}]}-->An external link</a><!--{"@type":"mw:shadow","attrs":[{"name":"data-parsoid","value":"{}"},{"name":"typeof","value":"mw:EndTag"},{"name":"data-etag","value":"a"}]}--></p><!--{"@type":"mw:shadow","attrs":[{"name":"data-parsoid","value":"{}"},{"name":"typeof","value":"mw:EndTag"},{"name":"data-etag","value":"p"}]}--></body></html>

  • DOM: pre-DSR -------

<head data-parsoid="{&quot;tmp&quot;:{}}"></head><body data-parsoid="{&quot;tmp&quot;:{}}"><p data-parsoid="{&quot;tagId&quot;:1,&quot;tmp&quot;:{}}"><a rel="mw:ExtLink" href="http://en.wikipedia.org/" data-parsoid="{&quot;targetOff&quot;:26,&quot;tsr&quot;:[0,43],&quot;contentOffsets&quot;:[26,42],&quot;a&quot;:{&quot;href&quot;:&quot;http://en.wikipedia.org/&quot;},&quot;sa&quot;:{&quot;href&quot;:&quot;his is Andrew's test wik&quot;},&quot;tagId&quot;:2,&quot;tmp&quot;:{}}">An external link</a></p></body>

WARNING: DSR inconsistency: cs/s mismatch for node: A s: 26; cs: 40

------ DOM: post-DSR -------

<head data-parsoid="{&quot;tmp&quot;:{}}"></head><body data-parsoid="{&quot;tmp&quot;:{},&quot;dsr&quot;:[0,57,0,0]}"><p data-parsoid="{&quot;tagId&quot;:1,&quot;tmp&quot;:{},&quot;dsr&quot;:[0,57,0,0]}"><a rel="mw:ExtLink" href="http://en.wikipedia.org/" data-parsoid="{&quot;targetOff&quot;:26,&quot;tsr&quot;:[0,43],&quot;contentOffsets&quot;:[26,42],&quot;a&quot;:{&quot;href&quot;:&quot;http://en.wikipedia.org/&quot;},&quot;sa&quot;:{&quot;href&quot;:&quot;his is Andrew's test wik&quot;},&quot;tagId&quot;:2,&quot;tmp&quot;:{},&quot;dsr&quot;:[0,57,26,1]}">An external link</a></p></body>

----------------------------

------ DOM: pre-encapsulation -------

<head data-parsoid="{&quot;tmp&quot;:{}}"></head><body data-parsoid="{&quot;tmp&quot;:{},&quot;dsr&quot;:[0,57,0,0]}"><p data-parsoid="{&quot;tagId&quot;:1,&quot;tmp&quot;:{},&quot;dsr&quot;:[0,57,0,0]}"><a rel="mw:ExtLink" href="http://en.wikipedia.org/" data-parsoid="{&quot;targetOff&quot;:26,&quot;tsr&quot;:[0,43],&quot;contentOffsets&quot;:[26,42],&quot;a&quot;:{&quot;href&quot;:&quot;http://en.wikipedia.org/&quot;},&quot;sa&quot;:{&quot;href&quot;:&quot;his is Andrew's test wik&quot;},&quot;tagId&quot;:2,&quot;tmp&quot;:{},&quot;dsr&quot;:[0,57,26,1]}">An external link</a></p></body>

----------------------------

------ DOM: post-encapsulation -------

<head data-parsoid="{&quot;tmp&quot;:{}}"></head><body data-parsoid="{&quot;tmp&quot;:{},&quot;dsr&quot;:[0,57,0,0]}"><p data-parsoid="{&quot;tagId&quot;:1,&quot;tmp&quot;:{},&quot;dsr&quot;:[0,57,0,0]}"><a rel="mw:ExtLink" href="http://en.wikipedia.org/" data-parsoid="{&quot;targetOff&quot;:26,&quot;tsr&quot;:[0,43],&quot;contentOffsets&quot;:[26,42],&quot;a&quot;:{&quot;href&quot;:&quot;http://en.wikipedia.org/&quot;},&quot;sa&quot;:{&quot;href&quot;:&quot;his is Andrew's test wik&quot;},&quot;tagId&quot;:2,&quot;tmp&quot;:{},&quot;dsr&quot;:[0,57,26,1]}">An external link</a></p></body>

----------------------------

completed parsing of localhost:Main_Page in 258 ms
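The numbers in that log are worth a second look. The dsr end offset of 57 matches the "size": 57 of the retrieved Main_Page revision, not the length of the wikitext that was passed in (the tsr of [0,43]). The following arithmetic is only my reading of the "s: 26; cs: 40" warning, not Parsoid's actual DSR code: working backwards from an end offset of 57 puts the link text at 40, while working backwards from 43 gives the expected 26.

```python
# Strings copied from the test case and the debug log in this report.
passed_in = "[http://en.wikipedia.org/ An external link]"
page_text = "This is Andrew's test wiki, used for going with the Flow."
content = "An external link"
CLOSE_WIDTH = 1  # width of the closing "]", the last element of dsr [0,57,26,1]


def content_start_from_end(source_length, content_length, close_width):
    """Work backwards from the end of the source to where the link text would start.

    Illustrative assumption only; not taken from Parsoid's source.
    """
    return source_length - close_width - content_length


print(len(passed_in))  # 43, matches tsr [0,43]
print(len(page_text))  # 57, matches "size": 57 and the dsr end offset
print(content_start_from_end(len(page_text), len(content), CLOSE_WIDTH))  # 40
print(content_start_from_end(len(passed_in), len(content), CLOSE_WIDTH))  # 26
```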

I suspect that something in the way you process the HTML modifies the data-parsoid value. Your HTML definitely differs from what I get when converting '[http://en.wikipedia.org/ An external link]' at http://parsoid-lb.eqiad.wikimedia.org/_wikitext/.

This is reproducible against the eqiad load balancer, but only when a page title is also passed in the URL.

curl -d 'wt=[http://en.wikipedia.org/ external link]' http://parsoid-lb.eqiad.wikimedia.org/enwiki/Main_Page | tee /tmp/parsoid-html

% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                               Dload  Upload   Total   Spent    Left  Speed

100 1252 100 1209 100 43 1957 69 --:--:-- --:--:-- --:--:-- 1969
<!DOCTYPE html>
<html prefix="dc: http://purl.org/dc/terms/ mw: http://mediawiki.org/rdf/" about="http://en.wikipedia.org/wiki/Special:Redirect/revision/574690625"><head prefix="mwr: http://en.wikipedia.org/wiki/Special:Redirect/"><meta property="mw:articleNamespace" content="0"/><link rel="dc:replaces" resource="mwr:revision/574690493"/><meta property="dc:modified" content="2013-09-27T03:10:17.000Z"/><meta about="mwr:user/153365" property="dc:title" content="Tariqabjotu"/><link rel="dc:contributor" resource="mwr:user/153365"/><meta property="mw:revisionSHA1" content="e0564e710b93f998658bd5527f0042eaba6d6c87"/><meta property="dc:description" content="removing unnecessary pipe"/><meta property="mw:parsoidVersion" content="0"/><link rel="dc:isVersionOf" href="en.wikipedia.org/wiki/Main_Page"/><title>Main Page</title><base href="en.wikipedia.org/wiki/Main_Page"/></head><body data-parsoid='{"dsr":[0,6391,0,0]}'><p data-parsoid='{"dsr":[0,6391,0,0]}'><a rel="mw:ExtLink" href="http://en.wikipedia.org/" data-parsoid='{"targetOff":26,"contentOffsets":[26,39],"a":{"href":"http://en.wikipedia.org/"},"sa":{"href":"!-- BANNER ACROSS"},"dsr":[0,6391,26,1]}'>external link</a></p></body></html>

:) (latinus:guy)-(0)-((no branch))(~/Jobs/mediawiki/vagrant/mediawiki/extensions/Flow)
curl -d "html=$(cat /tmp/parsoid-html)" http://parsoid-lb.eqiad.wikimedia.org/enwiki/Main_Page
[!-- BANNER ACROSS external link]

I personally confirmed that this is an issue on ee-flow.wmflabs.org (twice, cf. comment 2). I don't know if this is an issue in Parsoid or in Flow specifically, but it's definitely confirmed as an issue. :-)

Aha! I think I know what is going on. Instead of the passed-in wikitext, the page's wikitext is used for the href source and dsr calculations.
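A small sketch of that explanation, using the strings from the local test case. The slice bounds are an assumption inferred from targetOff being 26 (the "[" at offset 0, one separating space before the link text); href_source is a made-up helper, not a Parsoid function. Applying the same offsets to the page text instead of the passed-in wikitext reproduces the garbled href exactly.

```python
def href_source(src, target_off):
    """Slice the link target out of a bracketed external link.

    Assumes '[' at offset 0 and a single space just before target_off.
    """
    return src[1:target_off - 1]


passed_in = "[http://en.wikipedia.org/ An external link]"
page_text = "This is Andrew's test wiki, used for going with the Flow."
TARGET_OFF = 26  # "targetOff" from data-parsoid

# Offsets applied to the text they were computed from:
print(href_source(passed_in, TARGET_OFF))  # http://en.wikipedia.org/

# The same offsets applied to the page's stored wikitext:
print(href_source(page_text, TARGET_OFF))  # his is Andrew's test wik
```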

Should be an easy fix in the web API.

Change 96667 had a related patch set uploaded by GWicke:
Bug 56820: Don't fetch wikitext source if oldid is missing

https://gerrit.wikimedia.org/r/96667

Change 96667 merged by jenkins-bot:
Bug 56820: Don't fetch wikitext source if oldid is missing

https://gerrit.wikimedia.org/r/96667

Change 101306 had a related patch set uploaded by GWicke:
Bug 56820: Don't fetch wikitext source if oldid is missing

https://gerrit.wikimedia.org/r/101306

Change 101306 merged by GWicke:
Bug 56820: Don't fetch wikitext source if oldid is missing

https://gerrit.wikimedia.org/r/101306
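
For readers who do not want to open the Gerrit changes, the idea in the commit title can be sketched roughly as follows. This is not the actual patch (Parsoid is not written in Python), and source_for_offsets and fetch_revision are hypothetical names; what the real service does when an oldid is supplied is not shown in this thread, so the sketch simply fetches in that case.

```python
from typing import Callable, Optional


def source_for_offsets(
    posted_wikitext: str,
    oldid: Optional[int],
    fetch_revision: Callable[[int], str],
) -> str:
    """Pick the text that source offsets should be applied to.

    Hypothetical sketch of "don't fetch wikitext source if oldid is missing":
    without an oldid, nothing is fetched and the posted wikitext is used.
    """
    if oldid is None:
        return posted_wikitext
    return fetch_revision(oldid)


def fake_fetch(oldid: int) -> str:
    # Stand-in for a page-source lookup; returns the Main_Page text from the log.
    return "This is Andrew's test wiki, used for going with the Flow."


print(source_for_offsets("[http://en.wikipedia.org/ An external link]", None, fake_fetch))
```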