
Special:Import: "XML import parse failure" and wrong number of imported revisions
Closed, Declined
Public

Description

I tried to transwiki [[en:Legion (demon)]] into de.wiki.

1st try: "No pages to import. Import failed: XML import parse failure at line
5270, col 12 (byte 546783; " "): Invalid document end"

2nd try: "No pages to import. Import failed: XML import parse failure at line
6677, col 12 (byte 694275; " "): Invalid document end"

3rd try: Success, but the log shows 27 revisions:
http://de.wikipedia.org/w/index.php?title=Spezial%3ALogbuch&type=import&user=Raymond&page=Legion+%28demon%29&uselang=en

In reality there are more than 100 revisions:
http://de.wikipedia.org/w/index.php?title=Benutzer:Owly_K/Legion_%28D%C3%A4mon%29&action=history


Version: 1.11.x
Severity: normal
Whiteboard: aklapper-moreinfo

Details

Reference
bz9911

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 9:41 PM
bzimport set Reference to bz9911.
bzimport added a subscriber: Unknown Object (MLST).

The first two failures may have been caused by communication errors or some
other transient problem; the details are uncertain.

Both attempts would have imported a limited number of revisions successfully,
but the log would not have been updated.

The third would have imported the remaining entries which were not already
imported, and was thus correctly logged with the 27 remaining entries.
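
A minimal sketch of the behaviour described above, in Python rather than MediaWiki's PHP. It assumes the importer handles each revision as soon as its closing tag is parsed; that assumption, the toy export format, and the names build_export and stream_import are illustrative only, not taken from the MediaWiki code. When the input is cut off partway, the revisions parsed before the cut have already been handled by the time the parser reports the error.

```python
import xml.parsers.expat


def build_export(n_revisions):
    """Build a toy export document (hypothetical format, not the real schema)."""
    revs = "".join(
        f"<revision><id>{i}</id><text>rev {i}</text></revision>\n"
        for i in range(1, n_revisions + 1)
    )
    doc = f"<mediawiki>\n<page>\n<title>Example</title>\n{revs}</page>\n</mediawiki>\n"
    return doc.encode("utf-8")


def stream_import(data):
    """Parse data with expat; return (revisions completed, error or None)."""
    completed = 0

    def end_element(name):
        nonlocal completed
        if name == "revision":
            # A real importer would store the revision at this point.
            completed += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.EndElementHandler = end_element
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError as exc:
        return completed, exc
    return completed, None


if __name__ == "__main__":
    full = build_export(100)
    truncated = full[: len(full) // 2]  # simulate a transfer that stopped early
    done, err = stream_import(truncated)
    print("revisions handled before failure:", done)
    print("parser error:", err)
```

If the real importer works this way, a retry that skips revisions already present would log only the remainder, which would match the 27 revisions in the import log.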

*** Bug 10504 has been marked as a duplicate of this bug. ***

wikt.3.connelm wrote:

As requested on IRC, here is an import that fails the same way *every day* when my script runs:

From en.wiktionary Special:Import, importing [[w:Ignorance]] into the Wiktionary Transwiki: namespace. (Checked: Copy all history versions for this page)

Import pages
From Wiktionary
Jump to: navigation, search

Importing pages...

  • No pages to import.

Import failed: XML import parse failure at line 6894, col 12 (byte 658379; " "): Invalid document end

Next, when futzing with it manually, I un-check "Copy all history versions for this page" and try manually...

...and because Brion asked me to make this friggin' report today, it (for the first time) works. No fair!

When I tried this manually, last, it was the same "cannot open log" thing. Grrr. OK, so re-checking the "copy all revisions" thing, I try it again. And again. WTF? 108 versions imported? How dare it work, when I'm trying to demonstrate that it doesn't work?

wikt.3.connelm wrote:

Fine. Trying another example that has been failing for over a week, every day (two weeks? Who knows.)

From en.wikt:, importing [[w:List of SMS abbreviations]] to en.wikt's Transwiki: ns, all versions.

Import pages
From Wiktionary
Jump to: navigation, search

Importing pages...

  • No pages to import.

Import failed: XML import parse failure at line 43779, col 12 (byte 1400655; " "): Invalid document end

Second try:

Import pages
From Wiktionary
Jump to: navigation, search

Importing pages...

  • No pages to import.

Import failed: XML import parse failure at line 87680, col 8 (byte 2843634; ""): Invalid document end

Third try:

Import pages
From Wiktionary
Jump to: navigation, search

Importing pages...

  • Transwiki:List of SMS abbreviations 29 revisions

Import succeeded!

STOP IT! STOP IT I SAY! Grrrrrrrrrrrrrrrr

wikt.3.connelm wrote:

Trying again with [[w:List of French phrases used by English speakers]], I get "Couldn't open import file" the first four tries, then

Import failed: XML import parse failure at line 85209, col 1 (byte 6059085; ""): Invalid document end

then

Import failed: XML import parse failure at line 121845, col 69 (byte 8493987; ""): Invalid document end

Import failed: XML import parse failure at line 122327, col 24 (byte 8524352; ""): Invalid document end

Import failed: XML import parse failure at line 150448, col 5 (byte 10355219; ""): Invalid document end

Import failed: XML import parse failure at line 188653, col 241 (byte 13122542; ""): Invalid document end

Import failed: XML import parse failure at line 145520, col 12 (byte 10026102; ""): Invalid document end

Import failed: XML import parse failure at line 179175, col 231 (byte 12384013; ""): Invalid document end

Import failed: XML import parse failure at line 131547, col 165 (byte 9106663; ""): Invalid document end

Import failed: XML import parse failure at line 185168, col 67 (byte 12846535; ""): Invalid document end

Import failed: XML import parse failure at line 136635, col 134 (byte 9435729; ""): Invalid document end

Import failed: XML import parse failure at line 132979, col 82 (byte 9198574; ""): Invalid document end

etc.

(Wish granted. It went back to breaking. :-( )

wikt.3.connelm wrote:

(cont.)

Import failed: XML import parse failure at line 184308, col 84 (byte 12778764;
""): Invalid document end

Served by srv136 in 5.508 secs.

Import failed: XML import parse failure at line 190401, col 2 (byte 13262462;
""): Invalid document end

Served by srv125 in 6.480 secs.

wikt.3.connelm wrote:

Import failed: XML import parse failure at line 186904, col 91 (byte 12984078; ""): Invalid document end

Served by srv145 in 5.446 secs.

then
Import failed: XML import parse failure at line 131059, col 171 (byte 9075779; ""): Invalid document end

Served by srv84 in 5.660 secs.

Import failed: XML import parse failure at line 134832, col 18 (byte 9318267; ""): Invalid document end

Served by srv93 in 5.561 secs.

Import failed: XML import parse failure at line 161222, col 47 (byte 11083452; ""): Invalid document end

Served by srv72 in 5.713 secs.

wikt.3.connelm wrote:

Import failed: XML import parse failure at line 86885, col 8 (byte 6169132; ""): EntityRef: expecting ';'

Served by srv41 in 4.611 secs.
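
For anyone collecting these messages, a small Python sketch that extracts the line, column, byte offset and reason from each failure so repeated attempts can be compared. The regular expression is inferred from the messages quoted in this report only, and the names parse_failure and distinct_offsets are hypothetical helpers, not part of MediaWiki.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the failure messages quoted in this report.
FAILURE_RE = re.compile(
    r'XML import parse failure at line (\d+), col (\d+) '
    r'\(byte (\d+); "(.*?)"\): (.+)'
)


class ParseFailure(NamedTuple):
    line: int
    col: int
    byte: int
    context: str
    reason: str


def parse_failure(message: str) -> Optional[ParseFailure]:
    """Return the fields of one failure message, or None if it does not match."""
    match = FAILURE_RE.search(message)
    if match is None:
        return None
    line, col, byte, context, reason = match.groups()
    return ParseFailure(int(line), int(col), int(byte), context, reason.strip())


def distinct_offsets(messages):
    """Return the sorted set of byte offsets at which parsing stopped."""
    failures = (parse_failure(m) for m in messages)
    return sorted({f.byte for f in failures if f is not None})


if __name__ == "__main__":
    samples = [
        'Import failed: XML import parse failure at line 85209, col 1 (byte 6059085; ""): Invalid document end',
        'Import failed: XML import parse failure at line 86885, col 8 (byte 6169132; ""): EntityRef: expecting \';\'',
    ]
    for sample in samples:
        print(parse_failure(sample))
    print(distinct_offsets(samples))
```

Byte offsets that differ on every attempt for the same page would be consistent with the input being cut off at varying points, rather than with one fixed piece of bad markup, though that is an inference and not something confirmed here.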

mike.lifeguard+bugs wrote:

At [[Help:Import#Large-scale transfer]] it says "For a large-scale transfer, somebody with sufficient system privileges can move data within the server, which is more practical than sending large XML files from the server to a user's local computer and then back to the server." Is this a potential workaround for failed imports?

mike.lifeguard+bugs wrote:

Using WhatLinksHere for Template:Import failed at en.wikibooks may yield some useful data on which pages made full-history import break and when.

http://en.wikibooks.org/wiki/Special:WhatLinksHere/Template:Import_failed

Mass component change: <some> -> Export/Import

Note that this bug was filed when Import was using the expat-based "XML Parser" PHP framework.

Currently, Import uses XMLReader (a somewhat crappy libxml wrapper). The only way to get proper error handling from this is to call global config functions like libxml_use_internal_errors(true). Given that other XMLReader global config functions have other issues with thread safety, I would be very nervous about trying that.

Having said that, I notice no-one has commented here since 2008, and XMLReader has been implemented since then, so I wonder if this can be closed...

(In reply to This, that and the other (TTO) from comment #12)

I notice no-one has commented here since 2008, and XMLReader has been
implemented since then, so I wonder if this can be closed...

Closing. If this is still a problem nowadays, anybody please file a new report.