
Edits from various users active before 2003 are missing from Special:Contributions
Open, Low, Public

Description

Author: timwi

Description:
BUG MIGRATED FROM SOURCEFORGE
http://sourceforge.net/tracker/index.php?func=detail&aid=917040&group_id=34373&atid=411192
Originally submitted by Toby Bartels (tobybartels) 2004-03-16 03:04

On the English Wikipedia in 2001, there was a user
"Ryan_Lackey" whose user name contained an underscore.
You can see edits credited to this user at, for
example, [[Talk:Sealand]]. But these edits are not
recorded at
[http://en.wikipedia.org/w/wiki.phtml?title=Special:Contributions&target=Ryan_Lackey],
which, after all, ''should'' be for a user named "Ryan
Lackey" (who doesn't exist). Similarly,
[[User:Ryan_Lackey]] doesn't think that it's a user
page for an actual user.

This specific case can probably be fixed if a developer
performs a username change (from "Ryan_Lackey" to "Ryan
Lackey") -- assuming that the name changing feature
doesn't break down too! ^_^ But the larger bug probably
applies to other editors from Phase I.


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=34873

Details

Reference
bz323

Event Timeline


I just found a case of this bug where the user had an underline in their name but the account was subsequently taken over by a vandal. I thought I had created accounts for all UseModWiki-era users who didn't have them, but users have occasionally slipped through the cracks. See:
http://en.wikipedia.org/wiki/User_talk:The_ansible#Note

I have also known for a long time about the case of "Simon_J_Kissane", see:
http://en.wikipedia.org/w/index.php?title=Talk:Algebraic_number&oldid=234678

Are we supposed to list all cases we find (as in [[bugzilla:20757]])? Because I just ran across [[User:Alan_D]]: http://en.wikipedia.org/w/index.php?title=Sailor_Moon&oldid=282114 for example does not show up in his contributions.

@ Philip #15

I don't know who decides what we're "supposed" to do, but I think that it would be a good idea, at least until a developer writes in to say that there's no point.

There's no point in listing them as far as I can tell. The devs can find them all automatically if they use the method I outlined in comment 11. I don't see the point of listing all instances at bug 20757 either, but it's better to be safe than sorry.

As an aside, to draw more attention to this bug, I've mentioned it at http://en.wikipedia.org/wiki/Wikipedia:MediaWiki/DeveloperMemo/November2009#Requests_-_fixes

ayg wrote:

There is no point in listing them one by one here. Anyone with even toolserver access can just query the appropriate tables to find the bad rows. E.g., on enwiki,

mysql> SELECT user_name FROM user WHERE user_name LIKE '%\_%';
+-----------------+
| user_name       |
+-----------------+
| Nicholas_Lativy |
+-----------------+
1 row in set (1 min 13.18 sec)

The same can just as easily be done for the other wikis, and other tables.
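
As a side note on the query above, the backslash matters: in LIKE patterns an unescaped underscore matches any single character, so '%_%' would match every non-empty username. The following is only a minimal sketch using Python's built-in sqlite3 module with an in-memory table, not the production schema; the function name and the sample rows are hypothetical. SQLite needs an explicit ESCAPE clause, whereas MySQL uses the backslash as its default LIKE escape character.

```python
import sqlite3


def find_usernames(names, pattern, escape=None):
    """Return the names matching a LIKE pattern, sorted alphabetically.

    Hypothetical helper: builds a throwaway in-memory `user` table
    holding only a `user_name` column.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user (user_name TEXT)")
    conn.executemany("INSERT INTO user VALUES (?)", [(n,) for n in names])
    sql = "SELECT user_name FROM user WHERE user_name LIKE ?"
    params = [pattern]
    if escape is not None:
        # SQLite has no default escape character, so it is passed explicitly.
        sql += " ESCAPE ?"
        params.append(escape)
    sql += " ORDER BY user_name"
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return [row[0] for row in rows]


sample = ["Larry Sanger", "Nicholas_Lativy"]

# Escaped underscore: only names containing a literal "_" match.
literal = find_usernames(sample, "%\\_%", escape="\\")

# Unescaped underscore: "_" is a one-character wildcard, so every name matches.
wildcard = find_usernames(sample, "%_%")
```

Here `literal` contains only "Nicholas_Lativy", while `wildcard` contains both sample names.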

Curiously, the import feature seems to convert underlines to spaces in usernames automatically. I just imported some history from Nostalgia Wikipedia to the English Wikipedia, thanks to bug 20280. Larry Sanger's early contribution list, especially before January 2002, is now quite interesting:

http://en.wikipedia.org/w/index.php?title=Special:Contributions&dir=prev&target=Larry+Sanger

Not really relevant to this bug, but importing such edits also causes diff sizes to be generated for them.

Re: Comment 12, the problem is not the at being changed to a dot, but the fact that the first letter of the username is lower-case. I've changed the bug name accordingly to take this into account.

Therefore I would consider this bug resolved if someone changed underlines to spaces in the username fields as described in comment 11, then used the same procedure to change initial lower-case letters in usernames to capital letters. The change in the user ID number would be nice, but not strictly necessary, and it would probably be more trouble than it's worth.

And it goes without saying that I'd like this bug fixed on all applicable wikis, not just the English Wikipedia. I'm particularly thinking about the Nostalgia Wikipedia here, but other WMF projects might be affected as well.

What WMF projects besides the en.wp were active back on the Phase I software, anyways?

(In reply to comment #23)

> What WMF projects besides the en.wp were active back on the Phase I software,
> anyways?

Plenty of them. Compare http://meta.wikimedia.org/w/index.php?title=Wikipedia_software_upgrade_status&oldid=2478 and http://en.wikipedia.org/w/index.php?title=Wikipedia:Complete_list_of_language_wikis_available&oldid=353094 ... that's only the Wikipedias.

It occurs to me that it might be easier to fix this bug by changing the Special:Contributions and deleted contributions pages to check for table rows with underlines and initial lower-case letters in the usernames.

This bug also affects some usernames from the Phase II software (which was used in the English Wikipedia from January to July 2002), so I've changed the bug title accordingly. See this edit to "military history":
http://en.wikipedia.org/w/index.php?title=Military_history&diff=98562&oldid=74194

(In reply to comment #24)

> It occurs to me that it might be easier to fix this bug by changing the
> Special:Contributions and deleted contributions pages to check for table rows
> with underlines and initial lower-case letters in the usernames.

And it now occurs to me that fixing the problem by changing the contributions special pages, rather than changing the entries in the database, wouldn't fix the problem with importing edits in comment 19. See this page in my userspace:

http://en.wikipedia.org/wiki/User:Graham87/Import#Underline
Therefore my idea in comment 25 would be a second-rate solution.

In the revision table of the Nostalgia Wikipedia, one of the usernames listed is "Brad_", so it was apparently possible for usernames to end in underlines in the phase I and II software.
In these cases, these usernames should probably be changed to "Brad old" or something similar. Replacing the underlines with spaces in this case would produce the username "Brad ", and the space at the end would still make the username invalid.
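
The cleanup discussed so far (underlines to spaces, then an initial capital) and the trailing-underline edge case can be illustrated with a rough sketch. This is not MediaWiki's own normalization code; the function name is hypothetical, and the validity check (non-empty, no leading or trailing whitespace) is a simplifying assumption made only for this example.

```python
def normalize_legacy_username(raw):
    """Apply the proposed cleanup to a Phase I/II username.

    Returns a (normalized_name, is_usable) tuple. Hypothetical helper,
    not part of MediaWiki.
    """
    # Step 1: underlines become spaces, as proposed in comment 11.
    name = raw.replace("_", " ")
    # Step 2: capitalize only the first character; leave the rest untouched.
    if name:
        name = name[0].upper() + name[1:]
    # Assumption: a name that is empty or has leading/trailing whitespace
    # is treated as unusable and would need a manual rename.
    is_usable = bool(name) and name == name.strip()
    return name, is_usable


print(normalize_legacy_username("Ryan_Lackey"))  # ('Ryan Lackey', True)
print(normalize_legacy_username("Brad_"))        # ('Brad ', False)
```

A name flagged as unusable, such as "Brad_", is the kind that would need a replacement like "Brad old" chosen by hand rather than produced mechanically.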

At the moment, I'm creating English Wikipedia accounts for all usernames that existed in the Nostalgia Wikipedia. Therefore, almost all of the usernames affected by this bug in the English Wikipedia will have a dummy account associated with them.

I've found some edits where the username is stored in the database with two consecutive spaces. None of these edits can be found through the user contributions list. I have changed the bug summary accordingly. In this diff, the extra space is not apparent when looking at the page in a browser, but it is obvious when checking the HTML source code:
http://en.wikipedia.org/w/index.php?title=Theosophy&diff=291009&oldid=291008

Here is an example from Meta of a username with a lower-case letter from the Phase II software:
http://meta.wikimedia.org/w/index.php?title=Market_research_in_progress_at_meta_wikipedia&action=history

I've also changed the bug summary to be more informative.

Hmmm, this is probably due to the fact that the rev_user field is non-zero for each of the edits listed in those two links, and in fact is linked to the user ID of the user who made the edit; this never happens in the English Wikipedia, so these methods cannot be used there. The rev_user field shows the user ID of the editor who made a particular edit; the equivalent field in the archive table is ar_user. The user ID for an edit is always 0 for anonymous editors, mass-imports and scripts; it isn't usually zero for normal registered users. If the user ID given for an edit made by a registered user is 0, then the "contribs" link won't show up for the user in the page history. This example comes from a mistaken import, but it is illustrative:
http://en.wikipedia.org/w/index.php?title=Ocean&diff=331657005&oldid=271538

No contribs are found for Ryan_Lackey (see top of bug report) in the API of the English Wikipedia, because none of his edits have an associated non-zero user ID:
http://en.wikipedia.org/w/api.php?action=query&list=usercontribs&ucuser=Ryan_Lackey

On the examples from Meta: note that in the history (and also Special:Undelete) the links to the user page and user talk page are red even if the pages actually exist (I'm also adding a screenshot for future reference).

Created attachment 7634
Red link to existing lower case user and user talk pages

See bug 323 comment 35.

Attached:

red_link_to_existing_lowercase_talk_page.png (768×1 px, 129 KB)

I've changed the summary once again, so it shows the correct fields!

Thanks to [[it:User:Mauro742]] you can now find the complete list of all 4336 en.wiki affected revisions at [[User:Nemo_bis/Bug 323 revisions]].

Some edits of renamed users are affected, too, and have not been moved to the new username: compare http://meta.wikimedia.org/w/index.php?title=Special:Undelete&target=Native+American+Affairs&timestamp=20020127003202 by [[m:user:maveric149]] (lowercase: see also [[m:Special:Contributions/maveric149]] which for some reason is not empty) and http://meta.wikimedia.org/w/index.php?title=Wikimedia_bank_account_history_for_2004&action=history which was created after the user was renamed to Daniel_Mayer (http://meta.wikimedia.org/w/index.php?title=User:Maveric149&diff=1352761&oldid=157121) and is now under the correct username Mav (I've just restored this page).

See also bug 3507, dealing with the usernames themselves instead of edits attributed to those users.

Deblocking from 29757: these have nothing to do with user renames; they are caused by user accounts predating Phase III (aka MediaWiki as we know it today).

epriestley closed this task as Resolved by committing Unknown Object (Diffusion Commit). Mar 4 2015, 8:19 AM
epriestley added a commit: Unknown Object (Diffusion Commit).

How did you manage to resolve this? The databases are not currently fixed; is there a script ready to be run that will do this? @epriestley

TTO subscribed.

This was erroneously closed due to a bug.

Krinkle renamed this task from "edits where the rev_user_text and ar_user_text fields contain underscores, initial lower-case letters or consecutive spaces, which could occur in the Phase I and II software, are inaccessible using Special:Contributions" to "Edits from various users active before 2003 are missing from Special:Contributions". Jul 28 2018, 10:31 PM
Krinkle moved this task from Untriaged to Usage problem on the MediaWiki-libs-Rdbms board.
Krinkle removed a subscriber: wikibugs-l-list.

Is this bug in the process of being fixed, or has some script been run that has fixing this bug as a side effect? See "Usernames_underlined_or_with_lowercase_letters" at User talk:Graham87/Import. The edits display with the correct username but aren't in Special:Contributions for that username yet. This can most easily be demonstrated by this diff, with an edit from 17 January 2002 (UTC), and this contribs page, which is set to display edits from 25 January 2002 or earlier but whose latest edit is from 6 January 2002 (UTC).

The diff you link is attributed to the invalid username "Larry_Sanger" with an underscore (see the API output for that revision), while the contributions page is for user "Larry Sanger" with a space. That's why the edit doesn't show up on that contributions page. The display of the name without an underscore on the diff page is incorrect, as a side effect of MediaWiki normalizing the underscore to space when making a User object for the name.

There has been some cleanup at the database level as a side effect of the actor table migration (see T188327). Specifically, in cases where the revision had a non-zero rev_user that didn't match the username in rev_user_text, the migration preferred the rev_user. So if an enwiki revision had rev_user as 216 it was assigned to "Larry Sanger" (with a space) even if rev_user_text had been "Larry_Sanger" (with an underscore). It looks like there were 135 such revisions. On the other hand, if a revision had zero in rev_user, it would have remained attributed to "Larry_Sanger" (with an underscore). It looks like there were 1815 of those.
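
The rule described above can be restated as a small sketch. This is only an illustration of the behaviour as summarized in this comment, not the actual migration code from T188327; the function name and the `user_names_by_id` mapping are hypothetical, and falling back to rev_user_text for an ID with no matching account is an assumption made for the example.

```python
def migrated_attribution(rev_user, rev_user_text, user_names_by_id):
    """Return the username a revision ends up attributed to.

    Hypothetical restatement of the rule described in this comment.
    `user_names_by_id` maps user IDs to current account names.
    """
    # A non-zero rev_user takes precedence over the stored name text.
    if rev_user != 0 and rev_user in user_names_by_id:
        return user_names_by_id[rev_user]
    # rev_user == 0: the legacy text is kept as-is, underscore included.
    # (Assumption: an unknown non-zero ID also falls back to the text.)
    return rev_user_text


users = {216: "Larry Sanger"}

# Non-zero rev_user: reassigned to the account name with a space.
print(migrated_attribution(216, "Larry_Sanger", users))  # Larry Sanger

# Zero rev_user: stays attributed to the invalid underscore name.
print(migrated_attribution(0, "Larry_Sanger", users))    # Larry_Sanger
```

Under this reading, the 135 revisions correspond to the first branch and the 1815 revisions to the second.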

Wow, makes sense. Pretty weird though.

The bug is that that user account exists in the first place