
Username to user_id match is inconsistent in revisions of dump.
Closed, Invalid
Public

Description

Username to user_id match is inconsistent in revisions of dump.
This could be a characteristic of how and when the username field gets updated in the revision table. If so, it would be nice to have a clear explanation of what to expect (e.g. deleted users, username changes).
We see a range of inconsistencies: many usernames matched with the same ID, many IDs matched with the same username, non-IP usernames with no ID, and completely missing user information.
Our approach is to associate a user_id with its most recent username and propagate this username to all instances of user_id.
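
A minimal Python sketch of that approach, for illustration only. It assumes each revision has already been parsed from the dump into a dict; the keys `timestamp`, `user_id` and `username` are hypothetical names chosen for this example, and timestamps are assumed to be ISO 8601 strings so that they sort chronologically. Revisions with a user_id of 0 or with no user_id are left untouched.

```python
def propagate_latest_usernames(revisions):
    """Return copies of the revisions in which every non-zero user_id
    carries the username seen in that user's most recent revision."""
    latest = {}
    # Walk revisions oldest to newest so later usernames overwrite earlier ones.
    for rev in sorted(revisions, key=lambda r: r["timestamp"]):
        if rev["user_id"]:  # skip 0 / None: no reliable ID to key on
            latest[rev["user_id"]] = rev["username"]
    # Rewrite the username only where a mapping exists for the ID.
    return [
        {**rev, "username": latest.get(rev["user_id"], rev["username"])}
        for rev in revisions
    ]


# Made-up sample data (not taken from any real dump).
sample_revisions = [
    {"timestamp": "2009-01-01T00:00:00Z", "user_id": 42, "username": "OldName"},
    {"timestamp": "2010-06-01T00:00:00Z", "user_id": 42, "username": "NewName"},
    {"timestamp": "2010-07-01T00:00:00Z", "user_id": 0, "username": "192.0.2.1"},
]
normalized = propagate_latest_usernames(sample_revisions)
```

Here both revisions by user_id 42 end up labelled `NewName`, while the anonymous revision keeps its IP address.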

Proposed solution:

  1. Run SQL query to synchronize usernames with userids.
  2. Run SQL query to replace cases where the hostname is the username.

Version: unspecified
Severity: minor

Details

Reference
bz27774

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 21 2014, 11:27 PM
bzimport set Reference to bz27774.

The dump will just list whatever's in the table; each revision record carries both a rev_user and a rev_user_text field. Depending on how imports, user renames, and database cleanup proceed over time, it's normal to have some inconsistencies in there:

  • with user_id of 0, you will sometimes see non-IP user names that elsewhere appear with a legit user id
  • sometimes you will find a legit user_id paired with an old username

You should consider the username listed in a revision record to be merely advisory when a non-zero user ID is present. Current user information ought in theory to be available in a more canonical form somewhere else...
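
One way a dump consumer might act on this is to sort each revision's user information into the cases described above before trusting the username. The following Python sketch is only an illustration: the function name and the category labels are invented for this example and are not part of MediaWiki or the dump format.

```python
import ipaddress


def classify_contributor(user_id, username):
    """Label a revision's user info using a hypothetical set of categories."""
    if user_id:
        # Non-zero ID present: the ID is the key, the username is advisory.
        return "registered"
    if not username:
        return "missing"
    try:
        ipaddress.ip_address(username)
        return "anonymous_ip"
    except ValueError:
        # ID of 0 with a non-IP name, e.g. left over from an import or rename.
        return "name_without_id"
```

For example, `classify_contributor(0, "192.0.2.1")` falls into the anonymous case, while `classify_contributor(0, "SomeUser")` is flagged as a name with no usable ID.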

Nemo_bis subscribed.

No indication provided that this was a real bug.