
Audit centralauth database for inconsistencies
Closed, ResolvedPublic

Description

Inspired by bug 66535.

Since we have a checkLocalNames script, I started running it against small.dblist (except for loginwiki, which has a huge user table).

aywiki: 2014-07-01 07:27:48 found 1 invalid localnames, 0 (0.0%) deleted
chrwiki: 2014-07-01 07:29:09 found 2 invalid localnames, 0 (0.0%) deleted
csbwiki: 2014-07-01 07:29:32 found 1 invalid localnames, 0 (0.0%) deleted
furwiki: 2014-07-01 07:31:32 found 1 invalid localnames, 0 (0.0%) deleted
idwikiquote: 2014-07-01 07:32:49 found 1 invalid localnames, 0 (0.0%) deleted
iewiki: 2014-07-01 07:32:57 found 1 invalid localnames, 0 (0.0%) deleted
iuwiki: 2014-07-01 07:33:15 found 1 invalid localnames, 0 (0.0%) deleted
kaawiki: 2014-07-01 07:33:32 found 1 invalid localnames, 0 (0.0%) deleted
kshwiki: 2014-07-01 07:34:17 found 1 invalid localnames, 0 (0.0%) deleted
kswiki: 2014-07-01 07:34:20 found 1 invalid localnames, 0 (0.0%) deleted
lnwiki: 2014-07-01 07:34:51 found 1 invalid localnames, 0 (0.0%) deleted
lowiki: 2014-07-01 07:38:24 found 1 invalid localnames, 0 (0.0%) deleted
pdcwiki: 2014-07-01 07:40:14 found 1 invalid localnames, 0 (0.0%) deleted
sswiki: 2014-07-01 07:42:36 found 1 invalid localnames, 0 (0.0%) deleted
tewikiquote: 2014-07-01 07:43:32 found 1 invalid localnames, 0 (0.0%) deleted
viwikibooks: 2014-07-01 07:44:56 found 1 invalid localnames, 0 (0.0%) deleted
xalwiki: 2014-07-01 07:45:47 found 1 invalid localnames, 0 (0.0%) deleted
xhwiki: 2014-07-01 07:45:50 found 1 invalid localnames, 0 (0.0%) deleted

We will need to optimize the script to run on any larger wikis.
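For tallying output like the list above, here is a minimal sketch. It assumes the lines are piped in unchanged, in the `<wiki>: <date> <time> found N invalid ...` format shown. The function name `tally_invalid` is hypothetical and is not part of CentralAuth.

```shell
# Hypothetical helper (not part of CentralAuth): sum the "found N invalid"
# counts on stdin and count how many wikis reported at least one invalid row.
tally_invalid() {
  awk '{
    for (i = 1; i < NF; i++) {
      if ($i == "found" && $(i + 2) == "invalid") {
        n = $(i + 1) + 0        # the number between "found" and "invalid"
        total += n
        if (n > 0) wikis++
        break
      }
    }
  } END { printf "%d wikis, %d invalid rows\n", wikis, total }'
}

# Example with two of the lines from the run above.
printf '%s\n' \
  'aywiki: 2014-07-01 07:27:48 found 1 invalid localnames, 0 (0.0%) deleted' \
  'chrwiki: 2014-07-01 07:29:09 found 2 invalid localnames, 0 (0.0%) deleted' |
  tally_invalid
# prints "2 wikis, 3 invalid rows"
```

Matching on the words `found` and `invalid` rather than on a fixed column keeps the sketch working for both the localnames and the localuser output.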


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=66535

Details

Reference
bz67350


Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 3:40 AM
bzimport set Reference to bz67350.

Thanks! Was it the same username across those wikis that was invalid? It just seems odd that there is exactly 1 invalid name on each wiki you tried.

Are you planning to run migratePass0 to make sure localnames has all of the local names too?

(In reply to Chris Steipp from comment #1)

> Thanks! Was it the same username across those wikis that was invalid? It
> just seems odd that there is exactly 1 invalid name on each wiki you tried.

To clarify, I ran the script on 524 wikis, and the above lines were just the ones that had invalid localnames. I grabbed them manually from the script's output, and apparently I missed some.

I re-ran it and saved the output this time (/home/legoktm/bad_local_names.txt on terbium). There are 111 entries in total.

Glancing through it, most of the users are different, except for one that stands out: [[m:Special:CentralAuth/Saerahb]], which has invalid entries on 14 wikis.

> Are you planning to run migratePass0 to make sure localnames has all of the
> local names too?

Yeah, that's a good idea.

542 invalid localnames rows, and 498 invalid localuser rows.

Most of the bad users in localuser are also present in localnames.

I'll first clean out the bad localnames/localuser rows and then run migratePass0 across all wikis.

We should also check the globalnames table once localnames is fixed.

[12:37:38 AM] <legoktm> !log started running checkLocalNames.php --delete=1 on all CentralAuth wikis for bug 67350
[12:39:20 AM] <legoktm> !log started running checkLocalUser.php --delete=1 on all CentralAuth wikis for bug 67350

I'm guesstimating that will take a full day, maybe even the weekend.

Created attachment 15924
counts of deletions by checkLocalNames.php

Created attachment 15925
counts of deletions by checkLocalUser.php

I attached the final counts of how many rows were deleted. I have complete logs including which specific rows were deleted in my home on terbium.

(In reply to Kunal Mehta (Legoktm) from comment #5)

> I'm guesstimating that will take a full day, maybe even the weekend.

Or not! I'll start migratePass0 now.

(In reply to Kunal Mehta (Legoktm) from comment #8)

> Or not! I'll start migratePass0 now.

Done.

Now that we know localnames is consistent, we just need to check/fix globalnames. Will need a new script for that.

Curious that wikidata and nsowiki had so many:

$ grep -E "[0-9]{2,} invalid" attachment.cgi\?id\=15925
enwiki: 2014-07-12 12:08:07 found 50 invalid localuser, 50 (100.0%) deleted
metawiki: 2014-07-12 17:29:59 found 52 invalid localuser, 52 (100.0%) deleted
nsowiki: 2014-07-12 17:43:53 found 84 invalid localuser, 84 (100.0%) deleted
wikidatawiki: 2014-07-12 19:26:20 found 119 invalid localuser, 119 (100.0%) deleted
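The `grep -E "[0-9]{2,} invalid"` pattern above selects lines by digit count, so it matches any wiki with 10 or more invalid rows. For an arbitrary cutoff, a numeric comparison is one alternative. The following is only a sketch: the function name `over_threshold` is hypothetical, and it assumes the same `found N invalid` line format as the attachment.

```shell
# Hypothetical helper: print only the lines whose "found N invalid" count is
# at least the threshold given as the first argument.
over_threshold() {
  awk -v min="$1" '{
    for (i = 1; i < NF; i++) {
      if ($i == "found" && $(i + 2) == "invalid") {
        if ($(i + 1) + 0 >= min + 0) print
        break
      }
    }
  }'
}

# Example with two of the lines quoted above and a cutoff of 80.
printf '%s\n' \
  'enwiki: 2014-07-12 12:08:07 found 50 invalid localuser, 50 (100.0%) deleted' \
  'nsowiki: 2014-07-12 17:43:53 found 84 invalid localuser, 84 (100.0%) deleted' |
  over_threshold 80
# prints only the nsowiki line
```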

Keegan subscribed.

@Legoktm said this can be closed as resolved, so I'm doing that.