
Don't autologin if local account doesn't exist (don't autocreate if user doesn't explicitly login)
Open, Medium, Public, Feature

Description

The exact scenario is described in the non-public OTRS wiki: http://otrs-wiki.wikimedia.org/wiki/User:Church_of_emacs/mw_critical_bug

I remember having a discussion with Tim Starling and Brion Vibber when Wikimedia began to automatically create accounts on normal GET-requests. Tim and I had some concerns, while Brion rightfully said that the information of user:abc visiting a Wikipedia version on a specific date was a minor privacy concern. However, in the scenario (see URL) it seemed like this feature could lead to much more private information leaking.

Therefore I propose to disable automatic account creation on GET-requests and instead use only POST-requests to create accounts.


Version: unspecified
Severity: enhancement
URL: https://bugzilla.wikimedia.org/show_bug.cgi?id=19161#c4
See also:

Details

Reference
bz19161

Event Timeline


The problem with automatic account creation is that it creates an authenticated account with the wrong preferences: the defaults set on the new wiki allow anyone on that wiki to send an email directly to that person, just by posting a message on their discussion page on that wiki. This should not be the default: sending emails must not be enabled by default, or it should be restricted to administrators of that wiki rather than permitted to any visitor.

In fact, I now have several hundred accounts on Wikimedia projects, and I can't control the preferences that have been set on all of them; it is too much!

I'd like the privacy parameters to be globalized too, or at least for the default parameters of the created account to be those of the origin wiki on which the account was authenticated by SUL!

And it would be great if we had the option to indicate, in the preferences dialog, the privacy parameters that should be saved in the global account (or to have a specific additional preferences pane for these global SUL parameters), and then to have the possibility, in that "SUL" pane, of applying these global parameters to ALL wikis where the SUL account is activated, without having to revisit each wiki, notably now that we have hundreds of them!

To apply the SUL privacy parameters to all wikis, there would probably be some delay, because the SUL server would have to act like a bot on all wikis and update them in a background task if applying the parameters everywhere took too much time (and too many SQL requests on many wikis). The status of this update could be displayed as "update in progress", so that the user knows when the update has actually completed.

Independently of that bot update task, the user could also visit one or several of these wikis while the update is in progress: when he does, those wikis will also be able to see that there's an "update requested" in the SUL account, and could perform their own local update immediately. The same will also occur if anyone attempts to reference the user's home page on that wiki.

So the SUL update bot would have the same effect as if the user were visiting each wiki registered in his SUL account in turn, except that it would be a bit faster and would not forget any of them: the SUL update bot would just have to visit the user's own page on each wiki (it would only need to perform an HTTP "HEAD" request on that user's page, without actually reading it, and the update would not be performed by the bot itself but by each wiki site, which just notes that a user's page or subpage is being accessed and then checks the state of the user's SUL account).
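
The lazy variant described above, in which each wiki notices a pending "update requested" state when the user's page is touched, could be sketched roughly as follows. This is only an illustration in Python, not MediaWiki or CentralAuth code: the class names, the version counter and the preference keys are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalAccount:
    """Hypothetical central (SUL) record for one user."""
    prefs: dict = field(default_factory=dict)
    version: int = 0  # bumped each time the user asks to apply prefs everywhere

    def request_apply_everywhere(self) -> None:
        self.version += 1


@dataclass
class LocalAccount:
    """Hypothetical per-wiki record for the same user."""
    prefs: dict = field(default_factory=dict)
    applied_version: int = 0  # last central version this wiki has applied


def local_prefs(local: LocalAccount, central: GlobalAccount) -> dict:
    """Return the local preferences, applying a pending global update first."""
    if local.applied_version < central.version:
        # The wiki itself performs the update the first time it needs the prefs.
        local.prefs.update(central.prefs)
        local.applied_version = central.version
    return local.prefs
```

Under these assumptions, a wiki only pays for the update when something actually references the local account, and the "update in progress" status is simply the set of wikis whose applied version is behind the central one.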

No need to do bot tasks.
User::getOption() would provide a hook for SUL, which would check the central
preferences and decide your effective preferences.
You probably want your global account parameters as defaults, with local
preferences overriding them.
It's a bit more complex than setting user_preferences as a shared table (and
that would be undesirable, too).
I recommend that you open a new bug for that (if it doesn't exist yet).
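
A minimal sketch of the lookup order suggested here (local override first, then the central value, then the wiki default). It is written in Python purely for illustration; the function and the option names are hypothetical and do not reflect how MediaWiki actually implements getOption().

```python
from typing import Any, Mapping


def effective_option(name: str,
                     local_overrides: Mapping[str, Any],
                     global_prefs: Mapping[str, Any],
                     site_defaults: Mapping[str, Any]) -> Any:
    """Return the value a wiki should use for one preference."""
    # An explicit per-wiki choice always wins.
    if name in local_overrides:
        return local_overrides[name]
    # Otherwise fall back to the user's central (SUL) value, if there is one.
    if name in global_prefs:
        return global_prefs[name]
    # Finally use the wiki's own default.
    return site_defaults[name]
```

For example, a user with a central language of "fr" and no local override would see French on a wiki whose default is Arabic, while a local override of "en" on one wiki would still take precedence there.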

I DID NOT speak about a SHARED table. Even if there are preferences in the SUL database, these preferences should remain distinct and separated from the preferences set in specific wikis (which should still override what is set in the SUL account, until the user wants to apply the SUL preferences globally to all wikis).

That's why it would require some BOT-like behavior, if and only if, the user wants to apply the SUL preferences to all his wikis (and notably all those that were registered automatically by just visiting them without performing any edit).

In fact I really wonder why an account is created immediately on a wiki just by visiting it. MediaWiki just needs to track in the session that the user has a subscribed SUL account. The account should then be created only when the user performs an actual edit on the visited wiki, or when he clicks on his preferences, in which case the account preferences (including locale parameters like the language and local time, and privacy parameters like the effective user name and confirmed email address) will be automatically imported from the SUL account.

As long as such an edit (or preference setting) has not been performed and saved, the user name shown at the top will display as a red link but can still show the SUL user name.
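
The behaviour proposed above can be sketched as a small request handler. This is a toy model in Python, not MediaWiki code: the session keys, the action names and the return values are assumptions chosen for the example.

```python
# Actions that are treated as an explicit intent to use the wiki (assumed names).
WRITE_ACTIONS = {"edit", "preferences"}


def handle_request(accounts: dict, session: dict, action: str) -> str:
    """Decide what a wiki does for one request from a possibly SUL-known user."""
    name = session.get("sul_name")
    if name is None:
        return "anonymous"
    if name in accounts:
        return "logged-in"
    if action in WRITE_ACTIONS:
        # First write action: create the local account and import SUL preferences.
        accounts[name] = dict(session.get("sul_prefs", {}))
        return "created"
    # Plain reading: show the SUL name, but leave no local record behind.
    return "recognized"
```

In this sketch, merely viewing pages never touches the local account table, so nothing is created or logged until the user edits or opens the preferences.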

Note that I would really like the language to be part of the shared preferences, because it is sometimes quite hard to use a foreign-language wiki with its default language: you have to guess where the links to the user page, talk page and user preferences are, and have to decipher the preferences panel in a language that you possibly cannot even read (even though these settings are important for effectively managing your privacy options).

Why do I ask for a way to apply SUL preferences globally using a delayed, bot-like system? It's mainly because we now have a lot of accounts created in many languages that most users can't even read, so they can't change the privacy options that were imported with the wrong default values on wikis that they have merely visited (for example to check the existence of a page in an interwiki link).

We have now too many accounts in the system that most users can't even manage themselves, because of the languages that they can't even read (or that their system cannot even display properly with appropriate fonts when those languages are using non-Latin scripts).

I think that this is really important to fix: offer them a way to effectively manage their existing local accounts that have already been created automatically under SUL, without forcing them to learn these foreign languages or scripts!

Applying the global SUL preferences to the local wikis will require a lot of updates on a lot of distinct databases. That's why it cannot be performed by a simple hook, but by a scheduled Bot task on the server. But using the SUL parameters for all wikis will also be a bad option: there will continue to exist some local wikis for which a user does not want to use the default preferences stored in their SUL account.

(In reply to comment #14)

I DID NOT speak about a SHARED table. Even if there are preferences in the SUL
database, these preferences should remain distinct and separated from the
preferences set in specific wikis (which should still override what is set in
the SUL account, until the user wants to apply the SUL preferences globally to
all wikis).

That would be done much better with a shared wiki table than with some crappy bot. It would be simple to select which preferences should be shared.

That's why it would require some BOT-like behavior, if and only if, the user
wants to apply the SUL preferences to all his wikis (and notably all those that
were registered automatically by just visiting them without performing any
edit).

That could be done *easily* without a bot.

In fact I really wonder why an account is created immediately on a wiki, by
just visiting it. MediaWiki just needs to track in the session that the user
has a subscribed SUL account. This account should then be created only when the
user performs an actual edit on the visited wiki, or when he clicks on his
preferences. In which case, the account preferences (including the locale
parameters like the language and localtime, and privacy parameters like the
effective user name and confirmed email address) will be automatically imported
from the SUL account.

Well, if you wanted preferences to be applied across wikis, automatic account creation would be required.

[snipped]

Applying the global SUL preferences to the local wikis will require a lot of
updates on a lot of distinct databases. That's why it cannot be performed by a
simple hook, but by a scheduled Bot task on the server. But using the SUL
parameters for all wikis will also be a bad option: there will continue to
exist some local wikis for which a user does not want to use the default
preferences stored in their SUL account.

Or alternatively, common storage, instead of storing the same thing a million times.

First of all, I did not speak about a real bot, but about a bot-LIKE feature, but still involving a delayed update, where the various wikis would enforce the applied update each time they encounter a reference to a local user account, before assuming that the locally stored preferences can be safely used.

Well, if you wanted preferences to be applied across wikis, automatic account
creation would be required.

Absolutely NOT! When I refer to preferences being applied across wikis, this only has to apply to existing accounts, whether they were created automatically or not, the only condition being that these existing wiki accounts are actually unified with the SUL account.

Or alternatively, common storage, instead of storing the same thing a million times.

It is already stored a "million" times: each wiki already stores these preferences locally, as does the existing SUL server for the global preferences.

But using a single store ONLY in SUL, and not locally, would impact users: they wouldn't be able to keep local preferences when they want to. For example, users may still want different email options on specific wikis (such as whether or not to forward the messages posted on their discussion page), even if they use the same validated email address on all of them. One good reason for that is that some wikis are handled by teams of local administrators that a user can trust, but this is absolutely not the case on all wikis.

The other reason is that users are multilingual and will still want, for example, to work on an English wiki with English language and localisation parameters, and on a French wiki with the French language. They will however not want to work on Arabic wikis in Arabic, and will want to apply a default language to all wikis whose language they don't want to see in the user interface (and notably not in the user preferences panel).

If you just share a common table, users will no longer be able to work on wikis in distinct languages when they want, and they will have to accept forwarded emails from ALL wikis, or from NONE of them.

(In reply to comment #16)

Or alternatively, common storage, instead of storing the same thing a million times.

It is already stored a "million" times: each wiki already stores these
preferences locally, as does the existing SUL server for the global
preferences.

But using a single storage ONLY in SUL, but not locally would impact users :
they won't be able to have local preferences, when they want to keep them. For
example, users may still want to have different email options for specific
wikis (such as forwarding or not the messages posted in their discussion page),
even if they use the same validated email address on all of them: one good
reason for that is that some wikis are handled by teams of local administrators
that a user can trust on some wikis, but absolutely not on all wikis.

It's simple to have local overrides and still use common storage for defaults.

You don't need to speculate about the implementation details, just specify the features that you want.

This is way off the topic of this bug. To discuss global preferences, please see bug 14950.

I am NOT "speculating" about implementation details. I am only referring to the visible effects of these implementations. This is really different.
Some changes have been made recently on some wikis (even after this bug was discussed here) that caused similar problems:
a new functionality was added without notice on some rarely visited wiki, which caused that wiki to send emails for each page that was previously just part of the list of pages to monitor ONLINE (which was not an option to monitor them offline by email).

Think twice when adding new functionality that invades user privacy or users' mailboxes and making this option active by default.

Anyway, thanks for pointing to bug 14950 (which is now really old and more urgently needs a fix). We do need global overrides (independently of how they are implemented) and we still need some local overrides that allow users to bypass some global defaults on specific wikis (including their "home" wiki for SUL) that they "trust" and regularly visit to participate actively.

Is this an issue? We now have a tickbox to disable global login...

Yes, it is still an issue, because we may still want global login without necessarily being auto-subscribed to various privacy settings (notably email notifications) just because we randomly visit a new wiki.
The global login is there to help autoconnect when we really need to interact with a wiki, so that we can just reserve the login name. But this autologin feature does not mean that we have ever seen and checked the privacy options on the per-wiki settings page.
I'd still like to have global login enabled, but with all wikis set by default to maximum privacy instead of the permissive defaults.
Ideally, if global login is enabled, automatic account creation should immediately create a link to our home wiki on the created account (this means automatically creating a user page with basic content, including the list of languages that the user advertises on his home wiki, and posting a top message on the user discussion page stating that the user can be contacted on his home wiki), with all email settings disabled until we explicitly check these settings on specific wikis.
There is also still no way to change any given privacy option on all wikis from one single screen, or to know on which wikis it has been enabled or disabled (there should be a link displaying which wikis have an option enabled, disabled, or different from the default global setting).
When you have an account enabled on more than 200 wikis, just because you visited a few pages found while navigating from your home wiki, it becomes extremely difficult to know when and how the newly discovered wikis have had a new privacy option implemented and which ones need to be updated.
Ideally, privacy settings should not just offer "enabled" or "disabled" options, but also a "default/shared" option indicating that the option is inherited from the global settings. This means that the "global" settings should act as if they were another wiki, but it could also mean the settings from an existing wiki (our "home" wiki, which we should also be allowed to change to any of those on which we already have an account, preferably a wiki where our account is already verified).
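
The three-state option and the "which wikis differ from my global setting" overview requested above could look roughly like this. It is a sketch in Python under assumed names (the enum, the wiki identifiers and both functions are hypothetical), not a description of any existing MediaWiki feature.

```python
from enum import Enum
from typing import Dict, List


class Setting(Enum):
    ENABLED = "enabled"
    DISABLED = "disabled"
    INHERIT = "inherit"  # the "default/shared" choice: follow the global value


def resolve(local: Setting, global_value: bool) -> bool:
    """Return the value actually in force on one wiki."""
    if local is Setting.INHERIT:
        return global_value
    return local is Setting.ENABLED


def wikis_differing(per_wiki: Dict[str, Setting], global_value: bool) -> List[str]:
    """List the wikis whose effective value differs from the global setting."""
    return sorted(wiki for wiki, setting in per_wiki.items()
                  if resolve(setting, global_value) != global_value)
```

With such a model, newly created accounts could start as INHERIT everywhere, and a single overview page would only need to show the wikis returned by the second function.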

(In reply to comment #20)

Ideally, if the global login is enabled, the automatic account creation should
immediately create a link to our home wiki on the created account (this means
creating a home page automatically with a basic content, including the list of
languages that he advertizes on his home wiki, and posting a top message in the
user discussion page, stating that the user can be contacted on his home wiki),
and all email settings disabled until we explicitly check these settings on
specific wikis.

Adding Bug #14950 since what you are asking for here would require global preferences.

I think what he wants is *exactly* Global preferences (bug 14950). I wouldn't consider MediaWiki as having many privacy options. He seems to see many. Perhaps it's just the "Allow other people to send me emails"?

*** This bug has been marked as a duplicate of bug 14950 ***

This bug is not about preferences. This bug is about the records created automatically on a wiki when an existing user visits a wiki.
*reading* a wiki page should not create an audit-able event.

Yes, but just visiting a wiki page immediately creates an account, and I have several times received emails notifying me that a "Hello, new user" message was posted immediately by some bot running on that wiki, noting the account creation. This should never have happened, and I should never have been contacted directly without first creating my own page and verifying the user settings (after changing the user language preference, because the default language of the visited wiki was unreadable for me).
I had to manually unsubscribe from all these notifications (and on some wikis that I had just visited, for example to check a link, those notifications were really excessive, and completely unreadable for me).
In other words, the account creation was effectively auditable after a mere visit, *reading* only some page (in fact just deciphering it partly, possibly with the help of an online translator).
I have more than 250 wikis registered on my account; I cannot manage them all, and in fact I still don't know whether some options on one of them fail to match my actual preferences (each time I check some of them, they are too permissive, more permissive even than my home wiki: this should never have happened, because I have NEVER opted in to these options on all these wikis). This has the same impact as spam: bulk undesired and unsolicited content posted to my mailbox. And this severely degrades the image of Wikimedia sites (and may be part of the problem occurring now, with people leaving Wikimedia or ceasing to collaborate on it). Please be careful, and provide the necessary tools to allow users to audit the existing wikis to which they were subscribed without notice.

(In reply to comment #24)

This bug is not about preferences. This bug is about the records created
automatically on a wiki when an existing user visits a wiki.
*reading* a wiki page should not create an audit-able event.

Since MW now has $wgLogAutocreatedAccounts, it is possible to eliminate the privacy concern outlined in this bug that auto-created accounts introduce. If accounts are still being logged on Wikimedia properties, then that is a separate bug.

The only other concern mentioned in this bug would be solved with Global preferences.

*** This bug has been marked as a duplicate of bug 14950 ***

Unless you have read the otrs-wiki page which provides more info about this bug, please don't close it.

(In reply to comment #27)

Unless you have read the otrs-wiki page which provides more info about this
bug, please don't close it.

You mean the one that's not public so I can't read it?

(In reply to comment #27)

Unless you have read the otrs-wiki page which provides more info about this
bug, please don't close it.

We can only deal with the information in this bug report. Comment #4 summarizes the OTRS wiki, so that is all the information I have.

As all of the information that people have felt necessary for reproducing or describing the problem has been given in this report, and it has all been dealt with, this bug should be closed.

Do not reopen this bug, but feel free to open a new bug with new information. That would be an appropriate action if a problem reported in OTRS has not been dealt with yet and you wanted to report that problem.

The problems contained in this bug report have been dealt with.

You can get an idea of the OTRS wiki page from the more verbose Comment #4.

The OTRS wiki page does not contain more information than Comment #4.
Sorry for the confusion – when I opened this bug almost two years ago, I somehow thought that most Devs would have access to the OTRS wiki.

(In reply to comment #29)

As all of the information that people have felt necessary to reproducing or
describing the problem has been given in this report and it has all been dealt
with, this bug should be closed.

Sorry, but how has it "all been dealt with"? A proper fix ("login & log local account creation only on write actions such as edits") has yet to be written.
Simply disabling the log has severe consequences for keeping out abusive usernames. I've asked for the opinion of other admins on German Wikipedia and they affirm it hinders their efforts and that there are already trolls taking advantage of the log being removed.
(Trolls create abusive usernames in small WMF wikis (low profile) and then they automatically create accounts on many popular wikis without there being any methods to trace them)

So it seems that what is needed is a private log for autocreated accounts, that can be administered from any other large wiki that has enough admins to detect them.

Given that accounts are unified by default in SUL now, any abusive username should be administrable from any wiki.
Maybe a new privilege could be created, allowing its holders to view all new SUL accounts wherever they are created, so that even small wikis could be protected by admins working on larger wikis.

The main problem of course is how to detect abusive account names: a name could be abusive in an unexpected language, and not at all in another where the account was actually created. So instead of just denying the creation of the SUL account (not necessarily abusive from the point of view of the smaller wiki), it could be denied on each wiki where admins are working. After several wikis have rejected the creation, using their own local filters, the SUL account could be deleted, and only the non-abusive account on the local wiki where it was created would remain.

It would also be interesting, if a new SUL account is detected in one major wiki, that a list of potentially abusive account names be posted for review on all other wikis where this user has also been connected, and each time he connects to a new wiki. This list could be automatically updated on each wiki, and could be accessible to specific admins (admins of the local wiki, and global admins).

And anyway, if an account is not abusive on the source wiki where it was created, and as long as it has not been rejected there, the user could still appeal the decision to deny access with the same account on the other wikis.


This also reopens something that was highly wanted, but not implemented, in SUL when it was initially released: the possibility of merging under the same SUL account, several user names that are distinct in other wikis: this would mean that the SUL database should be able to store the effective user name used on each wiki.

And given that SUL will still continue to autocreate accounts when navigating, if an account is still valid on the origin wiki, account names that are considered abusive on another wiki should not be deleted, but just administratively blocked (though still reserved) and linked to another visible account name, which could be autogenerated with a numeric pattern, possibly in another namespace like "UserId:12345" using only the existing numeric account id, with its talk page becoming "Talk UserId:12345" too (the link from the old account name would remain hidden from the general public in the list of references). Then, if the user connects to the specific wiki where he has no accepted username, he would be offered the option to assign a new acceptable name there.

Note that if SUL had implemented distinct usernames on separate wikis, we would not have needed to administratively rename so many existing accounts just to merge them (some unrelated accounts were forcibly renamed, and I think this was abusive, even if the unrelated user had not been active for a few months: it has broken the references needed for tracking licences). The only valid rule for unifying several accounts on different wikis would have been to check that they had a common email address, with verification by email, to assert that the other account being unified into an existing SUL account from its home wiki is actually accessible by the person asking for this unification (so an email confirmation should be sent both to the email address of the account on the source wiki asking for unification and to the email address currently registered on the target wiki, in case they are different, because not all people will want to receive email notifications from distinct wikis at the same registered email address).

And given that we will no longer autocreate accounts with GET requests when just visiting a foreign wiki, if a user ever starts editing a page and his account is still not created, posting would still be possible and visible in the log within the autocreated "UserId:" namespace. Not everyone would even need to have a user name; registering a user name would only be an option in the user settings of each wiki.

Another note: the association between a user id (local to each wiki) and its visible name need not be kept secret. When a user registers, he still has an immediately accepted numeric user id, and if he asks for a user name, "UserId:12345" would be an internal alias/synonym of "User:Username". Edit logs from that user could still associate all contributions with the same user id, even if he has changed his username once or several times, or if one of his user names was rejected locally (in that case, the publicly visible logs would replace the rejected/old username with the user id).
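
A minimal sketch of the aliasing rule described above, written in Python for illustration only. The function and its parameters are hypothetical; the "UserId:" and "User:" prefixes are taken from the proposal itself.

```python
from typing import Optional


def display_name(user_id: int, username: Optional[str], rejected: bool = False) -> str:
    """Return the name shown in public logs for one local account."""
    # No local name chosen, or the chosen name was rejected on this wiki:
    # fall back to the stable numeric alias.
    if username is None or rejected:
        return f"UserId:{user_id}"
    return f"User:{username}"
```

Because contributions would be keyed by the numeric id, renaming or rejecting a username in this model only changes what the log displays, not which account the edits belong to.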

Another idea to complement the previous one: if a user name is rejected, the user should still be offered an option to log on to that wiki; instead of using the username, he could just as well log in using the syntax "UserId:12345" with the same registered password. But anyway, if that user insists several times on each local wiki on regenerating a new abusive account name there, that user can still be blocked completely there (blocking the "Userid:12345" instead of just the account names locally linked to it). This action should generate an event for SUL admins and for all other wikis, where local admins could also inspect what that user has done, allowing each wiki to take its own decision; or, if too many local wikis are complaining to SUL admins, all other accounts on small local wikis could be blocked as well by SUL admins (with local admins informed in their local log).

Also, in the SUL accounts database, which tracks all wikis where a user has an account (autocreated or not, with a local account name and local user id or just the local user id), there's no need to store an account name. The SUL database just needs its own user id, which links "SUL:UserId:xxxxx" to each "wiki:User:yyyyy", where it is a stable reference that *possibly* links to the current local username.
When you connect globally on a given wiki with your local user name or user id, only that local user id is looked up in the SUL database, but the "SUL:UserId:xxxxx" could also be stored on each local wiki in its local user database (for performance reasons, to avoid this lookup).
The "SUL:UserId:xxxxx" entry in the SUL database then contains the complete list of wikis with the local (and stable) "wiki:UserId:yyyyy" for each of them (no need to store local user names in the SUL database), allowing global logon on all of them without the user necessarily having to type the username for each wiki, given that those "wiki:UserId:yyyyy" are already local aliases for the local user name on each wiki.
SUL is only needed to allow using a single password when connecting from any local wiki; it's not necessary for all wikis to share the same username. It can then be used to store global default preferences, notably the home wiki, which should be the first in the list of "wiki:userid:yyyy".
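
The id-only mapping proposed above could be modelled as follows. This is a Python sketch under assumptions: the class, its method names and the wiki identifiers are invented for the example and do not describe the real CentralAuth schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class CentralDirectory:
    """Hypothetical SUL store that keeps only numeric ids, never usernames."""
    # global id -> {wiki -> local user id}, in attachment order
    attached: Dict[int, Dict[str, int]] = field(default_factory=dict)
    # (wiki, local user id) -> global id, for lookups at login time
    reverse: Dict[Tuple[str, int], int] = field(default_factory=dict)

    def attach(self, global_id: int, wiki: str, local_id: int) -> None:
        self.attached.setdefault(global_id, {})[wiki] = local_id
        self.reverse[(wiki, local_id)] = global_id

    def global_id_for(self, wiki: str, local_id: int) -> Optional[int]:
        return self.reverse.get((wiki, local_id))

    def home_wiki(self, global_id: int) -> Optional[str]:
        # By the convention in the proposal, the first attached wiki is the home wiki.
        wikis = self.attached.get(global_id)
        return next(iter(wikis)) if wikis else None
```

Each wiki would keep its own mapping from local id to the current local username, so the same global id could appear under different names on different wikis.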

Your proposal is quite complex. Here's mine:
Put a "Login as <USERNAME>" button on Special:Userlogin (or even on the edit screen) if appropriate. That button creates a local user account (as a POST request and with a token, to be sure). Do not create an account without the user explicitly clicking this button.
It'll cost the editor two clicks to create an account; I think that's acceptable.
It should be pretty easy to implement.
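
As a rough illustration of the "POST request with a token" rule, here is a self-contained Python sketch. It is not how MediaWiki generates or checks its tokens: the HMAC scheme, the action label and the function names are assumptions made for the example.

```python
import hashlib
import hmac


def make_token(session_secret: bytes, action: str) -> str:
    """Derive a per-session token for one action (illustrative scheme only)."""
    return hmac.new(session_secret, action.encode(), hashlib.sha256).hexdigest()


def create_local_account(method: str, token: str, session_secret: bytes,
                         accounts: set, username: str) -> bool:
    """Create the local account only for an explicit, token-protected POST."""
    if method != "POST":
        return False  # plain page views (GET) never create anything
    expected = make_token(session_secret, "create-local-account")
    if not hmac.compare_digest(token, expected):
        return False  # missing or forged token
    accounts.add(username)
    return True
```

The point of the token is that a third-party page cannot trigger the account creation on the user's behalf, since it cannot know the value tied to the user's session.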

(In reply to comment #33)

Given that accounts are unified by default in SUL now, any abusive username
should be administrable from any wiki.
May be a new privilege could be created, allowing to view all new SUL accounts
wherever they are created, so that even small wikis could be protected by
admins working on larger wikis.

The main problem of course is how to detect abusive account names : a name
could be abusive in an unexpected language, and not at all in another where the
account was effectively created.

There's already #cvn-unifications on freenode. The problem is (and it seems we agree on that one) that user names are usually targeted at a specific wiki and written in that wiki's language. It's difficult to tell whether a username is abusive if it's in a language you don't speak. That's why we need local admins to search for those accounts.
Normally admins look out for abusive user names, and if they're really bad, they report them to stewards to get them locked & hidden globally. Which is actually a pretty efficient system and has proven to work (more or less).

It'll cost the editor two clicks to create an account, I think that's
acceptable.
It should be pretty easy to implement.

The same would be true with my proposal. Only usernames, not user accounts, really need to be rejected locally, even if there may be a system for global admins and selected local admins to monitor those accounts where they occur.

Yes there's an IRC channel to allow all those admins to cooperate, possibly taking common decisions, to reject the account name everywhere it was already used.
But there's no real need to block the account completely if, by accident only, it is found abusive only on a specific wiki.

My proposal can be implemented separately anyway and will NOT need more clicks for most users (the second click, to ask for local account name creation when editing or posting for the first time on that wiki, would still be used with your solution), but the user would not even have to assign a user name on that specific wiki: he could just use the default "userid:xxxx" autogenerated there, and post without being publicly logged by his private (possibly temporary) IP. This would also allow even those users to have their identity kept for licensing purposes (the local account would still be created even if it's just an anonymous "Userid:xxxx" without a local preferred user name).

My proposal would also fulfil the demand, notably by users in Asia or speaking a language not written in Latin script, to use a preferred Latin transliteration of their name in their home wiki (for example, many Chinese, Japanese or Arabic writers like to use their native name on their home wiki in their home language, but like to have their name correctly read, deciphered and understood on other wikis, notably on Commons and large Latin-script wikis). This is even part of their official national identity, and present on their passport (if they want to use these aliases in foreign countries). So just allow distinct usernames on distinct wikis; stability and privacy are kept by the anonymous numeric user id specific to each local wiki.

(In reply to comment #33)

This also reopens something that was highly wanted, but not implemented, in SUL
when it was initially released: the possibility of merging under the same SUL
account, several user names that are distinct in other wikis: this would mean
that the SUL database should be able to store the effective user name used on
each wiki.

(In reply to comment #37)

My proposal would also fulfil the demand, notably by users in Asia or speaking
a non-Latin-written language, to use a preferred Latin transliteration of their
name in their home wiki (for example, many Chinese/Japanese/Arabic writers like
to use their native name in their home wiki speaking their home language, but
like to have their name correctly read/deciphered/understood in other wikis
(notably on Commons and large Latin-written wikis). This is even part of their
official national identity, and present on their passport (if they want to use
these aliases for foreign countries): just allow distinct usernames on distinct
wikis; stability and privacy is kept with the anonymous numeric user id
specific to each local wiki.

Multiple usernames merged in a single SUL account was never on the radar, but the database allows that, so such support could be added painlessly if there's consensus. That should be split into a different bug, though.

(In reply to comment #36)

Your proposal is quite complex. Here's mine:
Put a "Login as <USERNAME>" Button on Special:Userlogin (or even on the edit
screen) if appropriate. That button creates a local user account (as POST
request and with a token, to be sure). Do not create an account without the
user explicitly clicking on this button.
It'll cost the editor two clicks to create an account, I think that's
acceptable.
It should be pretty easy to implement.

That's not a problem at all. The issue is that we now have a large user base that gets automatically logged in (more or less) everywhere on Wikipedia. Making them click to log in may be a hard sell, although I'd like to make the change.
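For illustration, the quoted proposal (create the local account only on an explicit, token-protected POST) could look roughly like the sketch below. This is not CentralAuth code: `handle_request`, `make_token`, `SECRET_KEY` and the in-memory `local_accounts` set are all hypothetical names made up for the example.

```python
import hashlib
import hmac
import secrets

# Hypothetical stand-ins; none of these names come from CentralAuth.
SECRET_KEY = secrets.token_bytes(32)  # per-deployment secret (assumed)
local_accounts = set()                # stand-in for the local user table


def make_token(session_id: str) -> str:
    """Derive the token embedded in the "Login as <USERNAME>" form."""
    return hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()


def handle_request(method: str, session_id: str, global_user: str, token: str = "") -> str:
    """Create the local account only on an explicit POST carrying a valid token."""
    if global_user in local_accounts:
        return "logged-in"
    if method != "POST":
        # A plain page view (GET) never creates anything; it only offers the button.
        return "show-login-button"
    if not hmac.compare_digest(token, make_token(session_id)):
        return "rejected"
    local_accounts.add(global_user)
    return "created"
```

The point of the sketch is only that following an interwiki link (a GET) leaves no trace in the local user table; the account appears after the user presses the button.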

(In reply to comment #31)

(In reply to comment #29)

As all of the information that people have felt necessary to reproducing or
describing the problem has been given in this report and it has all been
dealt with, this bug should be closed.

Sorry, but how has it "all been dealt with"? A proper fix ("login & log local
account creation only on write actions such as edits")

That is one of many fixes. Consider the title of the bug: "Auto account creation creates privacy vulnerability". The description of this privacy vulnerability then says that it comes from logging the auto account creation.

The solution chosen kept auto account creation but made it so it wasn't logged.

What you describe is a new problem:

Simply disabling the log has severe consequences for keeping out abusive
usernames. I've asked for the opinion of other admins on German Wikipedia and
they affirm it hinders their efforts and that there are already trolls taking
advantage of the log being removed.

Please use Bug #28227 to discuss this new problem. If you think the only proper fix is "local account creation should not happen on HTTP GETs", then that, too, is a new bug.

You may not like the solution that was provided for this bug, but re-opening it is not likely to change CentralAuth to your liking. This bug was about a privacy concern, and that has been addressed.

(In reply to comment #40)

You may not like the solution that was provided for this bug, but re-opening it
is not likely to change CentralAuth to your liking. This bug was about a
privacy concern, and that has been addressed.

The solution was bad. It should be reverted, and this bug reopened pending a *real* solution.

content hidden as private in Bugzilla

This hasn't solved the problem described. It has been hidden, poorly.

I have randomly picked a new autocreated user, Nane2011, to demonstrate.

Before, the privacy vuln required looking at the user creation log, which is now blank:

https://secure.wikimedia.org/wikipedia/en/w/index.php?title=Special:Log&type=newusers&user=Nane2011

Now, the exact same information is visible here:

https://secure.wikimedia.org/wikipedia/en/w/index.php?title=Special%3AListUsers&username=Nane2011&group=&limit=1

The information is also available at:

http://toolserver.org/~vvv/sulutil.php?user=Nane2011

And it is distributed to the toolserver.

Moreover, hiding the new user creation entry is _not_ the solution. That just hides it, and then the wiki database contains information which it needs to keep hidden in order to protect the privacy of its users. Any number of oversights/screw ups with data management can result in accidental release of this information. Avoid creating data that you want to remain hidden.

And if you are going to continue with this 'hiding' approach, please revert the current 'fix' until it has been properly built and implemented, otherwise users have the false expectation that this information is hidden.

I've opened Bug #28366 and Bug #28367 to address the issues you've raised.

If you want the current solution completely reverted, I suggest you open a new bug to that effect. I recommend you look at Bug #542 (fixed, closed) and later bugs like Bug #27527, Bug #28182, etc for an example of how to log issues of problems that fixing a bug creates.

Even if you disagree with the fix and think it is the wrong thing to do, reopening this bug is not appropriate. The fix is done. It has been applied and deployed. It is appropriate to *file a new bug* if you think the old behavior should be restored. Re-opening an old bug is not the correct way to address the current situation.

Avoid creating data that you want to remain hidden.

If you think that is what is needed, I suggest opening a new bug to that effect.

(In reply to comment #44)

I've opened Bug #28366 and Bug #28367 to address the issues you've raised.

I've resolved both of those bugs as invalid.

If you want the current solution completely reverted, I suggest you open a new
bug to that effect. I recommend you look at Bug #542 (fixed, closed) and later
bugs like Bug #27527, Bug #28182, etc for an example of how to log issues of
problems that fixing a bug creates.

No, this is the bug tracking this problem. If this bug hasn't been adequately addressed, it should remain open.

Even if you disagree with the fix and think it is the wrong thing to do,
reopening this bug is not appropriate. The fix is done. It has been applied
and deployed. It is appropriate to *file a new bug* if you think the old
behavior should be restored. Re-opening an old bug is not the correct way to
address the current situation.

Your assumption that "the fix is done" is flawed. The fix isn't done because it didn't resolve this bug adequately. This bug needs to remain open until this issue is properly addressed.

Avoid creating data that you want to remain hidden.

If you think that is what is needed, I suggest opening a new bug to that
effect.

This bug is about limiting what data is propagated and in what circumstances. There's no need for a separate bug.

Re-opening this bug.

(after edit-conflict, post is directed at comment 44)
Facts:
a) This bug report is about a privacy vulnerability
b) The "fix" fails to solve a)
c) The "fix" has annoying consequences

I agree that c) might not be sufficient to reopen this bug, but b) most certainly is.
Note that Comment 43 brought a completely new aspect into the discussion – it's not anymore about inconvenient side-effects, it's that the "fix" fails to do what it claims (i.e. removing the privacy vulnerability).

Reverted the change in r85128.

The Wikimedia implementation/use of the MediaWiki software clearly VIOLATES the Wikimedia Foundation privacy policy, which says:


Reading projects: No more information on users and other visitors reading pages is collected than is typically collected in server logs by web sites. Aside from the above raw log data collected for general purposes, page visits do not expose a visitor's identity publicly. Sampled raw log data may include the IP address of any user, but it is not reproduced publicly.


Contrary to the policy, READING a new wiki is PUBLICLY exposed (on those wikis where a bot reads new user accounts and creates welcome pages) when a user has previously signed in to another wikimedia wiki.

I do not consider the option to disable single sign-on a practical solution. A single sign-on to Commons, en.wikipedia, de.wikipedia, perhaps mediawiki.org may be desired, but recording chance read-only visits at any number of other wikis would still be surprising.

Solutions:
a) fix the software so that user creation occurs only after an edit
b) change the privacy policy to read:

Reading projects: No more information on users and other visitors reading pages is collected than is typically collected in server logs by web sites. However, when using single-sign on, any visit to a wikimedia site you had not previously visited may be recorded publicly. Aside from the above raw log data collected for general purposes, page visits do not expose a visitor's identity publicly. Sampled raw log data may include the IP address of any user, but it is not reproduced publicly.

c) prohibit bots that create welcome pages after user creation without corresponding edits.

Gregor

Creating an account is not a read-only action.

I understand that visiting other wikis creates a potentially annoying effect of being tracked, but if arguments raised in this bug to remove logging are valid, they also apply to a purely local user registration. If somebody registers on a standalone wiki solely for the purpose to set preferences, why should their *first* registration be logged at all, too?

Consequently, [[Special:ListUsers]] would become something like [[Special:ListEditors]] and [[Special:Log/newusers]] should be removed. I don't think that this is intended.

If you are really interested in finding out if a known global user did ever visit some particular wiki, you can go to [[User:username]] and you will be told if the local user exists or not. Trying to fix all possible information disclosure channels about created-but-otherwise-inactive account is probably not feasible at all.

The only concern that remains is the *timing* issue, i.e. that somebody would be able to correlate an unsuspecting logged-in wiki user using some third-party network service. It is the equivalent of seeing over someone's shoulder that they register on a MediaWiki site and later going back to that MediaWiki installation to check the timestamps in the new user log. SUL automatic creation only makes this much easier (one click), and we have many, many wikis to try the trick on.

If this is indeed a problem, the only reasonable solution to this is a confirmation box (similar to what we have when trying to purge the webpage anonymously) to ask for confirmation to add a local instance of a SUL account.
But that would be a usability nightmare for many I guess, with a confusing, Checkpoint-Charlie style message that will be very hard to get right and even harder to translate - "You are leaving the friendlier part of the Wikimedia cluster. Clicking on this button will create a copy of your account on this wiki instance. This fact will be logged publicly with your username. Do you agree?"
or "Click button number two to continue browsing using your IP address, which, by the way, you will reveal by editing something here."

To me it looks worse than those dreaded "you are leaving this website and entering unfriendly Internet" messages that some sites use for external links.

I confirm that this is still an issue, after I unexpectedly started receiving "welcome" emails created by bots scanning the list of "new users", just because I visited one page on that wiki, written in a language that I absolutely cannot read (it was the Armenian Wikipedia), by following an interwiki link only to see whether the target page existed.

I was NOT given any chance to set my preferences first, my account was created *without* any notice, and their users (and bots) were allowed to send me any number of emails (directly or by writing to my user page), without my prior authorization. This was very bad, because I could not even decipher the contents of this email (but even if I could, the unlimited automatic authorization was NEVER explicitly granted).

For me, any user account that is created automatically by a simple SUL-connected account should not be considered a "new" registered user; it should have a different status.

The status should become a true "new registered user" only when the user will either :

  • (1) visits his own "User Preferences" page (and confirms the registration by STORING the changes after first defining his preferred language, and then finding and setting the email options), or
  • (2) starts creating and editing a page on the wiki (including his own user page, where he will have the possibility to explain on which wiki his main user page is, and which languages he can at least read) and RECORDS his changes in that page.

A mere visit of any new wiki without storing any info should NEVER be logged anywhere in any publicly visible area.

So in summary, what I want is a new user status: "autocreated user account", which will be turned into "registered user account" only when the user actually STORES something on that wiki or in his preferences.

And as long as a user account status is "autocreated":

  • *NOBODY* can use the function to send emails to that user using the wiki as a gateway (not even local admins, unless they ask a global administrator, who will decide what to do with unused "autocreated" accounts that someone else would like to use for himself). This should behave as if the user did not exist for now (even though nobody else can create a new account with that same user name).
  • if someone else writes to the user's talk page, *NOTHING* posted will *EVER* be notified to the user by ANY emails coming from that wiki (even if the user sees that there's something in his user page, and visits it, without answering or modifying it).
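To make the proposed status concrete, here is a minimal sketch of the two states and the single transition between them. The class and method names (`Status`, `LocalAccount`, `save_something`, `email_allowed`) are invented for illustration and do not correspond to anything in MediaWiki or CentralAuth.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    AUTOCREATED = "autocreated"  # created silently by SUL, nothing stored yet
    REGISTERED = "registered"    # the user has saved an edit or preferences


@dataclass
class LocalAccount:
    """Hypothetical local account record; not a MediaWiki class."""
    name: str
    status: Status = Status.AUTOCREATED

    def save_something(self) -> bool:
        """Called when the user saves an edit or preferences.

        Returns True when this save promotes the account, i.e. the moment
        at which the public creation log entry would be written.
        """
        if self.status is Status.AUTOCREATED:
            self.status = Status.REGISTERED
            return True
        return False

    def email_allowed(self) -> bool:
        """No email may be sent to an account that is still autocreated."""
        return self.status is Status.REGISTERED
```

In a real implementation the email check would of course still consult the user's saved preferences after promotion; the sketch only shows that an autocreated account is never eligible.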

Simple, effective, and does not require any complex mechanisms with the SUL account. I just want that SUL only creates account automatically with this new status. No confirmation is needed, users won't be alarmed.

This basic security can be implemented locally, it will not hurt server's performances on that local wiki, it will just deny some attempts made by others to use unconfirmed functions (notably sending any kind of emails).

This basic security is especially critical for small wikis in languages spoken by a very small community, because these wikis are not sufficiently administered, and some commercial spammers are also attempting to use this lack of administration to contact a lot of known Wikipedians registered and active on other wikis: they don't need to write to them on these active wikis, they just have to find a minority wiki to attempt to write to their user page there, or to use the "email to user" function of that wiki.

Enough said, I think everybody understands what this is about.
I would recommend re-opening this bug if there is a major change to the way SUL works or somebody actually proposes some code to fix this.

Browsing a website should not create an account. Automatically creating accounts should only be done, and logged, with the user's permission.

(In reply to comment #54)

Automatically creating
accounts should only be done, and logged, with the user's permission.

I think that part is sensible and can be relatively easily implemented. If a user has a SUL account, there is an existing condition in the code where the wiki presently creates a user account. Insert some dialogue here, asking whether to create an account on this wiki. If you say no, you cannot edit or set preferences.

I would not mind being asked this every time. If I visit a wiki regularly, I either can bite the bullet of inconvenience (I would have little motivation to do so) or I just say OK. The really annoying thing is, as described, to be registered for chance visits and even to receive notifications by mail in languages one does not even read.

What I propose is simpler: you don't need the dialog. Because users will have to open their user preferences to register the permissions.

Eventually, you would get a dialog, asking if you want to really edit a page before setting your preferences. If you say no, the edits will still be logged like anonymous users, and not associated to your account whose user preferences have not been set.

Note that when being created as "autocreated user account", all permissions (notably emails) are set to NO for almost everyone (except a few selected global admins for handling very specific cases related to security and abuses).

That's why you don't need the permission dialog when just visiting a new wiki by following an interwiki link (possibly even not the one that was expected, for example by clicking the wrong language). Those permissions will only pass to YES after the user has opened his preferences and SAVED them to CONFIRM those permissions, effectively changing the status only at that time to a really registered (and logged) user.

However there are good reasons for automatically creating accounts: this just serves to reserve the user name, for later use.

This bug is becoming more and more useless.
As we don't even agree that what is outlined in comment 0 and so on is a severe vulnerability, not to speak of solutions, it's perhaps better to agree on what could reasonably be done and then decide to do it or not.

(In reply to comment #50)

The status should become a true "new registered user" only when the user will
either :

  • (1) visit his own "User Preferences" page (and confirmed the registration by
STORING the changes after first defining his preferred language, and then found
and set the email options), [...]

Please, let's keep things "simple".
The autocreation is triggered by autologin, so to avoid the former it would be enough to disable the latter. This is probably technically inaccurate, I hope you can forgive me.
I changed the summary to: «Don't autologin if local account doesn't exist (don't autocreate if user doesn't explicitly login)». (Bug 16864 refers to a more specific situation.)

If you've never visited a wiki before, you'd need to click the login button to get your local account autocreated and login. This is what already happens on wikis where autologin doesn't work (bug 14407) and it has some problems:

  1. you need to remember whether you've visited the wiki before to understand what's going on/remember to login,
  2. when you get to the login page, you have to know that you can just login and don't need to register;

so a solution for both or at least (2) should be found.

This seems to be what the bug originally requested:

(In reply to comment #0)

Therefore I propose to disable automatic account creation on GET-requests and
instead use only POST-requests to create accounts.

Previously (for an extended period in 1.18 when it was live on the cluster) we had a check box on the login page so you could choose not to be globally logged in (thus user accounts didn't get extended to other projects when you visited them).

To my knowledge (from what I saw), only one person complained when that was removed (and they aren't on this bug, AFAIK).

Based on the ramblings in this bug, this option would seem to have fixed all your issues, but since people didn't speak up on its removal…

That feature was requested, and later coded by me. Yes, I still consider it valuable (despite not being 'pretty', as already discussed elsewhere).
Sure, few people complained about its removal, but not many people argued for it, either, so that's not a strong reason.

Going back to autocreation, it'd be really easy to stop account autocreation, and instead require a manual click on login. I think there's even a configuration variable for that.
But then, there would be lots of complaints about edits done as IP because "autocreation isn't working" (which would be worse than this bug). We can reduce it by disabling only autocreation, while keeping global login to sites where you already have an account.
Nonetheless, that would still reveal a number of users IPs. And it would also damage, for instance, the experience of switching to Wikimedia Commons from your home wiki.

Philippe Verdy, can you design an appropriate interface for that?

(In reply to comment #59)

Going back to autocreation, it'd be really easy to stop account autocreation,
and instead require a manual click on login. I think there's even a
configuration variable for that.

Would this be a reasonable use-case for a global user preference? Something like "[ ] auto-create accounts for me"?

(In reply to comment #59)

"autocreation isn't working" (which would be worse than this bug). We can
reduce it by disabling only autocreation, while keeping global login to sites
where you already have an account.

I think this is what is needed. The previously discussed option to remove global login (single sign-on) is not valid. Most people who have accounts on multiple wikis likely do want to remain signed in on all of them, especially on shared repositories.

Nonetheless, that would still reveal a number of users IPs. And it would also
damage, for instance, the experience of switching to Wikimedia Commons from
your home wiki.

Could Commons be configured to be an exception? I can think of two modes: (a) the WMF privacy statement explains that Commons is always included, and the software keeps the present behavior for Commons; (b) in the case of Commons, autocreation is offered but needs confirmation. I prefer (a).

In the case of other wikis I don't care much whether option (b) is chosen or option (c): require a manual login the first time to create the account.

Please note that part of this bug is legal: The privacy statement of WMF contradicts the behavior of the software, the software is violating the privacy rules explained to users and therefore a threat of some troll taking legal action against WMF exists.

(In reply to comment #57)

This bug is becoming more and more useless.
As we don't even agree that what outlined in comment 0 and so on is a severe
vulnerability, not to speak of solutions, it's perhaps better to agree on what
could reasonably be done and then decide to do it or not.

(In reply to comment #50)

The status should become a true "new registered user" only when the user will
either :

  • (1) visit his own "User Preferences" page (and confirmed the registration by
STORING the changes after first defining his preferred language, and then found
and set the email options), [...]

Please, let's keep things "simple".

IT IS REALLY SIMPLE (for users). They don't have to manually login, they are logged in invisibly, their account is created automatically, BUT this is reported nowhere (the creation is NOT logged).

The user can immediately create his page or start editing or change his preferences if he wishes. BUT the full creation (which will be logged) is delayed until the user really SAVES something on the wiki. This requires NO additional action by the user.

I maintain that my solution is the simplest, and offers all the benefits. Except that the "autocreated user account" status blocks all interactions using private emails, or every other actions by others that the user should have been given a chance to define in his preferences.

NO OTHER action or option is needed in the User Preferences dialog. The user just has to save his settings to change its status automatically (almost invisibly) from "autocreated" to "registered".

If the user wants to edit a page (including his own page) while he is logged on but still with the "autocreated" status, only at this time there could be a dialog informing that default options will be set in his account, so the user should be given a chance to define his OWN settings, not those settings that are preselected by default (notably not the preselected email options).

In other words, nothing changes for the user just visiting a new wiki. His privacy is completely protected, the "autocreated" user account will NOT even appear in the list of "registered" users.

All the advantages of SUL, without the bad !!!

(In reply to comment #59)

That feature was requested, and later coded by me. Yes, I still consider it
valuable (despite not being 'pretty', as already discussed elsewhere).
Sure, few people complained about its removal, but not many people argued for
it, either, so that's not a strong reason.

Going back to autocreation, it'd be really easy to stop account autocreation,
and instead require a manual click on login. I think there's even a
configuration variable for that.
But then, there would be lots of complaints about edits done as IP because
"autocreation isn't working" (which would be worse than this bug). We can
reduce it by disabling only autocreation, while keeping global login to sites
where you already have an account.
Nonetheless, that would still reveal a number of users IPs. And it would also
damage, for instance, the experience of switching to Wikimedia Commons from
your home wiki.

Philippe Verdy, can you design an appropriate interface for that?

No need for a new interface. SUL continues to create accounts like now. No new dialog is needed (except if a user with "autocreated" account status wants to edit a page, in which case it would be helpful to have a dialog asking the user to first define and SAVE his preferences). No new button.

Eventually, the icon at the top of page that is displayed beside the user name would be different (such as a man with black glasses and a hat, meaning that he is invisible to everyone, and that he is visiting the wiki privately).

Note: No need to hide the user in server logs (admins with "checkuser" can still see all actions performed by that user, including simple visits).

The user will NOT be logged publicly. Because saving any page or saving the preferences will automatically convert the status to "registered", there's no risk for that user to appear accidentally in any history of any wiki page (the history just records SAVES made by a "registered" user, there will not be any user with the "autocreated" status).

If the user wants to save a page and remain "anonymous", he can logoff first, and his page edits will be saved under his IP, like now.

I repeat: nothing is changed for users, except that nobody will ever attempt to send him any email, because this permission to send mail to a "autocreated" account is granted to NOBODY.

The user keeps the FULL control of WHEN he will (or will not) start editing something and being logged under his account name for his edits.

As long as the user is visiting, and just visiting, he is invisible to anyone (except to "checkuser admins" that can see his actions in the server log: "checkuser admins" must still not attempt to write any email to that user, they would however need to see these logs in order to identify a user that behaves badly on another wiki, or that abuses the service by too much automated traffic of viewing requests; this would likely include "Bot" users visiting other websites with their autocreated account; this case would be exceptional: bots don't normally visit wikis by accident, but they may do that only to see if a page exists there, and "checkuser" admins can see these invisible users and associate them to a user account registered on another wiki).

Nothing is changed in SUL itself: when a wiki sees that the visiting user is not logged in, it performs a SUL check (looking at its global SUL cookie), and then retrieves the user name from SUL. It autocreates the account like now. But the only thing changed is that the wiki creates the account with a different *initial* status (all private options are disabled, including all emails). This initial status is not definitive, and in fact, it is not even needed to record any default value in that account (those defaults will be set when the user first visits his preferences page, but will become effective only if the user SAVES these default settings).

There are a couple options, each pretty bad:

  • don't do anything
  • disable CentralAuth autocreation completely
    • very poor UX
  • disable CentralAuth autocreation on normal pages, but do it when the user clicks on login
    • somewhat poor UX
    • will break cross-wiki things that rely on autocreation (e.g. OAuth authorization)
  • try to fake that the user is logged in but only autocreate when needed
    • seems very hard to determine when it is needed (POST requests?); liable to cause hard-to-debug bugs when we fail to autocreate in time
  • disable/delay autocreation logs
    • does not seem useful. The attacker could just get a list of all users before/after the attack and compare.
  • autocreate users everywhere as soon as they are registered
    • very noisy, big DB load, makes user list essentially useless for small projects (although maybe it already is?)
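As an aside on the "disable/delay autocreation logs" option: the before/after comparison mentioned there is just a set difference over two snapshots of the public user list. A toy sketch, with made-up user names and a hypothetical helper `newly_created`:

```python
def newly_created(before: set[str], after: set[str]) -> set[str]:
    """Return the accounts present in the later snapshot but not the earlier one."""
    return after - before


# Two snapshots of a (made-up) public user list, taken around the time
# the target was lured into visiting the wiki.
snapshot_before = {"Alice", "Bob"}
snapshot_after = {"Alice", "Bob", "Victim"}

print(newly_created(snapshot_before, snapshot_after))  # prints {'Victim'}
```

So as long as the user list itself stays public, removing or delaying the log entry does not hide the creation.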

You forgot the last option I proposed: autocreate an account but with a different initial status with no other preferences than the default preferences for the wiki. With this status, the user remains invisible, and will not receive emails or personal notifications until he starts editing something or sets his own preferences.
If the user starts editing, his edits will be regularly logged with his personal user account, but his preferences still disable emails.

But he can be contacted on his talk page (making any edit will change the user status to a normal user with a personal account and all its default preferences; direct contact by emails should still not be enabled; just after saving his first edit, the user account creation will be logged, allowing bots to detect the user creation and send welcomes to his talk page).

A user with such initial (autocreated) status should not be listed in lists of users of the wiki. Only admins should see him. Because that user has not performed any action on the wiki, he also does not need to be logged publicly with his IP address. Privacy is preserved.

And there's no new UI. The user sees that he's logged in, with the top bar displaying his own name and going to his own preferences page. And there's no "Login" button, but the user can still logout (from all wikis with SUL).

Ideally, on the Users preferences page or when the user starts editing any page, the edit screen should say that his account name will be made visible to others.

Optionally, on the preferences page (but not when saving edits) a checkbox could be added to inform the user that his autocreated account will become visible (a user could set preferences without affecting any other page on the wiki, such as selecting another UI language or selecting gadgets, or display preferences).

In summary, all that is needed is a new user status: "autocreated", that will be automatically changed to a standard user status when the user starts editing, but that will not change as long as he is only visiting a wiki or setting preferences that don't affect any page (not even his user page or talk page, which may already exist if it was created/edited by someone else).

What is the status of this? It's frustrating that whenever I click a link (mostly accidentally, from Phab) I am creating useless random accounts everywhere.

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 12:24 PM
Aklapper removed subscribers: Tbayer, wikibugs-l-list.

@Xaosflux is Community Wishlist Survey 2023/Utility to attach account to all wikis motivated by this problem? I'd understand not wanting to say that in public, but as it stands now that proposal makes little sense.

@Tgr The wishlist item is mostly a way for people to purposefully attach to all of the wikis, with one noted benefit that it would start the auto confirmation clock for them, most users wouldn't do that - only those that actually have a use for massive numbers of projects. If this request got done, it could also be useful for such users - but wasn't the primary motivator.

If this request were actually done as summarized above, I think it would be a little problematic. The problem with relying only on a POST is that if you go to a SUL wiki that you haven't visited before, it would be risky to initiate a POST and hope that it worked, so that your contribution gets logged to your account instead of your IP. If it fails, you end up exposing your IP, and you won't be able to tell whether it will fail until it is too late.

A slightly different option would be 'don't autocreate' unless someone clicks login on a new wiki, which could safely generate the call. Certain exceptions should likely be made for that though - notably commonswiki and metawiki. (Some sort of autocreate-on-get site list perhaps?)
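As a rough sketch of that option, the decision could look like the function below. The function name, its parameters and the allowlist are hypothetical; the wiki IDs in the list are just the two exceptions mentioned above.

```python
# Hypothetical allowlist of wikis where autocreation on a plain GET stays enabled.
AUTOCREATE_ON_GET = frozenset({"commonswiki", "metawiki"})


def should_autocreate(wiki: str, has_local_account: bool, clicked_login: bool) -> bool:
    """Decide whether to create a local account for a globally logged-in user."""
    if has_local_account:
        return False  # nothing to create
    # Create on an explicit login click, or on any visit to an allowlisted wiki.
    return clicked_login or wiki in AUTOCREATE_ON_GET


print(should_autocreate("enwikibooks", False, clicked_login=False))  # False
print(should_autocreate("enwikibooks", False, clicked_login=True))   # True
print(should_autocreate("commonswiki", False, clicked_login=False))  # True
```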

A slightly different option would be 'don't autocreate' unless someone clicks login on a new wiki, which could safely generate the call. Certain exceptions should likely be made for that though - notably commonswiki and metawiki. (Some sort of autocreate-on-get site list perhaps?)

Yeah that was

  • disable CentralAuth autocreation on normal pages, but do it when the user clicks on login
    • somewhat poor UX
    • will break cross-wiki things that rely on autocreation (e.g. OAuth authorization)

Another option would be to introduce a "hidden" flag and hide the user's name until their first action. (The infrastructure is largely already there via User::isHidden() although it would make lists etc. very spammy.) This is not super useful for logs since it gets revealed once the user does something on the wiki. But it would take care of pretty much everything else - ListUsers, user existence checks etc. We already treat it as a low-key security issue when the existence of a hidden user can be established. And the logs could just be hidden from non-privileged users, or we could not log the actual account creation and make a fake autocreation log entry the first time the user does something public.
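A minimal model of the "hidden until first action" idea, for illustration only. It does not use User::isHidden() or any real MediaWiki API; the class, the flag and both helper functions are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class LocalUser:
    name: str
    has_public_action: bool = False  # hypothetical flag, set on the first edit or log entry

    @property
    def hidden(self) -> bool:
        # The account stays hidden until the user does something public.
        return not self.has_public_action


def list_users(users, viewer_is_privileged: bool):
    """Names a viewer may see: everything for privileged viewers, else only non-hidden users."""
    return [u.name for u in users if viewer_is_privileged or not u.hidden]


def user_exists(users, name: str, viewer_is_privileged: bool) -> bool:
    """Existence check that must not reveal hidden accounts to ordinary viewers."""
    return name in list_users(users, viewer_is_privileged)


users = [LocalUser("Visitor"), LocalUser("Editor", has_public_action=True)]
print(list_users(users, viewer_is_privileged=False))              # ['Editor']
print(list_users(users, viewer_is_privileged=True))               # ['Visitor', 'Editor']
print(user_exists(users, "Visitor", viewer_is_privileged=False))  # False
```

Routing the existence check through the same filter as the list keeps the two from disagreeing, which is the kind of leak the comment describes as a low-key security issue.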

There are also some very old notes from @brion at https://www.mediawiki.org/wiki/Extension:CentralAuth/Shadow_users although I think that would be a fairly invasive change, even now that we are done with actor migration.

I only support option 4: try to fake that the user is logged in, but only autocreate when needed. In detail, I describe the "passive account" approach as follows:

Phase 1: CentralAuth-based user lookup: e.g. there is no local account for https://www.wikifunctions.org/wiki/Special:Contributions/Jimbo_Wales. So we look up the username in CentralAuth and treat it as existing if it can be found there (and, for normal users, is not hidden or suppressed). This will make global user pages work for users without a local account.

Phase 2: T354487: Make CentralAuth auto-creation non-blocking, i.e. not preventable by local blocks or AbuseFilters.

Phase 3: Introduce a CentralAuth-only session, treat users with such a session (without requiring local accounts) as logged in and able to generate (edit) tokens, and auto-create an account only if it is needed (such as for editing).

Phase 4 (optional): Since accounts are no longer autocreated on visit, we should use the CentralAuth registration date instead of the local account registration date to determine autoconfirmed status. This will make most users autoconfirmed immediately on auto-creation, so we may change $wgAutoConfirmCount to 1 globally (which requires a global RFC).
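Phases 1, 3 and 4 can be sketched with an in-memory model. Everything here is an assumption for illustration: the class and method names are hypothetical, the dictionary stands in for CentralAuth, and the 4-day age threshold is only an example value.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class GlobalAccount:
    name: str
    registered: datetime
    hidden: bool = False


@dataclass
class Wiki:
    global_accounts: dict  # name -> GlobalAccount, a stand-in for CentralAuth
    local_accounts: set = field(default_factory=set)

    def user_exists(self, name: str) -> bool:
        """Phase 1: fall back to the global account when no local one exists."""
        if name in self.local_accounts:
            return True
        account = self.global_accounts.get(name)
        return account is not None and not account.hidden

    def visit(self, name: str) -> bool:
        """Phase 3: a global-only session counts as logged in and creates nothing."""
        return name in self.global_accounts

    def edit(self, name: str) -> None:
        """Phase 3: create the local account lazily, right before the first edit."""
        if name not in self.global_accounts:
            raise PermissionError("no global account")
        self.local_accounts.add(name)

    def is_autoconfirmed(self, name: str, now: datetime,
                         min_age: timedelta = timedelta(days=4)) -> bool:
        """Phase 4: measure account age from the global registration date."""
        account = self.global_accounts.get(name)
        return account is not None and now - account.registered >= min_age


wiki = Wiki({"Alice": GlobalAccount("Alice", datetime(2020, 1, 1))})
print(wiki.visit("Alice"), "Alice" in wiki.local_accounts)  # True False
wiki.edit("Alice")
print("Alice" in wiki.local_accounts)                       # True
print(wiki.is_autoconfirmed("Alice", datetime(2020, 1, 10)))  # True
```

The sketch leaves out Phase 2 and everything about permissions, blocks and preferences, which is where most of the real work would be.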

There are two further proposed steps. They are only my crazy ideas and not directly related to the goal of this task, but they would be the second step in helping small wikis reduce their database footprint (cf. T352823: Split user table to multiple tables as the first step):

Phase 5: Allow MediaWiki accounts to be deleted: T34815#9437885. Note that if the user did nothing on the local wiki, this makes no difference to the user, since local accounts can be automatically recreated when needed.

Phase 6: Run a maintenance script to find all accounts without local activity and delete them. This will also make Special:CentralAuth significantly shorter for users of existing wikis.

How do you propose to know "if the user did nothing on the local wiki"?

An auto-created account is empty on creation, in the sense that it has zero edits, no log entries (other than the auto-creation itself), no explicit user group, no block history, etc. Such accounts can safely be removed, since a new one will be created automatically when needed.

@Bugreporter an account can be in use without generating any of those things via private actions such as configuring preferences and maintaining watchlists.

@Bugreporter the key part of your proposal is

Phase 3: Introduce a CentralAuth-only session, treat users with such a session (without requiring local accounts) as logged in and able to generate (edit) tokens, and auto-create an account only if it is needed (such as for editing).

which is a pretty invasive change. How do you apply global permissions? Global blocks? Global preferences? How do you internally signal that for the purposes of local preferences, the user is in user? How do you handle notifications (it's hard to trigger notifications without doing a logged action, but not guaranteed to be impossible)?

Technically none of these things depend on the existence of a local user account, but in practice most of the code would have to be rewritten.

So this is why my proposal is called "passive account" - from the user's perspective, it behaves as if we had auto-created a local account (the user is considered logged in, global groups apply, etc.). When a user makes an edit or creates a log record (for now, this also includes watching a page or changing a preference, as long as these record a local user ID), we automatically create a local account, but merely visiting a wiki does not create one.

@Bugreporter an account can be in use without generating any of those things via private actions such as configuring preferences and maintaining watchlists.

As I noted in my previous comment, these will leave some discoverable record in the database. Accounts with no edits or logs that have interacted with the site in some other way can be deleted (with the risk/consequence of erasing the watchlist, preferences, blocks and user groups), but should not be auto-deleted in Phase 6.
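The distinction drawn here could be expressed as two separate checks. This is a hypothetical sketch: the field names are made up for illustration and do not correspond to actual MediaWiki tables or columns.

```python
from dataclasses import dataclass


@dataclass
class LocalAccountState:
    edits: int = 0
    log_entries: int = 0          # excluding the auto-creation entry itself
    explicit_groups: int = 0
    blocks: int = 0
    watchlist_items: int = 0
    changed_preferences: int = 0


def has_public_activity(state: LocalAccountState) -> bool:
    return state.edits > 0 or state.log_entries > 0


def has_private_state(state: LocalAccountState) -> bool:
    """Records that would be lost on deletion even though nothing public exists."""
    return any((state.explicit_groups, state.blocks,
                state.watchlist_items, state.changed_preferences))


def safe_to_auto_delete(state: LocalAccountState) -> bool:
    """Phase 6 candidate: no public activity and no private state either."""
    return not has_public_activity(state) and not has_private_state(state)


print(safe_to_auto_delete(LocalAccountState()))                   # True
print(safe_to_auto_delete(LocalAccountState(watchlist_items=3)))  # False
print(safe_to_auto_delete(LocalAccountState(edits=1)))            # False
```

An account with only private state fails the automatic check but could still be deleted manually under Phase 5, with the consequences listed above.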

Anyway, Phases 5 and 6 are only my additional proposals, not part of the goal of this task.