Add configuration setting to add a preference checkbox for sending users copies of pages on their watchlists that are deleted
Closed, Declined (Public)

Description

Author: tisane2718

Description:
This is a proposal for an extension to send users (at their option) copies of pages on their watchlists that are deleted. I plan to write it, but first I am seeking suggestions from any interested parties as to what specifications should be implemented.

The goal of this extension is to enable users to obtain the same result as if they had copied and pasted the wikitext of the most recent revision, and done an export of all revisions, just prior to the article deletion. Users may not always know that a sysop is about to delete an article, or they may for whatever other reason not be able to get the revision data they need out of the system before the deletion occurs, so this will take care of that for them. Having this data on hand could be helpful in deletion reviews or if a user wishes to transwiki the entire page history to a more suitable wiki.

This will eliminate the need for those users who had watchlisted the article to ask a sysop to provide a copy of the deleted revisions. It will avoid all or most of the problems associated with allowing users to view deleted articles, while retaining many of the benefits. Here are proposed specifications:

  1. The user will have, in the "E-mail options" section of Preferences, a toggle (checkbox) for "Send me copies of pages on my watchlist that are deleted." It will default to false.
  2. Deleting sysops will have a checkbox allowing them to suppress emailing of the article. The sysop might, for example, toggle this to true if the article is being deleted for copyright violations, and the sysop wishes to minimize its dissemination from the wiki. The checkbox will default to false.
  3. If the toggle mentioned in (1) is set to true, and the checkbox mentioned in (2) is set to false, and a page that is on the user's watchlist is deleted, then the user will be sent an email whose contents will be the wikitext for the most recent revision of that page, and which will have an attachment containing an XML file of all non-suppressed revisions. This file will be suitable for importing into another MediaWiki installation. This file will not include revisions placed in the archive table during prior deletions of the article.
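The rule in (3) can be sketched as a single predicate. This is only an illustration in Python (MediaWiki itself is PHP); the class, field, and function names are hypothetical and not part of any existing API.

```python
from dataclasses import dataclass


@dataclass
class DeletionContext:
    """Inputs to the decision; all field names are hypothetical."""
    user_wants_copies: bool   # preference toggle from (1), defaults to False
    sysop_suppressed: bool    # deletion-form checkbox from (2), defaults to False
    page_on_watchlist: bool   # whether the user watches the deleted page


def should_email_copy(ctx: DeletionContext) -> bool:
    """Return True only when every condition listed in (3) holds."""
    return (
        ctx.user_wants_copies
        and not ctx.sysop_suppressed
        and ctx.page_on_watchlist
    )
```

Because both toggles default to false, nothing is sent unless the user opts in, and a sysop can still veto the email for a particular deletion.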

I'm thinking of using an ArticleDeleteComplete hook function to grab the revisions from the archive table to create an XML file, perhaps using some modified functions from Export.php. Any other thoughts?
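As a rough picture of the attachment step, here is a Python sketch that turns a list of revision rows into an export-style XML string. It is an assumption-laden illustration, not the Export.php logic: the dict keys (`id`, `timestamp`, `user`, `text`, `suppressed`) are hypothetical, and the element layout is a simplified stand-in for the real MediaWiki export schema, so the output is not guaranteed to import as-is.

```python
import xml.etree.ElementTree as ET


def build_export_xml(title, revisions):
    """Build a simplified export-style XML document for one page.

    `revisions` is a list of dicts with hypothetical keys:
    id, timestamp, user, text, suppressed.
    """
    root = ET.Element("mediawiki")
    page = ET.SubElement(root, "page")
    ET.SubElement(page, "title").text = title
    for rev in revisions:
        if rev.get("suppressed"):
            continue  # spec (3): only non-suppressed revisions are included
        node = ET.SubElement(page, "revision")
        ET.SubElement(node, "id").text = str(rev["id"])
        ET.SubElement(node, "timestamp").text = rev["timestamp"]
        contributor = ET.SubElement(node, "contributor")
        ET.SubElement(contributor, "username").text = rev["user"]
        ET.SubElement(node, "text").text = rev["text"]
    # encoding="unicode" makes tostring() return a str rather than bytes
    return ET.tostring(root, encoding="unicode")
```

In the real hook the rows would come from the archive table, restricted to the revisions moved there by the current deletion.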


Version: unspecified
Severity: enhancement

Details

Reference
bz38642

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 12:58 AM
bzimport set Reference to bz38642.

One question is which revision of the page should be emailed to the user. Let's assume two scenarios:

  1. Only the creator of a page can have it emailed to him
  2. Anyone who watchlists a page can have it emailed to him

Suppose option 1 is used. Should he get the most recent revision? That's usually the most useful one, but it could differ significantly from what he last contributed.

Another possibility is to have an emailed_revision table that would have a record of who has been sent what revisions. It would have at least three fields: er_id, er_user, and er_revid. This table could be used to prevent users from receiving repetitive notifications of pages being deleted if the pages were to be deleted and re-created. Especially if we're going to use option #2 from comment 1 above, it would be good to keep down the number of unnecessary emails.
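A minimal sketch of how such a table could suppress repeat emails, using Python's sqlite3 purely for illustration. The three column names come from the comment above; the column types, the unique constraint on (er_user, er_revid), and the helper names are assumptions.

```python
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) the hypothetical emailed_revision log."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS emailed_revision (
            er_id    INTEGER PRIMARY KEY AUTOINCREMENT,
            er_user  INTEGER NOT NULL,
            er_revid INTEGER NOT NULL,
            UNIQUE (er_user, er_revid)  -- assumption: one email per user per revision
        )
        """
    )
    return conn


def record_if_new(conn, user_id, rev_id):
    """Log the pair and return True only if it had not been emailed before."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO emailed_revision (er_user, er_revid) VALUES (?, ?)",
        (user_id, rev_id),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows changed means the pair already existed
```

The caller would send the email only when `record_if_new` returns True, so a page that is deleted, restored, and deleted again with the same latest revision produces a single message per user.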

Okay, https://www.mediawiki.org/wiki/Extension:EmailDeletedPages is done. Regrettably, I inadvertently duplicated functionality:

Sigh, now I need to go back and fix that, and discard a bunch of code. By the way, what do you think the level of support would be for implementing this as core functionality, with a config setting to turn it on or off (kinda like what we do with https://www.mediawiki.org/wiki/Manual:$wgEnotifWatchlist )?

(In reply to comment #3)
> Sigh, now I need to go back and fix that, and discard a bunch of code. By the
> way, what do you think the level of support would be for implementing this as
> core functionality, with a config setting to turn it on or off (kinda like
> what we do with https://www.mediawiki.org/wiki/Manual:$wgEnotifWatchlist )?

Personally I think this is the sort of thing that is better suited to an extension. However I tend to lean that way on most things and often people disagree with me...

(In reply to comment #4)
> Personally I think this is the sort of thing that is better suited to an
> extension. However I tend to lean that way on most things and often people
> disagree with me...

Perhaps you would like to weigh in at https://www.mediawiki.org/wiki/Schools_of_thought_concerning_integration_of_extensions_into_the_core with your own school of thought. I'm not sure, but I think it might have required changes to the core to implement cleanly and without much code duplication anyway. I have submitted a patch that implements it entirely as a core feature.

What do you think the demand would be for the feature? My thought is that it is somewhere between the status quo and the semi-deletion (aka pure wiki deletion) that is sometimes proposed. Users, especially newbies and people who create a lot of new articles, often get annoyed at their pages getting speedily deleted; it's inconvenient to have to request that a sysop provide a copy.

I'm going to ask around at #wikipedia to see whether people would prefer to have a wl_del_notificationtimestamp or wl_del_notificationrevid. The former is what I coded; it causes the user to be emailed a maximum of one revision text unless he returns to the page to clear the field. The latter would allow users to be emailed an unlimited number of times, if the page were to be deleted, then a new one created (with a different revision ID), then deleted and recreated again, etc. It would, however, prevent the user from being emailed the same revision text (with the same revision ID) more than once.
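To make the difference between the two candidate columns concrete, here is a small Python model of each policy. The two column names are the ones discussed above; the class and function names, and the choice to store only the most recently emailed revision ID, are assumptions for illustration rather than a description of the submitted patch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WatchRow:
    """One watchlist row; only the two candidate columns are modelled."""
    wl_del_notificationtimestamp: Optional[str] = None
    wl_del_notificationrevid: Optional[int] = None


def notify_by_timestamp(row: WatchRow, now: str) -> bool:
    """Email once, then stay silent until the user revisits the page."""
    if row.wl_del_notificationtimestamp is not None:
        return False
    row.wl_del_notificationtimestamp = now
    return True


def clear_on_visit(row: WatchRow) -> None:
    """Visiting the page clears the field and re-arms the notification."""
    row.wl_del_notificationtimestamp = None


def notify_by_revid(row: WatchRow, deleted_rev_id: int) -> bool:
    """Email on every deletion whose latest revision ID differs from the last one sent."""
    if row.wl_del_notificationrevid == deleted_rev_id:
        return False
    row.wl_del_notificationrevid = deleted_rev_id
    return True
```

Under the timestamp policy a second deletion is silent unless the user has visited in between; under the revid policy a recreated page with a new revision ID triggers a fresh email, while deleting the same revision again does not.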

> What do you think the demand would be for the feature? My thought is that it
> is somewhere between the status quo and the semi-deletion (aka pure wiki
> deletion) that is sometimes proposed. Users, especially newbies and people
> who create a lot of new articles, often get annoyed at their pages getting
> speedily deleted; it's inconvenient to have to request that a sysop provide
> a copy.

From a Wikipedia perspective, I would be worried about things that are deleted because we want them gone (e.g. Things that borderline should have been oversighted.) I'm not sure we really would want such things emailed around.

(In reply to comment #6)
> From a Wikipedia perspective, I would be worried about things that are
> deleted because we want them gone (e.g. Things that borderline should have
> been oversighted.) I'm not sure we really would want such things emailed
> around.

I guess it depends on how you (and the law, and the WMF) look at it. Do we consider Wikipedia to be like a newspaper or book, whose publisher is responsible for what goes out? Or do we consider it to be like an email service, where the owner is not all that responsible for what goes out?

Gmail, for instance, imposes virtually no standards on who can sign up for an account and start sending emails out. Those emails could contain all sorts of libelous and copyrighted material. Should we blame Gmail?

It's somewhat the same way with Wikipedia. The users sign up without any screening process, and the sysops are chosen by the community in a process managed by bureaucrats who are also chosen by the community. "The community" is not the WMF; the board of trustees and the community are two separate entities, and you can't sue the WMF for what the community does, can you?

So, if a sysop acting on behalf of the community causes an email to be sent out, is that much different than a Gmail user causing an email to be sent out? All of the content originated in the community, by users acting on their own initiative and without edits being cleared in advance by WMF staff; none of the content was created or specifically authorized by WMF. WMF only provided the resources, in the same way that Gmail does.

The difference would be, if there's an abuse report and WMF feels the need to intervene to remove a sysop, it's a lot harder to start from scratch and become a sysop again than it would be to start a new Gmail account. So there is more accountability, in that respect. Maybe in that sense, Wikipedia is more like a newspaper or book.

If stuff needs to be oversighted, shouldn't it be oversighted, rather than merely deleted? If this is a concern, then maybe it's a sign that we need more oversighters to handle the workload, because we're improperly deleting material that should have been oversighted. Whether this change would be a net gain or loss for transparency, I'm not sure. There are inherent problems with holding oversighters accountable because of the fact that their whole role is to limit transparency -- and transparency is usually a prerequisite to accountability. We see the same problem arising with the United States Foreign Intelligence Surveillance Court.

Currently, people say "It's no big deal that you can't view deleted articles, because sysops will provide copies upon request." If that's the case, then what is the harm of automating the emailing process? Do sysops actually exercise much caution or restraint in deciding what requests for emailing deleted articles to grant?

With reference to the schema change -- it might actually be beneficial to email the user more than once, if there are deletions and recreations going on, since that would keep him apprised of what is getting deleted each time. How often does it happen that the same article gets deleted and undeleted, without intervening edits? I could see it happening on, say, RationalWiki, where almost everyone is a sysop. Whether anyone would mind getting all those repetitious emails, I'm not sure.

To be totally thorough about providing all the revisions, we would want to do an auto-export and attach an XML file, but I haven't yet figured out how to do that.

By "We" I meant Wikipedians might not want that (I guess I shouldn't use "we" there, as I'm not really a Wikipedian). Usually WMF holds itself to be a service provider with no responsibility for content. Anyways, I suppose that's more a social concern than a technical one and thus hard to judge without asking the people in question.

I don't know about enwikipedia, but in some communities it's not uncommon for something to be deleted first, and then oversighted afterwards once an oversighter is found.

> To be totally thorough about providing all the revisions, we would want to do
> an auto-export and attach an XML file, but I haven't yet figured out how to do
> that.

If a page has a lot of revisions, that might represent a performance problem for the server. Also folks might not want to receive multi-megabyte emails containing all the revisions of a page with 1000 revisions.

If Deletionpedia or a successor site were to make available all the revisions, this would be mostly a moot point. There was going to be a real-time feed (bug 17450) to make it easier to implement such mirroring, but apparently that never happened.

Change 101443 had a related patch set uploaded by leucosticte:
Add preferences checkbox to email text of watched deleted pages

https://gerrit.wikimedia.org/r/101443

TTO lowered the priority of this task from Low to Lowest.
TTO set Security to None.
TTO subscribed.

Does anyone else actually think this should be implemented? I suspect we should just decline this as a personal feature request.

No-one else seems interested in this other than the reporter, who is globally banned by the WMF.