Non-admins can see contents of deleted pages when viewing abusefilter details
Open, Medium, Public

Description

Healthy Children Project's Center for Breastfeeding was deleted on enwiki; however, I can still see parts of the edit history (and therefore, article content) by looking at specific abuse filter log entries like Special:AbuseLog/7900305, even though I am a non-sysop and shouldn't be able to see deleted pages.

I'm marking this as major since non-sysops shouldn't be able to see deleted pages.

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 12:49 AM
bzimport added a project: AbuseFilter.
bzimport set Reference to bz42734.

We can definitely hide the data when the article has been deleted, as was done in this case. We store the article title in the AbuseFilter log, so we can check if it's been deleted.

However, if a public rule is triggered for a particular revision that later is deleted, AbuseFilter doesn't keep around enough information to prevent showing this.

A workaround is to restrict the abusefilter-log-detail permission. By default, it's given to *.

(In reply to comment #3)

We can definitely hide the data when the article has been deleted, as was done in this case.

What was hidden in this case? I'm pretty sure I can see everything.

We store the article title in the AbuseFilter log, so we can check if it's been deleted.

However, if a public rule is triggered for a particular revision that later is deleted, AbuseFilter doesn't keep around enough information to prevent showing this.

I'm not following you. Can't it just check if the title exists and choose to show the detailed information based on that?

A workaround is to restrict the abusefilter-log-detail permission. By default, it's given to *.

I'm not sure if restricting abusefilter-log-detail is the right solution. There are non-sysops who have the abusefilter-modify right through the EFM group who technically shouldn't be able to see deleted content, but would still be able to.

And if we're still tracking harmless things like WikiLove ([[Special:AbuseFilter/423]]) in the EF, it makes no sense to hide that from typical users.

(In reply to comment #4)

We store the article title in the AbuseFilter log, so we can check if it's been deleted.

However, if a public rule is triggered for a particular revision that later is deleted, AbuseFilter doesn't keep around enough information to prevent showing this.

I'm not following you. Can't it just check if the title exists and choose to show the detailed information based on that?

A page would be considered existing if it has at least one revision which is not deleted. This means that if you:

  1. Create a page
  2. Edit it, edit hits a filter and gets logged
  3. Delete the page - the details of your edits should now be hidden in AF
  4. Someone else creates the page with the same title

Suddenly, the AF entry is visible again because the title exists.
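
The sequence above can be sketched in a few lines of Python. This is only an illustration of a title-based check, not AbuseFilter code; `LogEntry`, `live_titles` and `can_show_details` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    """Hypothetical log row that only remembers the page title."""
    title: str
    details: str


def can_show_details(live_titles, entry):
    # Title-based check: details are shown whenever *some* page
    # with this title currently exists.
    return entry.title in live_titles


live_titles = set()

live_titles.add("Example")                            # 1. create a page
entry = LogEntry("Example", "text of filtered edit")  # 2. edit hits a filter
live_titles.discard("Example")                        # 3. page is deleted
hidden_after_delete = not can_show_details(live_titles, entry)

live_titles.add("Example")                            # 4. someone recreates it
visible_after_recreate = can_show_details(live_titles, entry)
```

After step 3 the details are hidden, but after step 4 the same old entry becomes visible again, because the check cannot tell the recreated page from the deleted one.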

(In reply to comment #5)

A page would be considered existing if it has at least one revision which is not deleted. This means that if you:

  1. Create a page
  2. Edit it, edit hits a filter and gets logged
  3. Delete the page - the details of your edits should now be hidden in AF
  4. Someone else creates the page with the same title

Suddenly, the AF entry is visible again because the title exists.

I'm not sure this edge case is very important.

I'm adding this to our backlog of Admin Tools, but just to flesh out the requirements:

  • The AbuseLog entry should be hidden in cases where we have a legal obligation to remove the content. So as Krenair pointed out, we can't just not show it if the page doesn't exist, otherwise anyone could create the page to read the content.
  • Is it really desired by most of the admins to have these log entries hidden whenever a page is deleted, even when, for example, it's just advertising vs copyright violation? Or would it only be for pages that are deleted for specific reasons?
  • On the overall list of priorities, is this something admins are desperate for? Or more of a nice to have right now?

(In reply to comment #7)

  • The AbuseLog entry should be hidden in cases where we have a legal obligation to remove the content. So as Krenair pointed out, we can't just not show it if the page doesn't exist, otherwise anyone could create the page to read the content.

True, ideally the log entry would be undeleted when the page is undeleted.

  • Is it really desired by most of the admins to have these log entries hidden whenever a page is deleted, even when, for example, it's just advertising vs copyright violation? Or would it only be for pages that are deleted for specific reasons?

I think it's more important for, say, copyvios and major BLP violations to be removed than for things that were deleted at AFD for being [[WP:NOT]] (out of scope). Maybe just the variables which contain page text (like old_wikitext, the diff, etc.) should be removed from the details & examine pages, so that the log entry itself remains, similar to how entries in [[Special:Log/move]] aren't removed when a page is deleted.

  • On the overall list of priorities, is this something admins are desperate for? Or more of a nice to have right now?

Given that this bug has existed ever since the AbuseFilter was introduced (2 years), I don't think it's that urgent, since no one ever noticed it until I did.

As an interim solution, it would be nice to be able to delete certain log entries (possibly a separate bug in itself).

As an interim solution, it would be nice to be able to delete certain log entries (possibly a separate bug in itself).

Users with the 'abusefilter-hide-log' privilege can do this. Although now that we automatically hide a revision's logs when the revision is deleted, I'm not sure if admins are in the habit of deleting the AF log when an entire page is deleted.

We may be able to hook into the UI to make hiding the AF logs easy when someone is deleting a page. I'm going to mark this as a lower priority enhancement, since it sounds like it would be useful, but the overall result is possible in the existing tools (unless I'm missing something).

Actually, I'm pretty sure this could be implemented as a pretty simple gadget, which might be a good way to try out the requirements.

I just wrote a patch which automatically deletes all log entries for a given page when the page is deleted. I'm no longer sure this is the best way to go (as only Oversighters can see them then)... any ideas?

Alternatives would be to implement an automated deletion with user confirmation or adding a page_id field to the abuse log table.

(In reply to comment #10)

I just wrote a patch which automatically deletes all log entries for a given page when the page is deleted. I'm no longer sure this is the best way to go (as only Oversighters can see them then)... any ideas?

I don't think giving non-oversight'ers a way to oversight log entries is a good idea. AF should probably have a way to "revdel" log entries to the "viewdeleted" level, and then a further level of "suppressed", making it equivalent to normal log entries. Maybe a separate bug?

Alternatives would be to implement an automated deletion with user confirmation or adding a page_id field to the abuse log table.

I'm not too familiar with whether or how page_ids change when deleting and undeleting, but as long as it provided a way to delete them and then undelete them, it would work.

Would it be common for a page that needs to be deleted and suppressed to be moved around first, so that we can't match the page's title in the abusefilter logs?

Again, if we're talking about a single revision for a page (excluding the very first revision when the page is created), suppressing the revision will also suppress the AbuseFilter logs automatically. So this is only when the initial edit needs to be suppressed (since AF logs it without a revision_id, since no revision exists at the time that we're filtering), or the page title needs suppression.

How about inserting a link on the delete confirmation page (using the hook), that links to the abuse filter logs for that page, if any abusefilter logs exist for that title and the user has suppression rights? That would let someone who is suppressing data know that they need to also check the logs and see if any entries need suppression as well. But the interface would remain unchanged for most users.

(In reply to comment #12)

Would it be common for a page that needs to be deleted and suppressed to be moved around first, so that we can't match the page's title in the abusefilter logs?

No

Again, if we're talking about a single revision for a page (excluding the very first revision when the page is created), suppressing the revision will also suppress the AbuseFilter logs automatically. So this is only when the initial edit needs to be suppressed (since AF logs it without a revision_id, since no revision exists at the time that we're filtering), or the page title needs suppression.

How about inserting a link on the delete confirmation page (using the hook), that links to the abuse filter logs for that page, if any abusefilter logs exist for that title and the user has suppression rights? That would let someone who is suppressing data know that they need to also check the logs and see if any entries need suppression as well. But the interface would remain unchanged for most users.

That would come with a different flavor of the problem I mentioned above: only oversighters can hide and unhide log entries (at least on WMF wikis), as we only have one hidden level. If I recall correctly, it's not trivial to introduce more than one... I thought about this bug (and even hacked up some stuff) in December and wasn't able to come up with a perfect solution (see above).

Marius Hoch: You assigned this issue to yourself 14 months ago.
Could you please provide a status update and inform us whether you are still working (or still plan to work) on this issue?
Only in case you do not plan to work on this issue anymore, should the assignee be set back to default and the bug status changed from ASSIGNED to NEW/UNCONFIRMED? Thanks.

(In reply to Andre Klapper from comment #14)

Marius Hoch: You assigned this issue to yourself 14 months ago.
Could you please provide a status update and inform us whether you are still working (or still plan to work) on this issue?
Only in case you do not plan to work on this issue anymore, should the assignee be set back to default and the bug status changed from ASSIGNED to NEW/UNCONFIRMED? Thanks.

Well, I (hopefully) still have the patch mentioned in #c10 around (and can rebase it to work with AbuseFilter master). But I want some more feedback about this... if this feedback is given, I guess I can work on this (again).

Did we pull out the automatic hiding of deleted revisions? I'm sure I saw that, but I'm not seeing it right now. That would be along the same lines, where a non oversighter was essentially causing a suppression event. But if we don't have that, then hoo's patch probably is doing that, and seems wrong.

I think we could introduce delete and suppress for the log-- that seems like it's going to be needed in the long run, and would make this work.

(In reply to Chris Steipp from comment #16)

I think we could introduce delete and suppress for the log-- that seems like it's going to be needed in the long run, and would make this work.

I initially thought this would require a schema change, but that's not true:

afl_deleted tinyint(1) NOT NULL DEFAULT 0,

If we have that, we can probably use it together with a slightly modified version of my patch mentioned above.
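
A rough sketch of how a single afl_deleted value could carry both a "deleted" and a "suppressed" level, assuming the column may hold values beyond 0 and 1. The numeric values and the right names (`view-deleted`, `view-suppressed`) are assumptions made for illustration, not existing AbuseFilter or MediaWiki settings.

```python
from enum import IntEnum


class Visibility(IntEnum):
    """Hypothetical meanings for values stored in afl_deleted."""
    PUBLIC = 0
    DELETED = 1      # assumed: visible to users who may view deleted content
    SUPPRESSED = 2   # assumed: visible to oversighters only


# Hypothetical right required to view an entry at each level.
REQUIRED_RIGHT = {
    Visibility.PUBLIC: None,
    Visibility.DELETED: "view-deleted",
    Visibility.SUPPRESSED: "view-suppressed",
}


def can_view(afl_deleted, user_rights):
    """Return True if a user holding user_rights may see the log entry."""
    required = REQUIRED_RIGHT[Visibility(afl_deleted)]
    return required is None or required in user_rights
```

With something like this, deleting a page could set the entry to the "deleted" level without a non-oversighter causing a suppression, and oversighters could still raise it to "suppressed" where needed.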

This came up again in WP:VP/T today.

There are two factors here:

  1. The filter is not set as hidden; it can be hidden if necessary.
  2. Filters can be oversighted if necessary, and that has not been done.

Articles at enWP are deleted for not being notable, or being advertising. So why do we need to delete it?

There are means to manage, I don't see that there is any need for an automated process. I see no clear case to why this is a problem and needs to be handled automatically.

@Billinghurst:

  1. On the English Wikipedia, at least, there is the view that filters should only be hidden from the community if absolutely necessary. Hiding all the filters which could show deleted content seems a poor way of fixing the issue.
  2. Requesting that oversighters patrol every deleted article to see if its contents can be seen in a filter log and then oversighting those entries also doesn't seem a good solution.

@Samwalton9 So due to "enWP wishes" this all changes? FWIW I am asking the oversighters to do nothing. I am saying that there are solutions available within the system, use them.

To the presented case ... someone can see the content of a deleted edit. Usually, I would say for the vast bulk of deleted edits: BIG DEAL. The example given was a spam article; I care not a hoot that someone wishes to cruise through the AF hits and see a spam edit that has been deleted. The edit is not visible to users, it does not show in search engines and it doesn't contaminate the wiki. What is the problem? Please come back with a more valid set of examples and answers. Purity gone mad!

There are so many feature wishes out there with far more validity and benefit than this where there has been no demonstrated value in having it done beyond noise from enWP.

@Billinghurst I gave the English Wikipedia's view as an example because that's the only community on which I know anything about views on the abuse filter, I never said that this should definitely happen specifically because en-wiki wants it to.

The concerns raised are not spam articles but rather copyright violations and attack pages. I agree that this isn't some hugely important change that needs to happen right this second - it being Normal priority reflects that - I just didn't think your arguments against this being a problem at all were particularly valid.

@Samwalton9 The whole case presented here in phabricator is about a spam article, if you wish to reframe the case then go for it. We can only argue the case on the proposal before us.

I put it to you that the proposal is either arse about or half-arsed, as there is NO one-to-one relationship between deleted articles and not viewing them in abuse filters, which is the proposal here, i.e. it hit an abuse filter, it has been deleted, therefore delete abuse filter without consideration for the purpose of the filter, the nature of the deletion, be it a move, etc.

Maybe you can explain to me what happens to abuse filter logs for copyright violations and attack pages where a warning or a disallow take place but no article is created? They would seem to have the same sort of content, should undergo the same treatment yet are not captured by the proposed resolution. They continue to sit there until someone sights them and determines whether they should be oversighted (though of course not by an oversighter as they don't patrol logs, by your argument)

I put it to you that what is wanted here is that if an article is deleted, that the deleting admin should see an alert that the creating edit hit a spam filter, be able to quickly review it and be able to delete if necessary. Further that there is actually the requirement for the easier deletion of abuse filter log items though not where an oversight is required. Not this convoluted argument.

And with new entries? I see that some entries are now automatically hidden because the revisions were hidden but I can't remember if that happens with deletions, revision deletion or both. Thanks.

I can't find an example for revdel right now.

Well, per the comments above, this problem isn't easy to fix. I also thought of a couple of ways to determine whether the page was deleted, but all of them require having some more info in abuse_filter_log rows. Specifically, having afl_rev_id for page creations: with that, one could check whether a revision with that id exists in the 'revision' table, so as to make a clear distinction. However, I can't think of other ideas.
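
A minimal sketch of that afl_rev_id idea, with the 'revision' table modelled as a plain set of live revision ids. The function name and the handling of rows without a revision id are assumptions for illustration only.

```python
def details_visible(log_row, live_revision_ids):
    """Decide visibility from the logged revision id instead of the title."""
    rev_id = log_row.get("afl_rev_id")
    if rev_id is None:
        # Assumption: no revision was saved (e.g. the edit was disallowed),
        # so there is no deleted revision to protect.
        return True
    return rev_id in live_revision_ids


# Page creation logged with revision id 100.
creation = {"afl_title": "Example", "afl_rev_id": 100}
live_revision_ids = {100}

live_revision_ids.discard(100)  # the page is deleted
live_revision_ids.add(101)      # someone recreates the same title

recreation = {"afl_title": "Example", "afl_rev_id": 101}
```

Unlike a title check, the old entry stays hidden after the page is recreated, because revision 100 no longer exists in the live set while the new page has a different id.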

Daimona added subscribers: DannyS712, Huji.

Any thought to having the abuse filter log entry go through a "sanitizer" before it is displayed, so that only the fields the reader is entitled to see are shown?
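
Such a sanitizer might look roughly like the following sketch. The set of text-bearing fields and the two boolean inputs are assumptions for illustration; a real implementation would need to decide which variables count as page content and how deletion is detected.

```python
# Assumed set of log variables that carry page text.
TEXT_FIELDS = frozenset({"old_wikitext", "new_wikitext", "edit_diff"})


def sanitize(variables, page_deleted, may_view_deleted):
    """Return only the log variables this reader is entitled to see."""
    if not page_deleted or may_view_deleted:
        return dict(variables)
    # Drop the fields that would reveal deleted page content,
    # but keep the rest so the log entry itself remains.
    return {k: v for k, v in variables.items() if k not in TEXT_FIELDS}


entry = {
    "action": "edit",
    "user_name": "ExampleUser",
    "old_wikitext": "previous text",
    "edit_diff": "+ added line",
}
public_view = sanitize(entry, page_deleted=True, may_view_deleted=False)
```

Here `public_view` keeps `action` and `user_name` but omits the wikitext and the diff, so the entry stays in the log without exposing the deleted text.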

Daimona added a subscriber: Asartea.
Daimona added a subscriber: Writ_Keeper.

I still think that it should just be easier to hide a problematic publicly viewable abuse report, either by deletion or by setting an individual report to non-public.

Change 712942 had a related patch set uploaded (by Daimona Eaytoy; author: Daimona Eaytoy):

[mediawiki/extensions/AbuseFilter@master] Check for deleted pages in SpecialAbuseLog

https://gerrit.wikimedia.org/r/712942