In the 1.14.0 RELEASE-NOTES we see:
- $wgSpamRegex now matches the edit summary and page move descriptions in addition to body text.
I'm sorry, but that's absolutely crazy, reckless, irresponsible. I'm
commenting it out in EditPage.php:
# Check for spam
$match = false; #JIDANNI turning OFF!!: $match = self::matchSpamRegex( $this->summary );
Please consider, e.g.:
$wgSpamRegex=array('/^\B$/',
This regular expression is what our wiki uses to prevent vicious page
blanking. (By the way, if one triggers it, oddly the function that
usually shows the user what the problem was doesn't say anything.)
Anyway, a blanked page is bad, but a blank comment is fine!
Now let's look at another regexp we use on our sites:
'/^[^{][[:ascii:]]*$/');
This regular expression matches any text that is pure ASCII (and does
not begin with "{"), so in effect the user's edit must have at least one
Chinese character in it, because our wikis are all zh-tw language
wikis, and a pure ASCII post is surely spam.
However, a quick English or empty _summary_ is very common and
accepted on our wikis.
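To illustrate, here is a minimal standalone sketch of what happens when the two patterns above are applied to a summary instead of to body text. The helper name matchesAny() is made up for this example and is not MediaWiki's own matcher; it only assumes that every pattern in the array is tried in turn.

```php
<?php
// The two patterns quoted in this report, as they would sit in $wgSpamRegex.
$spamRegexes = array( '/^\B$/', '/^[^{][[:ascii:]]*$/' );

// Hypothetical helper (not MediaWiki code): true if any pattern matches $text.
function matchesAny( array $regexes, $text ) {
    foreach ( $regexes as $regex ) {
        if ( preg_match( $regex, $text ) === 1 ) {
            return true;
        }
    }
    return false;
}

// An empty summary trips the anti-blanking pattern.
var_dump( matchesAny( $spamRegexes, '' ) );
// A quick English summary trips the pure-ASCII pattern.
var_dump( matchesAny( $spamRegexes, 'fix typo' ) );
// A Chinese summary passes both.
var_dump( matchesAny( $spamRegexes, '修正錯字' ) );
```

So rules that are sensible for body text would reject the two most common kinds of summary on these wikis.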
Anyway, the rash decision to glue 'edit summary', 'page move descriptions', and
'body text' together will have users banging down my door asking why
their postings are getting rejected now! *Please let the
administrator glue them together if he wishes:
($wgSpamRegex['edit summary'] = $wgSpamRegex['page move descriptions'] =
$wgSpamRegex['body text'];) Don't arbitrarily glue them all together
for us!*
Please instead run each one as a separate test.
You (MediaWiki team) can have an array of arrays, and just do something like the PHP
version of foreach('edit summary', 'page move description', 'body
text' as $bla){ run the matcher of $wgSpamRegex[$bla] on $get->$bla}
or however you write it in PHP, which I am poor at.
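A rough sketch of what that loop could look like in real PHP follows. Everything here is an assumption made for illustration: the variable $spamRegexByField, the field keys, and the function firstSpamHit() are hypothetical and do not exist in MediaWiki. The patterns are the ones quoted elsewhere in this report.

```php
<?php
// Hypothetical per-field configuration (not an existing MediaWiki setting).
$spamRegexByField = array(
    'body text'             => array( '/^\B$/', '/^[^{][[:ascii:]]*$/', '/<[Aa]/' ),
    'edit summary'          => array( '/<[Aa]/' ),
    'page move description' => array( '/<[Aa]/' ),
);

// Hypothetical helper: checks each submitted field only against its own
// patterns. Returns array( field, regex ) for the first hit, or false.
function firstSpamHit( array $regexByField, array $submitted ) {
    foreach ( $regexByField as $field => $regexes ) {
        if ( !isset( $submitted[$field] ) ) {
            continue; // nothing was submitted for this field
        }
        foreach ( $regexes as $regex ) {
            if ( preg_match( $regex, $submitted[$field] ) === 1 ) {
                return array( $field, $regex );
            }
        }
    }
    return false;
}

// Chinese body with an empty summary: accepted.
var_dump( firstSpamHit( $spamRegexByField,
    array( 'body text' => '修正錯字', 'edit summary' => '' ) ) );

// Same body, but a link attempt in the summary: rejected on the summary.
var_dump( firstSpamHit( $spamRegexByField,
    array( 'body text' => '修正錯字', 'edit summary' => '<A HREF=x>' ) ) );
```

Returning the field name along with the pattern would also let the caller choose which protection text to show.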
And of course you need three different MediaWiki:Spamprotectiontext
messages now too. And please allow us to set them in LocalSettings.php:
$wgSpamProtectionText['body text']= and the other two too. Setting
them in MediaWiki:Spamprotectiontext is a big pain when you are making
a wiki family.
By the way, we also have a rule
/{{[Cc]\|\d\d\d\.\d{0,3}}}/
that I mention in Spamprotectiontext:
Radio frequencies must have at least four digits after the decimal place.
What would be neat is if each regexp could have its own optional text
that gets printed out.
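One possible shape for that idea, as a standalone sketch only: a map from pattern to optional message, with a fallback when a pattern has none. The names $spamRules and spamMessageFor() are hypothetical and nothing like them is claimed to exist in MediaWiki.

```php
<?php
// Hypothetical layout: each pattern carries its own optional explanation.
// A null message means "use the generic protection text".
$spamRules = array(
    '/{{[Cc]\|\d\d\d\.\d{0,3}}}/' =>
        'Radio frequencies must have at least four digits after the decimal place.',
    '/<[Aa]/' => null,
);

// Hypothetical helper: returns the message for the first matching pattern,
// $default if that pattern has no message, or null if nothing matches.
function spamMessageFor( array $rules, $text, $default ) {
    foreach ( $rules as $regex => $message ) {
        if ( preg_match( $regex, $text ) === 1 ) {
            return $message !== null ? $message : $default;
        }
    }
    return null;
}

$generic = 'Your edit was blocked by the spam filter.';
var_dump( spamMessageFor( $spamRules, '{{C|145.25}}', $generic ) );   // too few digits
var_dump( spamMessageFor( $spamRules, '{{C|145.2500}}', $generic ) ); // four digits, passes
```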
Ah, you might say I should stop complaining and use this, mentioned in DefaultSettings.php:
 * For a complete example, have a look at the SpamBlacklist extension.
 */
$wgFilterCallback = false;
Well I'll have you know that I did look at it, and it is all 100 times
overkill and un-understandable gobbledygook, so sorry. It didn't help
me one bit.
Anyway, I was doing fine until you glued all the tests together.
Next time I'll test while your release candidate is fresh. Sorry I
only discovered this (glue mess) now.
By the way, I also use /<[Aa]/, which stops attempted spam links. This
regexp I wish to use in all three places: summary, body text, etc.
I.e., I cannot live for long with no summary filtering (caused by my
above commenting out), as I know it is only a matter of time before they
attack, therefore I hope you will separate the three tests (and not
just toss in some var $ignoreEditSummary), by version 1.14.1. Thank
you.
Version: unspecified
Severity: normal