Page MenuHomePhabricator

SqlBagOStuff::incr() vulnerable to race conditions, dies with query error
Closed, ResolvedPublic

Description

SqlBagOStuff uses a SELECT FOR UPDATE followed by a DELETE followed by an INSERT. This seems to be deliberate to ensure atomicity, see r55079. However, the following race condition is possible:

  • Process 1 calls incr()
  • Process 2 calls incr()
  • Process 1 does the SELECT FOR UPDATE
  • Process 2 does the SELECT FOR UPDATE
  • Process 1 DELETEs the row
  • Process 2 tries to DELETE the row. The DELETE fails silently (0 rows affected)
  • Process 1 INSERTs the new row
  • Process 2 tries to INSERT the new row but dies with a duplicate key error

Because incr() bails if the SELECT FOR UPDATE doesn't return a row, this is a somewhat unlikely race condition, but I've seen it happen several times on my test install when hard refreshing, because each load.php instance calls incr( 'requests-with-session' ) or something similar.

A similar bug in SqlBagOStuff::set() used to exist, see bug 24425 and r70203.

(Note that UPDATE with value=value+1 can't be used because the value is a serialized blob rather than a proper integer.)

This is a really nasty bug that'll cause intermittent RL bugs on SQL-cache-based installs, so I'm making this a blocker for the 1.17 release.


Version: unspecified
Severity: critical

Details

Reference
bz28611

Event Timeline

bzimport raised the priority of this task from to Low.Nov 21 2014, 11:32 PM
bzimport set Reference to bz28611.

Did it happen to you when using InnoDB? It's wrapped in a transaction.

The problem is there for MyISAM, though. I commited a fix in r86418 which should fix it.

We could also have a special case for ints and store them not serialized. It will also work faster this way (UPDATE objectcache SET value=value+1 WHERE keyname='foo').

Is there a caller for incr() in a default installation? It used to be that incr() was only called on $wgMemc, and nobody ever sets $wgMainCacheType to CACHE_DB.

The only callers I can find in trunk core or extensions are using $wgMemc or the explicit calls in mctest.php which is memcache-specific.

A special case with UPDATE probably wouldn't be too hard to do as suggested and might simplify the code a bit -- but then again, it'd need to check for expirations (if the old entry expired, then we should be incrementing/decrementing from 0 as if it were brand new) and needs to return the new value, so would still need to go back and read.

It wouldn't be too hard, no, I was just wondering about the priority and severity.

I'll defer to your judgment on the priority for possible race conditions. Please set this one to what you feel is appropriate.

(In reply to comment #3)

Is there a caller for incr() in a default installation? It used to be that
incr() was only called on $wgMemc, and nobody ever sets $wgMainCacheType to
CACHE_DB.

I have MainCacheType set to CACHE_DB, maybe that's why. I didn't realize this wasn't the default, and you're right that other caching backends don't have this problem, so I guess that lowers the severity.

(In reply to comment #1)

Did it happen to you when using InnoDB? It's wrapped in a transaction.

Yes, it seems I have a MyISAM database. No idea why.

The problem is there for MyISAM, though. I commited a fix in r86418 which
should fix it.

Yup, that fixes it. Running while(true) $wgMemc->incr('foo'); concurrently in two eval.php instances used to hit the bug immediately but runs fine now.

Filed bug 28669 regading MyISAM in general.