Page MenuHomePhabricator

Revision::fetchFromConds SELECT ... FOR UPDATE invalid in Postgres
Closed, ResolvedPublic

Description

Author: overlordq

Description:

Apr 9 20:39:11 thedarkcitadel postgres[21778]: [3-1] 2013-04-09 20:39:10 UTC ERROR: SELECT FOR UPDATE/SHARE cannot be applied to the nullable side of an outer join

Apr 9 20:39:11 thedarkcitadel postgres[21778]: [3-2] 2013-04-09 20:39:10 UTC STATEMENT: SELECT /* Revision::fetchFromConds 127.0.0.1 */ rev_id,rev_page,rev_text_id,rev_timestamp,rev_comment,rev_user_text,rev_user,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,rev_content_format,rev_content_model,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len,user_name FROM "unittest_revision" INNER JOIN "unittest_page" ON ((page_id = rev_page)) LEFT JOIN "unittest_mwuser" ON ((rev_user != 0) AND (user_id = rev_user)) WHERE page_id = '649' AND rev_id = '1055' LIMIT 1 FOR UPDATE

This is likely what is breaking other phpunit tests


Version: 1.22.0
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=46594

Details

Reference
bz47055

Related Objects

View Standalone Graph
This task is connected to more than 200 other tasks. Only direct parents and subtasks are shown here. Use View Standalone Graph to show more of the graph.

Event Timeline

bzimport raised the priority of this task from to High.Nov 22 2014, 1:24 AM
bzimport set Reference to bz47055.

fred wrote:

i patched my Revision.php in the includes directory (1.21.1) with the following:

diff Revision.php.org Revision.php

355a356

global $wgDBtype;

363c364,366

< $options[] = 'FOR UPDATE';

if( $wgDBtype != "postgres" ) {
        $options[] = 'FOR UPDATE';
}

Hi! Thanks for your patch!

You are welcome to use Developer access

https://www.mediawiki.org/wiki/Developer_access

to submit this as a Git branch directly into Gerrit:

https://www.mediawiki.org/wiki/Git/Tutorial

Putting your branch in Git makes it easier to review it quickly.
Thanks again! We appreciate your contribution.

fred wrote:

Thanks Andre,
I did as you suggested.
https://gerrit.wikimedia.org/r/#/c/68143/
Regards
Fred

So what I would recommend is that rather than making this change in Revision::fetchFromConds, instead make the change directly in DatabasePostgres (maybe override the selectSQLText() function and have it remove the FOR UPDATE option). That way this error won't occur anywhere.

overlordq wrote:

The problem is that PG *does* support FOR UPDATE, just limits it to cases that make sense.

Why is the userJoinCond a LEFT JOIN when you're fetching rows for logged-in users which should have a row in the users table?

(In reply to comment #5)

The problem is that PG *does* support FOR UPDATE, just limits it to cases
that
make sense.

Then maybe an exception should be thrown instead, as a warning to developers that you shouldn't be doing FOR UPDATE on outer joins.

(In reply to comment #5)

Why is the userJoinCond a LEFT JOIN when you're fetching rows for logged-in
users which should have a row in the users table?

The issue is that the row doesn't necessarily exist in the user table. If the revision was made by an anonymous user, then there will be no corresponding row. The idea is to fetch the user row if it exists.

(In reply to comment #6)

The issue is that the row doesn't necessarily exist in the user table. If the
revision was made by an anonymous user, then there will be no corresponding
row. The idea is to fetch the user row if it exists.

That is wrong, because there is a rev_user = 0 on the join condition, so this is false for anon editing. The query should always find a user row.

(In reply to comment #7)

That is wrong, because there is a rev_user = 0 on the join condition, so this
is false for anon editing. The query should always find a user row.

"on the join condition" is the key phrase here. If rev_user = 0 were in the WHERE clause, then that'd be true, but in an outer join, if the join condition doesn't match, the row is still there but just has a null join. Also think about it logically: why would Revision::fetchFromConds, the main function for fetching revision info, only allow logged in revisions?

fred wrote:

Hi Thanks all for exploring options,
Some related discussions I found when researching the error:
http://www.postgresql.org/docs/9.1/interactive/sql-select.html#SQL-FOR-UPDATE-SHARE
http://www.postgresql.org/docs/9.1/interactive/explicit-locking.html#LOCKING-ROWS

http://www.postgresql.org/message-id/21634.1160151923@sss.pgh.pa.us
http://postgresql.1045698.n5.nabble.com/outer-joins-and-for-update-td1937029.html

Also, my PostgreSQL server version is 9.1.2

If we can reach an agreement on how best to resolve this issue for Wikimedia installs using a PostgreSQL databases, (unless someone beats me to it) I'll rework the patch to comply to your collective wisdom, thank you all for your time.

OK, so with all of this, I also noticed one more thing: PostgreSQL has an extended FOR UPDATE syntax. You can do "FOR UPDATE OF mytable", which will only lock that table.

With that in mind, I think the proper fix for this would be to change DatabasePostgresql so that when the FOR UPDATE clause is generated, it only includes the main table and inner joined tables.

delroth wrote:

This bug causes image deletion to fail on PostgreSQL, by the way. Manually applying the workaround mentioned above fixes the problem.

Related URL: https://gerrit.wikimedia.org/r/69767 (Gerrit Change I1ac587ac39f448b9e7f4befb44826b43044ad6f0)

What about if we require callers to explicitly specify list of tables to be locked (by using additional parameters and not $options[]) ?

Change 69767 abandoned by coren:
Changed FOR UPDATE handling in Postgresql

Reason:
Worked only because it was broken the right way. :-)

https://gerrit.wikimedia.org/r/69767

(In reply to comment #10)

PostgreSQL has an
extended FOR UPDATE syntax. You can do "FOR UPDATE OF mytable", which will
only
lock that table.

Hmm, no. FOR UPDATE OF isn't a postgres extension, but a part SQL which mysql doesn't implement fully (SQL:2003 14.1.).

But yeah, trying to lock on an outer join is fundamentally nonsensical, and mysql not throwing an exception is a standard compliance bug at best - it's unwise to rely on that kind of behaviour.

fred wrote:

I agree with Marc,
It is broken the wrong way, there is still an issue that needs to be resolved.

(In reply to comment #15)

(In reply to comment #10)

PostgreSQL has an
extended FOR UPDATE syntax. You can do "FOR UPDATE OF mytable", which will
only
lock that table.

Hmm, no. FOR UPDATE OF isn't a postgres extension, but a part SQL which
mysql
doesn't implement fully (SQL:2003 14.1.).

But yeah, trying to lock on an outer join is fundamentally nonsensical, and
mysql not throwing an exception is a standard compliance bug at best - it's
unwise to rely on that kind of behaviour.

Change 69767 restored by Parent5446:
Changed FOR UPDATE handling in Postgresql

Reason:
Why was this abandoned? It's my change.

https://gerrit.wikimedia.org/r/69767

Change 107112 had a related patch set uploaded by MarkAHershberger:
Changed FOR UPDATE handling in Postgresql

https://gerrit.wikimedia.org/r/107112

Change 69767 merged by jenkins-bot:
Changed FOR UPDATE handling in Postgresql

https://gerrit.wikimedia.org/r/69767

Change 107112 merged by jenkins-bot:
Changed FOR UPDATE handling in Postgresql

https://gerrit.wikimedia.org/r/107112

Jdforrester-WMF subscribed.

Migrating from the old tracking task to a tag for PostgreSQL-related tasks.