Page MenuHomePhabricator

CodeReview has many inefficient queries
Closed, ResolvedPublic

Description

Unindexed queries

Other stuff such as

// This way of getting GET parameters is horrible, but effective.
$fields = array_merge( $_GET, $_POST );


Version: unspecified
Severity: normal

Details

Reference
bz24479

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 21 2014, 11:01 PM
bzimport set Reference to bz24479.

ayg wrote:

We have a component for CodeReview already, so we don't need a tracker bug for everything wrong with it . . .

Retitling bug.

Roan addressed second issue in CR r47395#c7849

I've got the mysql high performance book if anyone wants it for that bug

The query for listing revisions by path does something like WHERE cp_repo_id=1 AND cp_path LIKE '/trunk%' ORDER BY cp_rev_id DESC . If my understanding about how the code_paths table works is correct (will confirm in the morning), we should be able to convert that to AND cp_path='/trunk' , introduce an index on (cp_repo_id, cp_path, cp_rev_id) and that should make this query happy.

Special:Code/MediaWiki/tag/api

ie

SELECT cr_id,cr_repo_id,cr_status,COUNT(DISTINCT cc_id) AS comments,cr_path,cr_message,cr_author,cr_timestamp FROM code_tags,code_rev LEFT JOIN code_comment ON ((cc_repo_id = cr_repo_id AND cc_rev_id = cr_id)) WHERE cr_repo_id = '3' AND (cr_repo_id=ct_repo_id) AND (cr_id=ct_rev_id) AND ct_tag = 'api' GROUP BY cr_id ORDER BY cr_id DESC LIMIT 51;

Filesorts according to VVV

<vvv> Reedy: the generic problem is: there is code_something with cs_repo_id, cs_rev_id and cs_value; the only index is primary key on all three columns in the order listed. There is code_rev which is joined with code_something by cr_repo_id and cr_id.
<vvv> The query on code_something filesorts

It's the order by/group by not being on the first table...

mysql> describe SELECT cr_id,cr_repo_id,cr_status,COUNT(DISTINCT cc_id) AS comments,cr_path,cr_message,cr_author,cr_timestamp FROM mw_code_tags,mw_code_rev LEFT JOIN mw_code_comment ON ((cc_repo_id = cr_repo_id AND cc_rev_id = cr_id)) WHERE cr_repo_id = '3' AND (ct_repo_id = '3') AND (cr_id=ct_rev_id) AND ct_tag = 'api' LIMIT 51;
+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+--------------------------+

idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra

+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+--------------------------+

1SIMPLEmw_code_tagsrefPRIMARY,ct_repo_idPRIMARY4const1Using where; Using index
1SIMPLEmw_code_reveq_refPRIMARY,cr_repo_id,cr_repo_authorPRIMARY8const,wikidb.mw_code_tags.ct_rev_id1
1SIMPLEmw_code_commentrefcc_repo_id,cc_repo_timecc_repo_id8wikidb.mw_code_rev.cr_repo_id,wikidb.mw_code_rev.cr_id1Using index

+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+--------------------------+
3 rows in set (0.00 sec)

mysql> describe SELECT cr_id,cr_repo_id,cr_status,COUNT(DISTINCT cc_id) AS comments,cr_path,cr_message,cr_author,cr_timestamp FROM mw_code_tags,mw_code_rev LEFT JOIN mw_code_comment ON ((cc_repo_id = cr_repo_id AND cc_rev_id = cr_id)) WHERE cr_repo_id = '3' AND (ct_repo_id = '3') AND (cr_id=ct_rev_id) AND ct_tag = 'api' GROUP BY cr_id ORDER BY cr_id DESC LIMIT 51;
+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+-----------------------------------------------------------+

idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra

+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+-----------------------------------------------------------+

1SIMPLEmw_code_tagsrefPRIMARY,ct_repo_idPRIMARY4const1Using where; Using index; Using temporary; Using filesort
1SIMPLEmw_code_reveq_refPRIMARY,cr_repo_id,cr_repo_authorPRIMARY8const,wikidb.mw_code_tags.ct_rev_id1
1SIMPLEmw_code_commentrefcc_repo_id,cc_repo_timecc_repo_id8wikidb.mw_code_rev.cr_repo_id,wikidb.mw_code_rev.cr_id1Using index

+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+-----------------------------------------------------------+
3 rows in set (0.00 sec)

mysql> describe SELECT cr_id,cr_repo_id,cr_status,COUNT(DISTINCT cc_id) AS comments,cr_path,cr_message,cr_author,cr_timestamp FROM mw_code_tags,mw_code_rev LEFT JOIN mw_code_comment ON ((cc_repo_id = cr_repo_id AND cc_rev_id = cr_id)) WHERE cr_repo_id = '3' AND (ct_repo_id = '3') AND (cr_id=ct_rev_id) AND ct_tag = 'api' LIMIT 51;
+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+--------------------------+

idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra

+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+--------------------------+

1SIMPLEmw_code_tagsrefPRIMARY,ct_repo_idPRIMARY4const1Using where; Using index
1SIMPLEmw_code_reveq_refPRIMARY,cr_repo_id,cr_repo_authorPRIMARY8const,wikidb.mw_code_tags.ct_rev_id1
1SIMPLEmw_code_commentrefcc_repo_id,cc_repo_timecc_repo_id8wikidb.mw_code_rev.cr_repo_id,wikidb.mw_code_rev.cr_id1Using index

+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+--------------------------+
3 rows in set (0.00 sec)

mysql> describe SELECT cr_id,cr_repo_id,cr_status,COUNT(DISTINCT cc_id) AS comments,cr_path,cr_message,cr_author,cr_timestamp FROM mw_code_tags,mw_code_rev LEFT JOIN mw_code_comment ON ((cc_repo_id = cr_repo_id AND cc_rev_id = cr_id)) WHERE cr_repo_id = '3' AND (ct_repo_id = '3') AND (cr_id=ct_rev_id) AND ct_tag = 'api' GROUP BY cr_id ORDER BY cr_id DESC LIMIT 51;
+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+-----------------------------------------------------------+

idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra

+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+-----------------------------------------------------------+

1SIMPLEmw_code_tagsrefPRIMARY,ct_repo_idPRIMARY4const1Using where; Using index; Using temporary; Using filesort
1SIMPLEmw_code_reveq_refPRIMARY,cr_repo_id,cr_repo_authorPRIMARY8const,wikidb.mw_code_tags.ct_rev_id1
1SIMPLEmw_code_commentrefcc_repo_id,cc_repo_timecc_repo_id8wikidb.mw_code_rev.cr_repo_id,wikidb.mw_code_rev.cr_id1Using index

+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+-----------------------------------------------------------+
3 rows in set (0.00 sec)

mysql> describe SELECT cr_id,cr_repo_id,cr_status,COUNT(DISTINCT cc_id) AS comments,cr_path,cr_message,cr_author,cr_timestamp FROM mw_code_tags,mw_code_rev LEFT JOIN mw_code_comment ON ((cc_repo_id = cr_repo_id AND cc_rev_id = cr_id)) WHERE cr_repo_id = '3' AND (ct_repo_id = '3') AND (cr_id=ct_rev_id) AND ct_tag = 'api' ORDER BY cr_id DESC LIMIT 51;
+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+-------------------------------------------+

idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra

+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+-------------------------------------------+

1SIMPLEmw_code_tagsrefPRIMARY,ct_repo_idPRIMARY4const1Using where; Using index; Using temporary
1SIMPLEmw_code_reveq_refPRIMARY,cr_repo_id,cr_repo_authorPRIMARY8const,wikidb.mw_code_tags.ct_rev_id1
1SIMPLEmw_code_commentrefcc_repo_id,cc_repo_timecc_repo_id8wikidb.mw_code_rev.cr_repo_id,wikidb.mw_code_rev.cr_id1Using index

+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+-------------------------------------------+

Order BY only uses temporary
Group By (With/without order by) uses temporary and file sorts

I suppose, also, sorting/group by cr_id isn't much good when it's unindexed directly/as the first column

PK is cr_repo_id, cr_id....

Though, adding the index doesn't help.

The discussion that Roan and I had at my house, seemed to suggest that it might be a case of there's nothing we can do about some of these queries.

And as such, the hard coded limits in place, might just be removed ("fixing" the other bugs), and just have to deal with things maybe being a bit slow.

If this then annoys domas, maybe he can help improve the queries? ;)

ayg wrote:

(In reply to comment #7)

mysql> describe SELECT cr_id,cr_repo_id,cr_status,COUNT(DISTINCT cc_id) AS
comments,cr_path,cr_message,cr_author,cr_timestamp FROM
mw_code_tags,mw_code_rev LEFT JOIN mw_code_comment ON ((cc_repo_id =
cr_repo_id AND cc_rev_id = cr_id)) WHERE cr_repo_id = '3' AND (ct_repo_id =
'3') AND (cr_id=ct_rev_id) AND ct_tag = 'api' GROUP BY cr_id ORDER BY cr_id
DESC LIMIT 51;
+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+-----------------------------------------------------------+

idselect_typetabletypepossible_keys
keykey_lenref
rowsExtra

+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+-----------------------------------------------------------+

1SIMPLEmw_code_tagsrefPRIMARY,ct_repo_id
PRIMARY4const
1Using where; Using index; Using temporary; Using filesort
1SIMPLEmw_code_reveq_ref

PRIMARY,cr_repo_id,cr_repo_author | PRIMARY | 8 |
const,wikidb.mw_code_tags.ct_rev_id | 1 |

|
1SIMPLEmw_code_commentrefcc_repo_id,cc_repo_time
cc_repo_id8

wikidb.mw_code_rev.cr_repo_id,wikidb.mw_code_rev.cr_id | 1 | Using index

|

+----+-------------+-----------------+--------+-----------------------------------+------------+---------+--------------------------------------------------------+------+-----------------------------------------------------------+
3 rows in set (0.00 sec)

This is a classic pitfall for the MySQL optimizer. What version are you using? On the toolserver, 5.1.53, it avoids the filesort.

The problem is that you have a few separate conditions that you need indexes for, aside for join conditions and assuming MySQL is smart enough to do equality propagation (not always a good bet): ct_tag = 'api', ct_repo_id = '3', and the order by/group by. No one index will work for all of them, so MySQL has to choose which index it wants to use. Since there's a LIMIT, the filesort option is probably the worst, but it chooses that option anyway. Often in this case a FORCE INDEX will fix the problem. On the toolserver, it's smart enough to select from code_rev first and not do a filesort, and then do extra scanning on code_tags.

In this case, your best bet is probably to add an index on (ct_repo_id, ct_tag, ct_rev_id), if this query is a problem in practice after you stop it from filesorting. If you're not ever ordering or grouping by ct_tag, you can just change the primary key (but I'd guess you sometimes do group by it).

I'd also clean up the query a bit to avoid relying on equality propagation, changing "cr_repo_id = '3'" to "cr_repo_id = ct_repo_id" and changing "GROUP BY cr_id ORDER BY cr_id DESC" to "GROUP BY ct_rev_id ORDER BY ct_rev_id DESC". But the latter part is probably not necessary, just makes it clearer how you expect the query to execute.

ayg wrote:

(Of course, the query isn't scalable anyway due to COUNT(). It will have to scan arbitrarily many rows. The only way to fully fix it would be to use a summary table of some kind, but those are a pain to maintain, so better not to do that unless you can't fix it some other way. It's not like we're going to have 100 million code reviews anytime soon.)

(In reply to comment #10)

This is a classic pitfall for the MySQL optimizer. What version are you using?
On the toolserver, 5.1.53, it avoids the filesort.

reedy@ubuntu64-esxi:~$ mysql --version
mysql Ver 14.14 Distrib 5.1.49, for debian-linux-gnu (x86_64) using readline 6.1

What comes with Ubuntu 10.10 x64 Server

  • Freetext tagging for revisions

CREATE TABLE /*_*/code_tags (

ct_repo_id int not null,
ct_rev_id int not null,
ct_tag varbinary(255) not null,

primary key (ct_repo_id,ct_rev_id,ct_tag)

) /*$wgDBTableOptions*/;

CREATE INDEX /*i*/ct_repo_id ON /*_*/code_tags (ct_repo_id,ct_tag,ct_rev_id);

That suggested index is seemingly already there...

It seems like on general you're agreeing with the just leave them be... Versions that improve/fix the optimiser will improve queries, and like you said, we're not going to have millions upon millions of revisions...

For the equality propagation, the originals had foo = bar AND bar = 3 (or something similar), i was just poking around and changed for explicit values.

Seemingly, the best thing is maybe getting it to do FORCE INDEX.. But it isn't garunteed to help

Maybe just remove the technical restrictions people have put in, which are causing some annoyances, and close the bugs...?

ayg wrote:

Oops, yeah, I missed the index. Try adding FORCE INDEX (ct_repo_id) and see what that gives you.

mysql> describe SELECT cr_id,cr_repo_id,cr_status,COUNT(DISTINCT cc_id) AS comments,cr_path,cr_message,cr_author,cr_timestamp FROM mw_code_tags,mw_code_rev LEFT JOIN mw_code_comment ON ((cc_repo_id = cr_repo_id AND cc_rev_id = cr_id)) WHERE cr_repo_id = ct_repo_id AND (ct_repo_id = '3') AND (cr_id=ct_rev_id) AND ct_tag = 'api' GROUP BY cr_id ORDER BY cr_id DESC LIMIT 51;
+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+--------------------------------------------------------+------+-----------------------------------------------------------+

idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra

+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+--------------------------------------------------------+------+-----------------------------------------------------------+

1SIMPLEmw_code_tagsrefPRIMARY,ct_repo_idPRIMARY4const1Using where; Using index; Using temporary; Using filesort
1SIMPLEmw_code_reveq_refPRIMARY,cr_repo_id,cr_repo_author,cr_idPRIMARY8const,wikidb.mw_code_tags.ct_rev_id1Using where
1SIMPLEmw_code_commentrefcc_repo_id,cc_repo_timecc_repo_id8wikidb.mw_code_rev.cr_repo_id,wikidb.mw_code_rev.cr_id1Using index

+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+--------------------------------------------------------+------+-----------------------------------------------------------+
3 rows in set (0.00 sec)

mysql> describe SELECT cr_id,cr_repo_id,cr_status,COUNT(DISTINCT cc_id) AS comments,cr_path,cr_message,cr_author,cr_timestamp FROM mw_code_tags FORCE INDEX (ct_repo_id),mw_code_rev LEFT JOIN mw_code_comment ON ((cc_repo_id = cr_repo_id AND cc_rev_id = cr_id)) WHERE cr_repo_id = ct_repo_id AND (ct_repo_id = '3') AND (cr_id=ct_rev_id) AND ct_tag = 'api' GROUP BY cr_id ORDER BY cr_id DESC LIMIT 51;
+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+--------------------------------------------------------+------+-----------------------------------------------------------+

idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra

+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+--------------------------------------------------------+------+-----------------------------------------------------------+

1SIMPLEmw_code_tagsrefct_repo_idct_repo_id261const,const1Using where; Using index; Using temporary; Using filesort
1SIMPLEmw_code_reveq_refPRIMARY,cr_repo_id,cr_repo_author,cr_idPRIMARY8const,wikidb.mw_code_tags.ct_rev_id1Using where
1SIMPLEmw_code_commentrefcc_repo_id,cc_repo_timecc_repo_id8wikidb.mw_code_rev.cr_repo_id,wikidb.mw_code_rev.cr_id1Using index

+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+--------------------------------------------------------+------+-----------------------------------------------------------+
3 rows in set (0.00 sec)

No improvement unfortunately :(

ayg wrote:

Then it's probably lack of equality propagation. Change cr_id to ct_rev_id in the ORDER BY and GROUP BY clauses.

mysql> describe SELECT cr_id,cr_repo_id,cr_status,COUNT(DISTINCT cc_id) AS comments,cr_path,cr_message,cr_author,cr_timestamp FROM mw_code_tags FORCE INDEX (ct_repo_id),mw_code_rev LEFT JOIN mw_code_comment ON ((cc_repo_id = cr_repo_id AND cc_rev_id = cr_id)) WHERE cr_repo_id = ct_repo_id AND (ct_repo_id = '3') AND (cr_id=ct_rev_id) AND ct_tag = 'api' GROUP BY ct_rev_id ORDER BY ct_rev_id DESC LIMIT 51;
+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+--------------------------------------------------------+------+--------------------------+

idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra

+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+--------------------------------------------------------+------+--------------------------+

1SIMPLEmw_code_tagsrefct_repo_idct_repo_id261const,const1Using where; Using index
1SIMPLEmw_code_reveq_refPRIMARY,cr_repo_id,cr_repo_author,cr_idPRIMARY8const,wikidb.mw_code_tags.ct_rev_id1Using where
1SIMPLEmw_code_commentrefcc_repo_id,cc_repo_timecc_repo_id8wikidb.mw_code_rev.cr_repo_id,wikidb.mw_code_rev.cr_id1Using index

+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+--------------------------------------------------------+------+--------------------------+
3 rows in set (0.00 sec)

mysql> describe SELECT cr_id,cr_repo_id,cr_status,COUNT(DISTINCT cc_id) AS comments,cr_path,cr_message,cr_author,cr_timestamp FROM mw_code_tags, mw_code_rev LEFT JOIN mw_code_comment ON ((cc_repo_id = cr_repo_id AND cc_rev_id = cr_id)) WHERE cr_repo_id = ct_repo_id AND (ct_repo_id = '3') AND (cr_id=ct_rev_id) AND ct_tag = 'api' GROUP BY ct_rev_id ORDER BY ct_rev_id DESC LIMIT 51;
+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+--------------------------------------------------------+------+--------------------------+

idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra

+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+--------------------------------------------------------+------+--------------------------+

1SIMPLEmw_code_tagsrefPRIMARY,ct_repo_idPRIMARY4const1Using where; Using index
1SIMPLEmw_code_reveq_refPRIMARY,cr_repo_id,cr_repo_author,cr_idPRIMARY8const,wikidb.mw_code_tags.ct_rev_id1Using where
1SIMPLEmw_code_commentrefcc_repo_id,cc_repo_timecc_repo_id8wikidb.mw_code_rev.cr_repo_id,wikidb.mw_code_rev.cr_id1Using index

+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+--------------------------------------------------------+------+--------------------------+
3 rows in set (0.00 sec)

Removing the force index, and just changing the order by/group by improves it...

Coool

(In reply to comment #11)

(Of course, the query isn't scalable anyway due to COUNT(). It will have to
scan arbitrarily many rows. The only way to fully fix it would be to use a
summary table of some kind, but those are a pain to maintain, so better not to
do that unless you can't fix it some other way. It's not like we're going to
have 100 million code reviews anytime soon.)

Yes, the COUNT() should be killed. Its results aren't displayed in trunk anymore anyway. The point about the code_rev table being smallish is also true, that's why these queries aren't much of a problem in practice.

(In reply to comment #18)

Yes, the COUNT() should be killed. Its results aren't displayed in trunk
anymore anyway.

Ignore me, this is a different COUNT

So, 2 explicitally reported bugs now fixed.

Improved query for Tags in r79365 based on the above help...

Wonder what else needs fixing up...

Filtered by path and status
mysql> describe SELECT /* IndexPager::reallyDoQuery (SvnRevStatusTablePager) Reedy */ cp_rev_id,cr_id,cr_repo_id,cr_status,COUNT(DISTINCT cc_id) AS comments,cr_path,cr_message,cr_author,cr_timestamp FROM mw_code_paths INNER JOIN mw_code_rev ON ((cr_repo_id = cp_repo_id AND cr_id = cp_rev_id)) LEFT JOIN mw_code_comment ON ((cc_repo_id = cp_repo_id AND cc_rev_id = cp_rev_id)) WHERE cp_repo_id = '3' AND cp_path = '/trunk' AND cr_status = 'new' GROUP BY cp_rev_id ORDER BY cp_rev_id DESC LIMIT 51;
+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+----------------------------------------------------------+-------+----------------------------------------------+

idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra

+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+----------------------------------------------------------+-------+----------------------------------------------+

1SIMPLEmw_code_revrefPRIMARY,cr_repo_id,cr_repo_author,cr_idPRIMARY4const45393Using where; Using temporary; Using filesort
1SIMPLEmw_code_pathseq_refPRIMARYPRIMARY265const,wikidb.mw_code_rev.cr_id,const1Using where; Using index
1SIMPLEmw_code_commentrefcc_repo_id,cc_repo_timecc_repo_id8wikidb.mw_code_paths.cp_repo_id,wikidb.mw_code_rev.cr_id1Using index

+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+----------------------------------------------------------+-------+----------------------------------------------+
3 rows in set (0.00 sec)

Filtered by path

mysql> describe SELECT /* IndexPager::reallyDoQuery (SvnRevTablePager) Reedy */ cp_rev_id,cr_id,cr_repo_id,cr_status,COUNT(DISTINCT cc_id) AS comments,cr_path,cr_message,cr_author,cr_timestamp FROM mw_code_paths INNER JOIN mw_code_rev ON ((cr_repo_id = cp_repo_id AND cr_id = cp_rev_id)) LEFT JOIN mw_code_comment ON ((cc_repo_id = cp_repo_id AND cc_rev_id = cp_rev_id)) WHERE cp_repo_id = '3' AND cp_path = '/trunk' GROUP BY cp_rev_id ORDER BY cp_rev_id DESC LIMIT 51;
+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+----------------------------------------------------------+-------+---------------------------------+

idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra

+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+----------------------------------------------------------+-------+---------------------------------+

1SIMPLEmw_code_revrefPRIMARY,cr_repo_id,cr_repo_author,cr_idPRIMARY4const45393Using temporary; Using filesort
1SIMPLEmw_code_pathseq_refPRIMARYPRIMARY265const,wikidb.mw_code_rev.cr_id,const1Using where; Using index
1SIMPLEmw_code_commentrefcc_repo_id,cc_repo_timecc_repo_id8wikidb.mw_code_paths.cp_repo_id,wikidb.mw_code_rev.cr_id1Using index

+----+-------------+-----------------+--------+-----------------------------------------+------------+---------+----------------------------------------------------------+-------+---------------------------------+
3 rows in set (0.00 sec)

Both of the above can be made "nice" by sorting and grouping by cr_id rather than cp_rev_id

Above ones fixed in r79409

Filesort:

mysql> describe SELECT /* IndexPager::reallyDoQuery (CodeStatusChangeTablePager) Reedy */ cpc_timestamp,cpc_user_text,cpc_rev_id,cr_author,cr_message,cpc_removed,cpc_added,cr_status FROM mw_code_prop_changes LEFT JOIN mw_code_rev ON ((cpc_repo_id = cr_repo_id AND cpc_rev_id = cr_id)) WHERE cpc_repo_id = '3' AND cpc_attrib = 'status' ORDER BY cpc_timestamp DESC LIMIT 51;
+----+-------------+----------------------+--------+-----------------------------------------+-------------------+---------+--------------------------------------------------------------------------------+------+-----------------------------+

idselect_typetabletypepossible_keyskeykey_lenrefrowsExtra

+----+-------------+----------------------+--------+-----------------------------------------+-------------------+---------+--------------------------------------------------------------------------------+------+-----------------------------+

1SIMPLEmw_code_prop_changesrefcpc_repo_rev_time,cpc_repo_timecpc_repo_rev_time4const10Using where; Using filesort
1SIMPLEmw_code_reveq_refPRIMARY,cr_repo_id,cr_repo_author,cr_idPRIMARY8wikidb.mw_code_prop_changes.cpc_repo_id,wikidb.mw_code_prop_changes.cpc_rev_id1

+----+-------------+----------------------+--------+-----------------------------------------+-------------------+---------+--------------------------------------------------------------------------------+------+-----------------------------+
2 rows in set (0.00 sec)

I'm guessing this one might need another index...

It's doing

cpc_repo_id, cpc_rev_id in the LEFT JOIN

And then

cpc_repo_id cpc_attrib cpc_timestamp

Also, this table doesn't have a Primary Key..

  • Changes to review metadata for a single code revision. --

CREATE TABLE /*_*/code_prop_changes (

  • Repository ID cpc_repo_id int not null,
  • Native ID number of this revision within the repository. cpc_rev_id int not null,
  • The item that was changed cpc_attrib enum('status','tags') not null,
  • How it was changed cpc_removed blob, cpc_added blob,
  • Timestamp of the change, in MediaWiki format. cpc_timestamp binary(14) not null default '',
  • User id/name of the commenter cpc_user int not null, cpc_user_text varchar(255) not null

) /*$wgDBTableOptions*/;

CREATE INDEX /*i*/cpc_repo_rev_time ON /*_*/code_prop_changes (cpc_repo_id, cpc_rev_id, cpc_timestamp);
CREATE INDEX /*i*/cpc_repo_time ON /*_*/code_prop_changes (cpc_repo_id, cpc_timestamp);

code_signoffs also doesn't have a PK... But does have a unique index...

(In reply to comment #24)

Also, this table doesn't have a Primary Key..

I don't see a unique index on this table that makes sense, TBH. The blob fields can't be indexed properly (you can only index the first N chars using cpc_added(123)) and the same change can occur multiple times.

code_signoffs also doesn't have a PK... But does have a unique index...

Good enough, AFAIK.

ayg wrote:

InnoDB tables should always have an explicit primary key. Otherwise (if there are no unique indexes) it will make a secret invisible primary key anyway, which you can't observe or control. Best to add an autoincrement primary key if there are no possible unique keys as-is.

Need to poke at this and see the state of it...

I think this is probably somewhere near done...

Marking as a fixed.. I think they're cleaned up now..