Page MenuHomePhabricator

Additional pagelinks_title index improves performance on PostgreSQL (WhatLinksHerePage::showIndirectLinks)
Closed, ResolvedPublic

Description

Author: Bernhard.Fastenrath

Description:
"create index pagelinks_title on pagelinks (pl_title);" reduces the query time on PostgreSQL significantly.

Query: EXPLAIN ANALYZE SELECT /* WhatLinksHerePage::showIndirectLinks 192.168.1.7 */ page_id,page_namespace,page_title,page_is_redirect FROM pagelinks,page WHERE (page_id=pl_from) AND pl_namespace = '0' AND pl_title = 'Wikipedia' ORDER BY pl_from LIMIT 500;

Query plan with pagelinks_title index:

Limit (cost=5251.93..5252.96 rows=410 width=30) (actual time=8.630..8.933 rows=500 loops=1)

->  Sort  (cost=5251.93..5252.96 rows=410 width=30) (actual time=8.629..8.736 rows=500 loops=1)
      Sort Key: pagelinks.pl_from
      Sort Method:  top-N heapsort  Memory: 53kB
      ->  Nested Loop  (cost=0.00..5234.14 rows=410 width=30) (actual time=0.052..7.619 rows=1346 loops=1)
            ->  Index Scan using pagelinks_title on pagelinks  (cost=0.00..1817.52 rows=410 width=4) (actual time=0.044..1.488 rows=13

46 loops=1)

      Index Cond: (pl_title = 'Wikipedia'::text)
      Filter: (pl_namespace = 0::smallint)
->  Index Scan using page_pkey on page  (cost=0.00..8.32 rows=1 width=26) (actual time=0.003..0.003 rows=1 loops=1346)
      Index Cond: (page.page_id = pagelinks.pl_from)

Total runtime: 9.096 ms

Query plan without pagelinks_title index:

Limit (cost=1097356.67..1097357.70 rows=410 width=30) (actual time=71542.699..71542.996 rows=500 loops=1)

->  Sort  (cost=1097356.67..1097357.70 rows=410 width=30) (actual time=71542.698..71542.810 rows=500 loops=1)
      Sort Key: pagelinks.pl_from
      Sort Method:  top-N heapsort  Memory: 53kB
      ->  Nested Loop  (cost=0.00..1097338.88 rows=410 width=30) (actual time=5.803..71539.392 rows=1346 loops=1)
            ->  Seq Scan on pagelinks  (cost=0.00..1093922.26 rows=410 width=4) (actual time=5.787..71522.455 rows=1346 loops=1)
                  Filter: ((pl_namespace = 0::smallint) AND (pl_title = 'Wikipedia'::text))
            ->  Index Scan using page_pkey on page  (cost=0.00..8.32 rows=1 width=26) (actual time=0.008..0.009 rows=1 loops=1346)
                  Index Cond: (page.page_id = pagelinks.pl_from)

Total runtime: 71543.179 ms


Version: 1.16.x
Severity: enhancement

Details

Reference
bz25111

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 21 2014, 11:15 PM
bzimport set Reference to bz25111.
bzimport added a subscriber: Unknown Object (MLST).

Should be done everywhere if results are the same

(In reply to comment #1)

Should be done everywhere if results are the same

If by 'everywhere' you mean other DBMSes, no. At least on MySQL this won't improve anything.

It's possible we could rearrange the existing unique index to put pl_title first, but indexes are cheap, and other things may rely on pl_from being first, so I'm just going to add this new index in. Thanks for the report.