Postgresql 可以单独使用联合索引的第二字段?

mark35 发布于 2011/06/18 04:32
阅读 1K+
收藏 0

记得MYSQL的手册中说到对于多字段的联合索引是从左开始匹配,而不会对单独右侧字段进行索引扫描。比如建立 (col1,col2)的联合索引,那么当搜索条件包含col1或者同时包含col1,col2时会使用索引,而如果只包含有col2的条件则不会使用这个索引。

但是我在PG上测试发现它却会使用联合索引,但效率不如第一位置索引。

2900W的一个表,id是int4 PK,send_id、to_id 都是int4。

1、无索引查询,顺序全表漫长:

mytest=# explain analyze select count(id) from sms where send_id=1;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1038333.79..1038333.80 rows=1 width=4) (actual time=95677.149..95677.150 rows=1 loops=1)
   ->  Seq Scan on sms  (cost=0.00..1038315.15 rows=7456 width=4) (actual time=16546.411..95669.639 rows=9634 loops=1)
         Filter: (send_id = 1)
 Total runtime: 95677.249 ms

 

2、创建联合索引:

CREATE INDEX id_sendid_idx ON sms (id, send_id);

查询send_id字段,可以看出是使用了联合索引的:

mytest=# explain analyze select count(id) from sms where send_id=1;
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=542110.25..542110.26 rows=1 width=4) (actual time=9983.587..9983.589 rows=1 loops=1)
   ->  Index Scan using id_sendid_idx on sms  (cost=0.00..542103.85 rows=2558 width=4) (actual time=296.246..9975.754 rows=9634 loops=1)
         Index Cond: (send_id = 1)
 Total runtime: 9983.683 ms

 

3、创建send_id字段单独索引:

CREATE INDEX sendid_idx ON sms (send_id );
mytest=# explain analyze select count(id) from sms where send_id=1;
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=9829.78..9829.79 rows=1 width=4) (actual time=747.413..747.414 rows=1 loops=1)
   ->  Bitmap Heap Scan on sms  (cost=46.63..9823.38 rows=2558 width=4) (actual time=48.507..739.768 rows=9634 loops=1)
         Recheck Cond: (send_id = 1)
         ->  Bitmap Index Scan on sendid_idx  (cost=0.00..45.99 rows=2558 width=0) (actual time=13.127..13.127 rows=9634 loops=1)
               Index Cond: (send_id = 1)
 Total runtime: 747.515 ms

查询使用的是单字段索引并且速度快多了。

 

4、创建在第一位的联合索引:

CREATE INDEX sendid_id_idx ON sms(send_id, id);
mytest=# explain analyze select count(id) from sms where send_id=1;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=27589.48..27589.49 rows=1 width=4) (actual time=788.562..788.563 rows=1 loops=1)
   ->  Bitmap Heap Scan on sms  (cost=145.23..27570.83 rows=7456 width=4) (actual time=81.329..780.932 rows=9634 loops=1)
         Recheck Cond: (send_id = 1)
         ->  Bitmap Index Scan on sendid_id_idx  (cost=0.00..143.36 rows=7456 width=0) (actual time=41.488..41.488 rows=9634 loops=1)
               Index Cond: (send_id = 1)
 Total runtime: 788.721 ms

比走独立索引略慢一点。

 

5、创建一个多列索引,让其处于最后位置:

DROP INDEX id_sendid_idx;
DROP INDEX sendid_idx;
CREATE INDEX id_toid_sendid_idx ON sms (id,to_id,send_id);
                                                                 QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------
--
 Aggregate  (cost=614911.01..614911.02 rows=1 width=4) (actual time=15329.030..15329.032 rows=1 loops=1)
   ->  Index Scan using id_toid_sendid_idx on sms  (cost=0.00..614892.33 rows=7470 width=4) (actual time=1425.844..15321.157 rows=9634 loops=1
)
         Index Cond: (send_id = 1)
 Total runtime: 15329.118 ms

查询速度更慢了,不过还是使用了索引的。

 

6、再添加一个联合索引:

CREAT INDEX toid_sendid_idx ON sms (to_id, send_id);
                                                                 QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------
---
 Aggregate  (cost=567270.61..567270.62 rows=1 width=4) (actual time=12700.940..12700.945 rows=1 loops=1)
   ->  Index Scan using toid_sendid_idx on sms  (cost=0.00..567251.93 rows=7470 width=4) (actual time=1268.600..12685.580 rows=9634 loops=
1)
         Index Cond: (send_id = 1)
 Total runtime: 12701.053 ms

规划器在有两个索引(id_toid_sendid, toid_sendid_idx)可选的情况下选择了更优的后者,并且实际速度的确快一些。

没想到PG可以对不处于多列索引首位的字段也使用这个多列索引。当然对于PG来说多列索引的优势并不如在MYSQL那么大,因为PG规划器可以智能地把涉及到多列索引内字段的条件打开变成多个单独的查询(使用字段相应的独立索引)然后再合并结果。
而MYSQL在这种情况下程序只能选择使用一个单独字段索引,所以多列索引更有意义。

 

 

加载中
0
mark35
mark35
不知道sqlserver,oracle是否也如此?
返回顶部
顶部