注册 登录  
 加关注
   显示下一条  |  关闭
温馨提示!由于新浪微博认证机制调整,您的新浪微博帐号绑定已过期,请重新绑定!立即重新绑定新浪微博》  |  关闭

PostgreSQL 中文网

 
 
 

日志

 
 

PostgreSQL 优化一例: 使用函数索引和联合索引  

2012-06-21 21:05:55|  分类: PG性能优化 |  标签: |举报 |字号 订阅

  下载LOFTER 我的照片书  |


           今天有个生产库上的语句比较慢,花费近 500 毫秒左右,为了便于测试,后来将这几个表
导到测试环境下测试,并且优化后,时间下降到仅需要 0.104 ms, 下面是优化过程。

 

--1 表结构

 francs=> \d test_count
        Table "francs.test_count"
   Column    |       Type        |   Modifiers  
-------------+-------------------+---------------
 answerid    | character varying |
 skyid       | bigint            |
 questid     | character varying |
 username    | character varying |
 id          | integer           | not null
 create_time | character varying | default now()
 flag        | integer           | default 0
Indexes:
    "v_a_c_pk_id" PRIMARY KEY, btree (id)
    "idx_test_count_ctime" btree (create_time)

francs=> \d test
                Table "francs.test"
   Column    |            Type             |   Modifiers  
-------------+-----------------------------+---------------
 answer      | character varying           |
 create_time | timestamp without time zone | default now()
 questid     | character varying(20)       |
 id          | integer                     | not null
 answercount | bigint                      | default 0
Indexes:
    "pk_ans_id" PRIMARY KEY, btree (id)
    "idx_test_ctime" btree (create_time)

   


--2 表大小和数据

 francs=> \dt+ test_count
                         List of relations
 Schema |       Name        | Type  | Owner  | Size  | Description
--------+-------------------+-------+--------+-------+-------------
 francs | test_count | table | francs | 56 MB |
(1 row)

francs=> \dt+ test
                       List of relations
 Schema |    Name     | Type  | Owner  |  Size   | Description
--------+-------------+-------+--------+---------+-------------
 francs | test | table | francs | 8920 kB |
(1 row)

francs=> select count(*) from test_count;
 count 
--------
 500455
(1 row)

francs=> select count(*) from test;
 count 
--------
 101118
(1 row)

   

  备注:两个表都不大。


--3 生产上 SQL 语句和执行计划

 francs=> explain analyze SELECT vac.skyid, vac.username, va.answer
  FROM test_count vac, test va
 WHERE vac.questid = '20110224123544'
   and vac.answerid = va.id ::character varying
   and vac.flag = 0
 ORDER BY vac.create_time DESC LIMIT 3;
                                                                                   QUERY PLAN                                      
                                            
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..494.32 rows=3 width=69) (actual time=81.274..420.706 rows=3 loops=1)
   ->  Nested Loop  (cost=0.00..85352.00 rows=518 width=69) (actual time=81.273..420.702 rows=3 loops=1)
         Join Filter: ((vac.answerid)::text = ((va.id)::character varying)::text)
         ->  Index Scan Backward using idx_test_count_ctime on test_count vac  (cost=0.00..20208.38 rows=24 width=56)
(actual time=80.522..263.236 rows=3 loops=1)
               Filter: (((questid)::text = '20110224123544'::text) AND (flag = 0))
         ->  Materialize  (cost=0.00..3220.77 rows=101118 width=22) (actual time=0.026..31.220 rows=67673 loops=3)
               ->  Seq Scan on test va  (cost=0.00..2122.18 rows=101118 width=22) (actual time=0.009..28.288 rows=101118 loops=1)
 Total runtime: 537.650 ms
   


   备注:这是原始的SQL,花费了537.650 ms,生产上的表导到测试库后,执行时间也这么多; 多表关联一般在
              关联字段上创建索引,而表  vac.answerid 字段上没有索引,计划在这个字段上创建索引。
        
        
--4 查看 answerid 字段的选择性

 francs=>  select count(distinct answerid),count(*)  from test_count;
 count | count 
-------+--------
 32847 | 500455
(1 row)
        
francs=> create index idx_test_count_answeid on test_count using btree ( answerid );
CREATE INDEX

francs=> analyze test_count;
ANALYZE

   

   备注:answerid 字段 选择性不错,可以创建索引。

 

--5 查看PLAN

 francs=> explain analyze SELECT vac.skyid, vac.username, va.answer
  FROM test_count vac, test va
 WHERE vac.questid = '20110224123544'
   and vac.answerid = va.id ::character varying
   and vac.flag = 0
 ORDER BY vac.create_time DESC LIMIT 3;
                                                                                   QUERY PLAN                                      
                                            
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..489.59 rows=3 width=69) (actual time=81.897..422.056 rows=3 loops=1)
   ->  Nested Loop  (cost=0.00..85352.00 rows=523 width=69) (actual time=81.895..422.052 rows=3 loops=1)
         Join Filter: ((vac.answerid)::text = ((va.id)::character varying)::text)
         ->  Index Scan Backward using idx_test_count_ctime on test_count vac  (cost=0.00..20208.38 rows=24 width=56) (actual time=81.131..264.123 rows=3 loops=1)
               Filter: (((questid)::text = '20110224123544'::text) AND (flag = 0))
         ->  Materialize  (cost=0.00..3220.77 rows=101118 width=22) (actual time=0.027..31.352 rows=67673 loops=3)
               ->  Seq Scan on test va  (cost=0.00..2122.18 rows=101118 width=22) (actual time=0.009..28.355 rows=101118 loops=1)
 Total runtime: 529.655 ms
   


  备注:还是走了表 test_count 上的时间索引 idx_test_count_ctime,并且 test 走了全表扫描由于表关联字段
           类型不一致,va.id 为 integer ,vac.answerid 为 character varying,这时是用不到表  test 的主键
          的,这里需要在表 test 上创建函数索引。

--6 创建函数索引        

 francs=> create index idx_test_id on test  using btree (( id::character varying));
CREATE INDEX

francs=> \d test
                Table "francs.test"
   Column    |            Type             |   Modifiers  
-------------+-----------------------------+---------------
 answer      | character varying           |
 create_time | timestamp without time zone | default now()
 questid     | character varying(20)       |
 id          | integer                     | not null
 answercount | bigint                      | default 0
Indexes:
    "pk_ans_id" PRIMARY KEY, btree (id)
    "idx_test_ctime" btree (create_time)
    "idx_test_id" btree ((id::character varying))

   


--7 创建函数索引后再次查看PALN

 francs=> explain analyze SELECT vac.skyid, vac.username, va.answer
  FROM test_count vac, test va
 WHERE vac.questid = '20110224123544'
   and vac.answerid = va.id ::character varying
   and vac.flag = 0
 ORDER BY vac.create_time DESC LIMIT 3;
                                                                                   QUERY PLAN                                      
                                            
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..125.03 rows=3 width=69) (actual time=87.517..269.151 rows=3 loops=1)
   ->  Nested Loop  (cost=0.00..21795.61 rows=523 width=69) (actual time=87.514..269.146 rows=3 loops=1)
         ->  Index Scan Backward using idx_test_count_ctime on test_count vac  (cost=0.00..20208.38 rows=24 width=56) (actual time=87.478..269.049 rows=3 loops=1)
               Filter: (((questid)::text = '20110224123544'::text) AND (flag = 0))
         ->  Index Scan using idx_test_id on test va  (cost=0.00..57.28 rows=506 width=22) (actual time=0.019..0.019 rows=1 loops=3)
               Index Cond: (((id)::character varying)::text = (vac.answerid)::text)
 Total runtime: 269.228 ms
(7 rows)
   

 
   备注:这时表 test 已经走索引了,此时语句已优化到了只需要 200多 毫秒,看看是否还有优化空间。
              根据上面 PLAN ,时间主要花在 idx_test_count_ctime 扫描上,花费了182 ( 269 -87 ) 毫秒 。
 

--8 尝试删除索引  idx_test_count_ctime

 francs=> drop index idx_test_count_ctime;
DROP INDEX

francs=> create index idx_test_count_questid_ctime on test_count using btree (questid,create_time desc);
CREATE INDEX

francs=> analyze test_count;
ANALYZE

francs=> \d test_count
        Table "francs.test_count"
   Column    |       Type        |   Modifiers  
-------------+-------------------+---------------
 answerid    | character varying |
 skyid       | bigint            |
 questid     | character varying |
 username    | character varying |
 id          | integer           | not null
 create_time | character varying | default now()
 flag        | integer           | default 0
Indexes:
    "v_a_c_pk_id" PRIMARY KEY, btree (id)
    "idx_test_count_answeid" btree (answerid)
    "idx_test_count_questid_ctime" btree (questid, create_time DESC)

   
    
    备注:尝试删除 create_time 索引,并在字段  (questid,create_time ) 上创建联合索引。

 

--9 再次查看 PALN

 francs=> explain analyze SELECT vac.skyid, vac.username, va.answer
  FROM test_count vac, test va
 WHERE vac.questid = '20110224123544'
   and vac.answerid = va.id ::character varying
   and vac.flag = 0
 ORDER BY vac.create_time DESC LIMIT 3;
                                                                                QUERY PLAN                                         
                                     
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..9.36 rows=3 width=69) (actual time=0.038..0.057 rows=3 loops=1)
   ->  Nested Loop  (cost=0.00..1609.55 rows=516 width=69) (actual time=0.037..0.056 rows=3 loops=1)
         ->  Index Scan using idx_test_count_questid_ctime on test_count vac  (cost=0.00..22.32 rows=24 width=56) (actual time=0.028..0.033 rows=3 loops=1)
               Index Cond: ((questid)::text = '20110224123544'::text)
               Filter: (flag = 0)
         ->  Index Scan using idx_test_id on test va  (cost=0.00..57.28 rows=506 width=22) (actual time=0.006..0.006 rows=1 loops=3)
               Index Cond: (((id)::character varying)::text = (vac.answerid)::text)
 Total runtime: 0.104 ms
(8 rows)
   

  备注,现在仅花费了 0.104 ms,走了新建的索引  idx_test_count_questid_ctime,比最初的 500
            多毫秒,优化了近 5000 倍!
       
--10 总结

      1 上面仅是在测试库上做的测试,优化后的 SQL 从原来的 537.650 ms 优化到  0.104 ms ,优化了近 5000倍,

          但上生产前还需要和开发人员沟通,看看索引是否会影响到其它 SQL。

      2  这里由于关联字段类型不一致,所以创建了一个函数索引,一般在应用中,关联字段的类型是一致的。

      3  近期项目中低性能 SQL 语句较多( 500 毫秒以上),应该引起足够重视,争取每天都能做下优化。

      4  在 order by 语句中,排序字段建议创建索引,但具体建法要视情况而定( 单列索引或者组合索引)。

          

  评论这张
 
阅读(31490)| 评论(10)
推荐 转载

历史上的今天

在LOFTER的更多文章

评论

<#--最新日志,群博日志--> <#--推荐日志--> <#--引用记录--> <#--博主推荐--> <#--随机阅读--> <#--首页推荐--> <#--历史上的今天--> <#--被推荐日志--> <#--上一篇,下一篇--> <#-- 热度 --> <#-- 网易新闻广告 --> <#--右边模块结构--> <#--评论模块结构--> <#--引用模块结构--> <#--博主发起的投票-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 

页脚

网易公司版权所有 ©1997-2016