注册 登录  
 加关注
   显示下一条  |  关闭
温馨提示!由于新浪微博认证机制调整,您的新浪微博帐号绑定已过期,请重新绑定!立即重新绑定新浪微博》  |  关闭

PostgreSQL 中文网

 
 
 

日志

 
 

Killed a query failed When using pg_termniate_backend  

2011-04-26 18:56:13|  分类: PG案例分析 |  标签: |举报 |字号 订阅

  下载LOFTER 我的照片书  |

            
      
       今天生产库上有个会话(Select 操作) 引起了我的注意,都运行两天了,还没结束,
这个SQL关联查询两张表的数据,具体信息如下

--1 数据库,OS信息
PostgreSQL版本: 8.4.3
OS: Red Hat Enterprise Linux Server release 4.5

--2 查询数据库活动的会话
db_skytf=#  select procpid, usename,query_start,client_addr, client_port ,current_query ,waiting from pg_stat_activity where current_query !='<IDLE>' ;
 procpid | usename  |          query_start          |  client_addr  | client_port |                                                
               current_query                                                                | waiting
---------+----------+-------------------------------+---------------+-------------+-------------------------------------------------
--------------------------------------------------------------------------------------------+---------
   15583 | postgres | 2011-04-26 15:35:48.909562+08 |               |          -1 | select procpid, usename,query_start,client_addr,
 client_port ,current_query ,waiting from pg_stat_activity where current_query !='<IDLE>' ; | f
    1312 | skytf  | 2011-04-24 13:59:49.318178+08 | 192.168.x.xxx |       34621 | SELECT                                         
                                                                                            | f
                                                                                  : mrp.*,res.name as displayname                                                                                                               
                                                                                  : FROM                                                                                                                      
                                                                                  : skytf.tbl_app_mrp as mrp,tbl_app_resource as res                                                                                           
                                                                                  : where mrp.appid=res.appid and                                                                                             
                                                                                  : mrp.status='1' and res.status='1'                                                                                    
                                                                                  : order by id desc                               
                                                                                             
(2 rows)

        备注:反复执行上述查询几次,正是会话 1312 一直在执行, 根据 query_start,可以看出这个SQL
   已经跑了两天了。下面看下相关表信息,并看下这个SQL的执行计划。


--3 查看表大小
db_skytf=> \dt+ tbl_app_mrp
                       List of relations
 Schema  |    Name     | Type  |  Owner  |  Size  | Description
---------+-------------+-------+---------+--------+-------------
 skytf | tbl_app_mrp | table | skytf | 520 MB |
(1 row)

db_skytf=> \dt+ tbl_app_resource
                         List of relations
 Schema  |       Name       | Type  |  Owner  | Size  | Description
---------+------------------+-------+---------+-------+-------------
 skytf | tbl_app_resource | table | skytf | 24 kB |
(1 row)

     从上面看出表不大。

--4 查看执行计划
db_skytf=> explain  SELECT                                                                                 
db_skytf->  mrp.*,res.name as displayname                                                          
db_skytf-> FROM     skytf.tbl_app_mrp as mrp,tbl_app_resource as res                                     
db_skytf->  where mrp.appid=res.appid and                                                          
db_skytf->   mrp.status='1' and res.status='1'  ;
                                    QUERY PLAN                                    
-----------------------------------------------------------------------------------
 Hash Join  (cost=7.56..145797.09 rows=2839643 width=141)
   Hash Cond: (mrp.appid = res.appid)
   ->  Seq Scan on tbl_app_mrp mrp  (cost=0.00..106744.44 rows=2839643 width=127)
         Filter: (status = '1'::bpchar)
   ->  Hash  (cost=5.29..5.29 rows=182 width=18)
         ->  Seq Scan on tbl_app_resource res  (cost=0.00..5.29 rows=182 width=18)
               Filter: (status = '1'::bpchar)
(7 rows)

db_skytf=> explain analyze  SELECT                                                                                 
db_skytf->  mrp.*,res.name as displayname                                                          
db_skytf-> FROM     skytf.tbl_app_mrp as mrp,tbl_app_resource as res                                     
db_skytf->  where mrp.appid=res.appid and                                                          
db_skytf->   mrp.status='1' and res.status='1'  ;
                                                             QUERY PLAN                                                            
 
------------------------------------------------------------------------------------------------------------------------------------
-
 Hash Join  (cost=7.56..145797.09 rows=2839643 width=141) (actual time=0.270..3211.741 rows=2799308 loops=1)
   Hash Cond: (mrp.appid = res.appid)
   ->  Seq Scan on tbl_app_mrp mrp  (cost=0.00..106744.44 rows=2839643 width=127) (actual time=0.043..1896.213 rows=2799308 loops=1)
         Filter: (status = '1'::bpchar)
   ->  Hash  (cost=5.29..5.29 rows=182 width=18) (actual time=0.187..0.187 rows=184 loops=1)
         ->  Seq Scan on tbl_app_resource res  (cost=0.00..5.29 rows=182 width=18) (actual time=0.017..0.112 rows=184 loops=1)
               Filter: (status = '1'::bpchar)
 Total runtime: 3455.818 ms
(8 rows)
 
       备注:从PLAN执行情况来看,执行时间还快的,3 秒钟左右就跑完了,但为什么 1312 的查询
都跑了两天了,还没跑完,猜测进程可能僵死,于是决定将它Kill。


--5 删除会话
db_skytf=# select pg_cancel_backend(1312);
 pg_cancel_backend
-------------------
 t
(1 row)

   起初,使用 pg_cancel_backend, 返回结果为 true, 但发现 1312 会话依然还在,
下面接着使用 pg_terminate_backend 看下效果。


db_skytf=# select pg_terminate_backend(1312);
 pg_terminate_backend
----------------------
 t
(1 row)


--6 再次查询数据库活动的会话
db_skytf=#  select procpid, usename,query_start,client_addr, client_port ,current_query ,waiting from pg_stat_activity where current_query !='<IDLE>' ;
 procpid | usename  |          query_start          |  client_addr  | client_port |                                                
               current_query                                                                | waiting
---------+----------+-------------------------------+---------------+-------------+-------------------------------------------------
--------------------------------------------------------------------------------------------+---------
   15583 | postgres | 2011-04-26 15:35:48.909562+08 |               |          -1 | select procpid, usename,query_start,client_addr,
 client_port ,current_query ,waiting from pg_stat_activity where current_query !='<IDLE>' ; | f
    1312 | skytf  | 2011-04-24 13:59:49.318178+08 | 192.168.x.xxx |       34621 | SELECT                                         
                                                                                            | f
                                                                                  : mrp.*,res.name as displayname                                                                                                               
                                                                                  : FROM                                                                                                                      
                                                                                  : skytf.tbl_app_mrp as mrp,tbl_app_resource as res                                                                                           
                                                                                  : where mrp.appid=res.appid and                                                                                             
                                                                                  : mrp.status='1' and res.status='1'                                                                                    
                                                                                  : order by id desc     
   
         使用 pg_terminate_backend 后,虽然结果也返回 t,但会话 1312 依然没被Kill,
好顽强的会话啊,要是这个库是 Oracle, 俺早使用 kill -9 结果它了,但这是 PostgreSQL
使用 kill -9 后果很严重。。。


--7 从OS层面分析
postgres@skytf-db-> ps -ef | grep 1312;
postgres  1312  3949  0 Jan01 ?        00:41:55 postgres: skytf db_skytf 192.168.x.xxx(34621) SELECT
postgres 15354 15105  0 15:16 pts/4    00:00:00 grep 1312

    备注:怀疑 1312 进程可能已经僵死,于是让应用人员检查。

--8 应用主机(192.168.x.xxx)查看进程
netstat -anp|grep 34621
tcp   3216507      0 ::ffff:192.168.x.xxx:34621  ::ffff:192.168.1.102:1921   ESTABLISHED 4187/java  

    备注:应用人员发现些进程无异常,后来建议将这进程杀掉,在通过应用人员
将这个进程杀掉后, 1312 会话终于被干掉了。


--9 总结
    有些SQL在通过 pg_cancel_backend 和 pg_terminate_backen 也未能 Kill 的情况下,建议联系应用人员 kill
对应的应用进程。

  评论这张
 
阅读(26056)| 评论(0)
推荐 转载

历史上的今天

在LOFTER的更多文章

评论

<#--最新日志,群博日志--> <#--推荐日志--> <#--引用记录--> <#--博主推荐--> <#--随机阅读--> <#--首页推荐--> <#--历史上的今天--> <#--被推荐日志--> <#--上一篇,下一篇--> <#-- 热度 --> <#-- 网易新闻广告 --> <#--右边模块结构--> <#--评论模块结构--> <#--引用模块结构--> <#--博主发起的投票-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 

页脚

网易公司版权所有 ©1997-2016