注册 登录  
 加关注
   显示下一条  |  关闭
温馨提示!由于新浪微博认证机制调整,您的新浪微博帐号绑定已过期,请重新绑定!立即重新绑定新浪微博》  |  关闭

PostgreSQL 中文网

 
 
 

日志

 
 

Postgresql 恢复一例  

2010-09-18 10:49:07|  分类: PG案例分析 |  标签: |举报 |字号 订阅

  下载LOFTER 我的照片书  |

   
     今天在将pg_dump压缩过的dump文件,通过pg_restore 导入到测试库时,中途异常中断,造成测试库Server
donw 机,而且之后数据无法启动。

-- 数据库异常down机的 csvlog
2010-09-17 17:28:03.943 CST,"mydb","mydb",23936,"192.168.1.25:58855",4c9334a3.5d80,1,"/opt/pgsql/bin/postgres",2010-09-17 17:28:03 CST,,0,FATAL,57P03,"the database system is in recovery mode",,,,,,,,
2010-09-17 17:28:03.944 CST,"mydb","mydb",23937,"192.168.1.25:58856",4c9334a3.5d81,1,"/opt/pgsql/bin/postgres",2010-09-17 17:28:03 CST,,0,FATAL,57P03,"the database system is in recovery mode",,,,,,,,
2010-09-17 17:28:03.954 CST,"mydb","mydb",23938,"192.168.1.25:58857",4c9334a3.5d82,1,"/opt/pgsql/bin/postgres",2010-09-17 17:28:03 CST,,0,FATAL,57P03,"the database system is in recovery mode",,,,,,,,
2010-09-17 17:28:03.955 CST,"mydb","mydb",23939,"192.168.1.25:58858",4c9334a3.5d83,1,"/opt/pgsql/bin/postgres",2010-09-17 17:28:03 CST,,0,FATAL,57P03,"the database system is in recovery mode",,,,,,,,
2010-09-17 17:28:03.956 CST,"mydb","mydb",23940,"192.168.1.25:58859",4c9334a3.5d84,1,"/opt/pgsql/bin/postgres",2010-09-17 17:28:03 CST,,0,FATAL,57P03,"the database system is in recovery mode",,,,,,,,
2010-09-17 17:28:04.032 CST,,,32240,,4c932f7d.7df0,9,,2010-09-17 17:06:05 CST,,0,FATAL,53100,"could not write to file ""pg_xlog/xlogtemp.32240"": No space left on device",,,,,,,,
2010-09-17 17:28:04.098 CST,,,946,,4c9321d4.3b2,5,,2010-09-17 16:07:48 CST,,0,LOG,00000,"startup process (PID 32240) exited with exit code 1",,,,,,,,
2010-09-17 17:28:04.098 CST,,,946,,4c9321d4.3b2,6,,2010-09-17 16:07:48 CST,,0,LOG,00000,"aborting startup due to startup process failure",,,,,,,,
     从数据库down机前的 csvlog日志来看,down机前数据库SERVER 处于 recovery mode , 说明pg_restore时 SERVER此时有异常,
之后尝试重新起动Server,命令如下
[postgres@PG1 bin]$ pg_ctl -D $PGDATA start
server starting
   
    虽然显示'server starting', 但数据库并没有真正起来,因为这时数据库根本无法连接,接着查看 csvlog
   
--数据库启动异常时 csvlog
2010-09-17 17:36:36.660 CST,"mydb","mydb",24245,"192.168.169.42:42566",4c9336a4.5eb5,1,"/opt/pgsql/bin/postgres",2010-09-17 17:36:36 CST,,0,FATAL,57P03,"the database system is starting up",,,,,,,,
2010-09-17 17:36:36.662 CST,"mydb","mydb",24247,"192.168.169.42:42567",4c9336a4.5eb7,1,"/opt/pgsql/bin/postgres",2010-09-17 17:36:36 CST,,0,FATAL,57P03,"the database system is starting up",,,,,,,,
2010-09-17 17:36:36.663 CST,"mydb","mydb",24246,"192.168.1.25:33223",4c9336a4.5eb6,1,"/opt/pgsql/bin/postgres",2010-09-17 17:36:36 CST,,0,FATAL,57P03,"the database system is starting up",,,,,,,,
2010-09-17 17:36:36.663 CST,"mydb","mydb",24248,"192.168.169.42:42568",4c9336a4.5eb8,1,"/opt/pgsql/bin/postgres",2010-09-17 17:36:36 CST,,0,FATAL,57P03,"the database system is starting up",,,,,,,,
2010-09-17 17:36:36.665 CST,"mydb","mydb",24249,"192.168.169.42:42569",4c9336a4.5eb9,1,"/opt/pgsql/bin/postgres",2010-09-17 17:36:36 CST,,0,FATAL,57P03,"the database system is starting up",,,,,,,,
2010-09-17 17:36:36.666 CST,,,24163,,4c9336a3.5e63,1,,2010-09-17 17:36:35 CST,,0,LOG,00000,"startup process (PID 24165) exited with exit code 1",,,,,,,,
2010-09-17 17:36:36.666 CST,,,24163,,4c9336a3.5e63,2,,2010-09-17 17:36:35 CST,,0,LOG,00000,"aborting startup due to startup process failure",,,,,,,,

     从上面日志来看,可以看出Server 数据库正在启动(the database system is starting up"), 但是到后面就异常中上了;
而且也没有多余的信息,由于事情紧迫,随即向师傅请教,师傅说数据库需要恢复,可以恢复到具体的时间点。以下是解决过程。    
  
--解决过程
[postgres@PG1 pg_log]$ cd $PGHOME
[postgres@PG1 pgsql]$ ll
total 16K
drwxr-xr-x 2 postgres postgres 4.0K Sep  2 18:43 bin
drwxr-xr-x 6 postgres postgres 4.0K Aug 26 16:59 include
drwxr-xr-x 3 postgres postgres 4.0K Sep  3 18:55 lib
drwxr-xr-x 8 postgres postgres 4.0K Sep  1 17:27 share
[postgres@PG1 pgsql]$ cd bin
[postgres@PG1 bin]$ ll
total 8.1M
-rwxr-xr-x 1 postgres postgres  54K Aug 26 16:59 clusterdb
-rwxr-xr-x 1 postgres postgres  54K Aug 26 16:59 createdb
-rwxr-xr-x 1 postgres postgres  58K Aug 26 16:59 createlang
-rwxr-xr-x 1 postgres postgres  54K Aug 26 16:59 createuser
-rwxr-xr-x 1 postgres postgres  50K Aug 26 16:59 dropdb
-rwxr-xr-x 1 postgres postgres  58K Aug 26 16:59 droplang
-rwxr-xr-x 1 postgres postgres  50K Aug 26 16:59 dropuser
-rwxr-xr-x 1 postgres postgres 616K Aug 26 16:59 ecpg
-rwxr-xr-x 1 postgres postgres  75K Aug 26 16:59 initdb
-rwxr-xr-x 1 postgres postgres  27K Aug 26 16:59 pg_config
-rwxr-xr-x 1 postgres postgres  26K Aug 26 16:59 pg_controldata
-rwxr-xr-x 1 postgres postgres  37K Aug 26 16:59 pg_ctl
-rwxr-xr-x 1 postgres postgres 273K Aug 26 16:59 pg_dump
-rwxr-xr-x 1 postgres postgres  71K Aug 26 16:59 pg_dumpall
-rwxr-xr-x 1 postgres postgres  35K Aug 26 16:59 pg_resetxlog
-rwxr-xr-x 1 postgres postgres 125K Aug 26 16:59 pg_restore
-rwxr-xr-x 1 postgres postgres 619K Sep  2 18:43 pgsql2shp
-rwxr-xr-x 1 postgres postgres 4.8M Aug 26 16:59 postgres
lrwxrwxrwx 1 postgres postgres    8 Aug 26 16:59 postmaster -> postgres
-rwxr-xr-x 1 postgres postgres 337K Aug 26 16:59 psql
-rwxr-xr-x 1 postgres postgres  54K Aug 26 16:59 reindexdb
-rwxr-xr-x 1 postgres postgres 621K Sep  2 18:43 shp2pgsql
-rwxr-xr-x 1 postgres postgres  32K Aug 26 16:59 vacuumdb

--pg_controldata 查看PG SERVER 详细信息
[postgres@PG1 bin]$ ./pg_controldata $PGDATA
pg_control version number:            843
Catalog version number:               200904091
Database system identifier:           5509641090052341117
Database cluster state:               shut down
pg_control last modified:             Fri 17 Sep 2010 05:28:03 PM CST
Latest checkpoint location:           137/BFFFF68
Prior checkpoint location:            131/9D90C818
Latest checkpoint's REDO location:    137/BFFFF68
Latest checkpoint's TimeLineID:       1
Latest checkpoint's NextXID:          0/64282
Latest checkpoint's NextOID:          215491390
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Time of latest checkpoint:            Fri 17 Sep 2010 05:26:05 PM CST
Minimum recovery ending location:     0/0
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 1048576
WAL block size:                       65536
Bytes per WAL segment:                67108864
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value

 重要的信息: Latest checkpoint's NextXID: 0/64282, "Latest checkpoint's NextXID"
是指最近一次安全的checkpoints的下一个事务ID,我们可以将数据库恢复到这一时刻。


-- 通过pg_resetxlog 将数据库恢复到事务 64282 时刻

[postgres@PG1 bin]$ pg_resetxlog --help
pg_resetxlog resets the PostgreSQL transaction log.

Usage:
  pg_resetxlog [OPTION]... DATADIR

Options:
  -e XIDEPOCH     set next transaction ID epoch
  -f              force update to be done
  -l TLI,FILE,SEG force minimum WAL starting location for new transaction log
  -m XID          set next multitransaction ID
  -n              no update, just show extracted control values (for testing)
  -o OID          set next OID
  -O OFFSET       set next multitransaction offset
  -x XID          set next transaction ID
  --help          show this help, then exit
  --version       output version information, then exit

Report bugs to <pgsql-bugs@postgresql.org>.

[postgres@PG1 bin]$ pg_resetxlog -x 64282 $PGDATA
Transaction log reset


--再次启动 PG SERVER,正常,此时数据库已恢复
[postgres@PG1 bin]$ pg_ctl -D $PGDATA start
server starting
[postgres@PG1 bin]$ ps -ef | grep post
postgres 25297     1 14 18:08 pts/1    00:00:00 /opt/pgsql/bin/postgres -D /opt/pgdata/pg_root
postgres 25298 25297  0 18:08 ?        00:00:00 postgres: logger process                     
postgres 25829 25297  0 18:08 ?        00:00:00 postgres: writer process                     
postgres 25830 25297  0 18:08 ?        00:00:00 postgres: wal writer process                 
postgres 25831 25297  0 18:08 ?        00:00:00 postgres: stats collector process   

到了这里,数据库PG SERVER 终于可以启来了,这里顺便说一下,之前的 pg_restore 中途异常是因为脚本中加了
-j 参数,同时跑多个 pg_restore 线程, 造成 pg_restore 子线程连接丢失,pg_restore 脚本中去掉 -j
参数时,数据库顺利导入。         

--pg_controldata 官网文档介绍
Name
pg_controldata — display control information of a PostgreSQL database cluster

Synopsis
pg_controldata [datadir]

Description
pg_controldata prints information initialized during initdb, such as the catalog version. It also shows information about write-ahead logging and checkpoint processing. This information is cluster-wide, and not specific to any one database.
This utility can only be run by the user who initialized the cluster because it requires read access to the data directory. You can specify the data directory on the command line, or use the environment variable PGDATA.

Environment
PGDATA
Default data directory location

--pg_resetxlog 官网文档
pg_resetxlog — reset the write-ahead log and other control information of a PostgreSQL database cluster

  评论这张
 
阅读(31158)| 评论(0)
推荐 转载

历史上的今天

在LOFTER的更多文章

评论

<#--最新日志,群博日志--> <#--推荐日志--> <#--引用记录--> <#--博主推荐--> <#--随机阅读--> <#--首页推荐--> <#--历史上的今天--> <#--被推荐日志--> <#--上一篇,下一篇--> <#-- 热度 --> <#-- 网易新闻广告 --> <#--右边模块结构--> <#--评论模块结构--> <#--引用模块结构--> <#--博主发起的投票-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 

页脚

网易公司版权所有 ©1997-2016