PostgreSQL 中文网

 
 
 

MongoDB: Replica Set node switchover and failover

2012-11-28 15:45:43 | Category: MongoDB

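       The previous posts covered setting up a Replica Set and adding and removing
secondary nodes. This post looks at switching the primary role between Replica Set
nodes and at failover.

       Node switchover means moving the primary role to one of the secondaries, either
because of a fault or for maintenance. For example, when the primary's host needs a
hardware upgrade, that host has to be stopped. A Replica Set supports switching between
members, and it does so through a voting election. I have read the documentation on the
election mechanism and still do not fully understand it, but the result is mainly
controlled through the priority parameter, as the demonstration below shows.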

 

1. Node switchover
--1.1 Base environment (three nodes)

 rs0:PRIMARY> rs.status();
{
        "set" : "rs0",
        "date" : ISODate("2012-11-27T16:38:15Z"),
        "myState" : 1,
        "members" : [
                {
                        "_id" : 0,
                        "name" : "redhatB.example.com:27018",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 9143,
                        "optime" : Timestamp(1354025540000, 1),
                        "optimeDate" : ISODate("2012-11-27T14:12:20Z"),
                        "lastHeartbeat" : ISODate("2012-11-27T16:38:15Z"),
                        "pingMs" : 0
                },
                {
                        "_id" : 1,
                        "name" : "redhatB.example.com:27019",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 8733,
                        "optime" : Timestamp(1354025540000, 1),
                        "optimeDate" : ISODate("2012-11-27T14:12:20Z"),
                        "lastHeartbeat" : ISODate("2012-11-27T16:38:15Z"),
                        "pingMs" : 0
                },
                {
                        "_id" : 2,
                        "name" : "redhatB.example.com:27020",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 462929,
                        "optime" : Timestamp(1354025540000, 1),
                        "optimeDate" : ISODate("2012-11-27T14:12:20Z"),
                        "self" : true
                }
        ],
        "ok" : 1
}
   

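Note: this is a three-node replica set in which the last member is the primary. Suppose the
         primary has to be switched for some reason, and the target is the secondary
         "redhatB.example.com:27018". This can be done by setting each member's priority:
         the higher a member's priority value, the more it is preferred when a primary is elected.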


--1.2 Set the member priority values

 cfg = rs.conf()
cfg.members[0].priority = 2
cfg.members[1].priority = 1
cfg.members[2].priority = 0.5
rs.reconfig(cfg)
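
The same change can be wrapped in a small helper. This is only a sketch under assumptions: `withPriorities` is a hypothetical name, not part of the mongo shell, and the sample document merely mimics the shape of the `rs.conf()` output shown in this post (the `version` value is illustrative). In the shell one would presumably pass `rs.conf()` in and hand the result to `rs.reconfig()`.

```javascript
// Hypothetical helper: returns a copy of a replica set config document with
// new priorities, keyed by member _id. It does not talk to MongoDB.
function withPriorities(cfg, prioritiesById) {
  // Plain deep copy; good enough for the simple sample document below.
  var copy = JSON.parse(JSON.stringify(cfg));
  copy.members.forEach(function (m) {
    if (Object.prototype.hasOwnProperty.call(prioritiesById, m._id)) {
      m.priority = prioritiesById[m._id];
    }
  });
  return copy;
}

// Sample document shaped like rs.conf() output (version is illustrative).
var sampleCfg = {
  _id: "rs0",
  version: 8,
  members: [
    { _id: 0, host: "redhatB.example.com:27018" },
    { _id: 1, host: "redhatB.example.com:27019" },
    { _id: 2, host: "redhatB.example.com:27020" }
  ]
};

var newCfg = withPriorities(sampleCfg, { 0: 2, 1: 1, 2: 0.5 });
console.log(JSON.stringify(newCfg.members));
```

Returning a copy keeps the original config untouched, so the old and new priorities can be compared before anything is applied.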
   


--1.3 Operation log

 rs0:PRIMARY> cfg.members[0].priority = 2
2
rs0:PRIMARY> cfg.members[1].priority = 1
1
rs0:PRIMARY> cfg.members[2].priority = 0.5
0.5
rs0:PRIMARY> rs.reconfig(cfg)
Wed Nov 28 00:57:30 DBClientCursor::init call() failed
Wed Nov 28 00:57:30 query failed : admin.$cmd { replSetReconfig: { _id: "rs0", version: 9, members: [ { _id: 0, host: "redhatB.example.com:27018", priority: 2.0 }, { _id: 1, host: "redhatB.example.com:27019", priority: 1.0 }, { _id: 2, host: "redhatB.example.com:27020", priority: 0.5 } ] } } to: 127.0.0.1:27020
Wed Nov 28 00:57:30 trying reconnect to 127.0.0.1:27020
Wed Nov 28 00:57:30 reconnect 127.0.0.1:27020 ok
reconnected to server after rs command (which is normal)
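   Note: after the priority values are set, calling rs.reconfig interrupts the current
              primary, because a new primary election has to run. After a few tens of
              seconds a new primary is in place.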
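
Since the set has no primary for a short while after the reconfig, a script may want to wait until one shows up. The following is a minimal sketch of such a polling loop, not a definitive implementation: `waitForPrimary`, `fakeStatus` and the injected `pause` function are hypothetical names, and the status documents are stand-ins shaped like the `rs.status()` output in this post.

```javascript
// Hypothetical helper: polls a status source until one member reports
// state 1 (PRIMARY), or gives up after maxAttempts.
//   getStatus: function returning a document shaped like rs.status() output
//   pause:     function called between attempts (for example a short sleep)
function waitForPrimary(getStatus, maxAttempts, pause) {
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    var status;
    try {
      status = getStatus();
    } catch (e) {
      status = null; // the connection may drop while the election runs
    }
    if (status && status.members) {
      var primaries = status.members.filter(function (m) {
        return m.state === 1;
      });
      if (primaries.length > 0) {
        return { primary: primaries[0].name, attempts: attempt };
      }
    }
    if (attempt < maxAttempts) pause();
  }
  return null; // no primary seen within maxAttempts
}

// Stand-in status source: no primary on the first two calls.
var calls = 0;
function fakeStatus() {
  calls++;
  var electing = calls < 3;
  return {
    set: "rs0",
    members: [
      { _id: 0, name: "redhatB.example.com:27018", state: electing ? 2 : 1 },
      { _id: 1, name: "redhatB.example.com:27019", state: 2 },
      { _id: 2, name: "redhatB.example.com:27020", state: 2 }
    ]
  };
}

var result = waitForPrimary(fakeStatus, 5, function () {});
console.log(result);
```

Passing the status source and the pause in as functions keeps the loop testable without a running replica set.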

 

--1.4 Check the configuration again

 rs0:SECONDARY> rs.conf();
{
        "_id" : "rs0",
        "version" : 9,
        "members" : [
                {
                        "_id" : 0,
                        "host" : "redhatB.example.com:27018",
                        "priority" : 2
                },
                {
                        "_id" : 1,
                        "host" : "redhatB.example.com:27019"
                },
                {
                        "_id" : 2,
                        "host" : "redhatB.example.com:27020",
                        "priority" : 0.5
                }
        ]
}
   Note: the priority of each member is now visible. A member's default priority is 1, and a value left at the default is not shown.
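
Because a default priority is simply omitted from the output, code that reads the config has to fill in the default itself. A minimal sketch, assuming the default of 1 stated above; `effectivePriority` is a hypothetical helper name and the array mirrors the `members` list printed above.

```javascript
// Hypothetical helper: returns the priority of a config member,
// falling back to the default of 1 when the field is absent.
function effectivePriority(member) {
  return member.priority === undefined ? 1 : member.priority;
}

// Members as printed by rs.conf() in this post.
var confMembers = [
  { _id: 0, host: "redhatB.example.com:27018", priority: 2 },
  { _id: 1, host: "redhatB.example.com:27019" },
  { _id: 2, host: "redhatB.example.com:27020", priority: 0.5 }
];

var listing = confMembers.map(function (m) {
  return m.host + " priority=" + effectivePriority(m);
});
console.log(listing.join("\n"));
```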

 

--1.5 Check the node status again

 rs0:SECONDARY> rs.status();
{
        "set" : "rs0",
        "date" : ISODate("2012-11-27T16:58:27Z"),
        "myState" : 2,
        "syncingTo" : "redhatB.example.com:27018",
        "members" : [
                {
                        "_id" : 0,
                        "name" : "redhatB.example.com:27018",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 47,
                        "optime" : Timestamp(1354035450000, 1),
                        "optimeDate" : ISODate("2012-11-27T16:57:30Z"),
                        "lastHeartbeat" : ISODate("2012-11-27T16:58:26Z"),
                        "pingMs" : 0
                },
                {
                        "_id" : 1,
                        "name" : "redhatB.example.com:27019",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 47,
                        "optime" : Timestamp(1354035450000, 1),
                        "optimeDate" : ISODate("2012-11-27T16:57:30Z"),
                        "lastHeartbeat" : ISODate("2012-11-27T16:58:26Z"),
                        "pingMs" : 0
                },
                {
                        "_id" : 2,
                        "name" : "redhatB.example.com:27020",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 464141,
                        "optime" : Timestamp(1354035450000, 1),
                        "optimeDate" : ISODate("2012-11-27T16:57:30Z"),
                        "errmsg" : "syncing to: redhatB.example.com:27018",
                        "self" : true
                }
        ],
        "ok" : 1
}
   Note: the primary role has now moved to "redhatB.example.com:27018", which was the goal.

 

2. Failover test

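       MongoDB replica sets support automatic failover: when the primary goes offline for some
reason, the remaining members of the replica set elect a new primary. The following test exercises this feature.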

--2.1 Base information

 rs0:PRIMARY> rs.status();
{
        "set" : "rs0",
        "date" : ISODate("2012-11-27T17:50:53Z"),
        "myState" : 1,
        "members" : [
                {
                        "_id" : 0,
                        "name" : "redhatB.example.com:27018",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 200,
                        "optime" : Timestamp(1354037833000, 1),
                        "optimeDate" : ISODate("2012-11-27T17:37:13Z"),
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "redhatB.example.com:27019",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 108,
                        "optime" : Timestamp(1354037833000, 1),
                        "optimeDate" : ISODate("2012-11-27T17:37:13Z"),
                        "lastHeartbeat" : ISODate("2012-11-27T17:50:52Z"),
                        "pingMs" : 0,
                        "errmsg" : "syncing to: redhatB.example.com:27018"
                },
                {
                        "_id" : 2,
                        "name" : "redhatB.example.com:27020",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 35,
                        "optime" : Timestamp(1354037833000, 1),
                        "optimeDate" : ISODate("2012-11-27T17:37:13Z"),
                        "lastHeartbeat" : ISODate("2012-11-27T17:50:52Z"),
                        "pingMs" : 0,
                        "errmsg" : "syncing to: redhatB.example.com:27018"
                }
        ],
        "ok" : 1
}
   
  Note: the primary at this point is "redhatB.example.com:27018".
 
 
--2.2 Kill the primary abnormally

 [mongo@redhatB data03]$ ps -ef | grep 27018
mongo      879     1  0 01:47 ?        00:00:02 mongod -f /pgdata_xc/mongodb/data01/mongodb_27018.conf
mongo     1250 21805  0 01:50 pts/0    00:00:00 mongo 127.0.0.1:27018
mongo     1296 23173  0 01:51 pts/1    00:00:00 grep 27018
[mongo@redhatB data03]$ kill -9 879
   Note: after a short while, "redhatB.example.com:27019" is elected as the new primary.

 

--2.3 Check the node status

 [mongo@redhatB ~]$ mongo 127.0.0.1:27019
MongoDB shell version: 2.2.1
connecting to: 127.0.0.1:27019/test

rs0:PRIMARY> rs.conf();
{
        "_id" : "rs0",
        "version" : 9,
        "members" : [
                {
                        "_id" : 0,
                        "host" : "redhatB.example.com:27018",
                        "priority" : 2
                },
                {
                        "_id" : 1,
                        "host" : "redhatB.example.com:27019"
                },
                {
                        "_id" : 2,
                        "host" : "redhatB.example.com:27020",
                        "priority" : 0.5
                }
        ]
}
rs0:PRIMARY> rs.status();
{
        "set" : "rs0",
        "date" : ISODate("2012-11-27T17:52:41Z"),
        "myState" : 1,
        "members" : [
                {
                        "_id" : 0,
                        "name" : "redhatB.example.com:27018",
                        "health" : 0,
                        "state" : 8,
                        "stateStr" : "(not reachable/healthy)",
                        "uptime" : 0,
                        "optime" : Timestamp(1354037833000, 1),
                        "optimeDate" : ISODate("2012-11-27T17:37:13Z"),
                        "lastHeartbeat" : ISODate("2012-11-27T17:52:01Z"),
                        "pingMs" : 0,
                        "errmsg" : "socket exception [CONNECT_ERROR] for redhatB.example.com:27018"
                },
                {
                        "_id" : 1,
                        "name" : "redhatB.example.com:27019",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 251,
                        "optime" : Timestamp(1354037833000, 1),
                        "optimeDate" : ISODate("2012-11-27T17:37:13Z"),
                        "self" : true
                },
                {
                        "_id" : 2,
                        "name" : "redhatB.example.com:27020",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 143,
                        "optime" : Timestamp(1354037833000, 1),
                        "optimeDate" : ISODate("2012-11-27T17:37:13Z"),
                        "lastHeartbeat" : ISODate("2012-11-27T17:52:41Z"),
                        "pingMs" : 0,
                        "errmsg" : "syncing to: redhatB.example.com:27019"
                }
        ],
        "ok" : 1
}

   

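Note: "redhatB.example.com:27019" has become the primary, and the failed node
        "redhatB.example.com:27018" is unavailable. The next test checks whether the
        failed node recovers automatically once it is started again.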
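
Picking the failed member out of a long status document by eye is error-prone, so a small filter can help. This is a sketch under assumptions: `unreachableMembers` is a hypothetical helper name, and the sample document is a trimmed copy of the `rs.status()` output above, keeping only the fields the filter reads.

```javascript
// Hypothetical helper: lists the names of members whose "health" field
// is not 1 in a document shaped like rs.status() output.
function unreachableMembers(status) {
  return status.members
    .filter(function (m) { return m.health !== 1; })
    .map(function (m) { return m.name; });
}

// Trimmed copy of the status output shown above.
var statusAfterKill = {
  set: "rs0",
  members: [
    { _id: 0, name: "redhatB.example.com:27018", health: 0, state: 8 },
    { _id: 1, name: "redhatB.example.com:27019", health: 1, state: 1 },
    { _id: 2, name: "redhatB.example.com:27020", health: 1, state: 2 }
  ]
};

var down = unreachableMembers(statusAfterKill);
console.log(down);
```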
      
 
--2.4 Connect to the new primary and create a collection

 [mongo@redhatB ~]$ mongo 127.0.0.1:27019
MongoDB shell version: 2.2.1
connecting to: 127.0.0.1:27019/test

rs0:PRIMARY> show collections;
system.indexes
test_1
test_2
test_3
things

rs0:PRIMARY> db.test_4.save({id:1});

rs0:PRIMARY> db.test_4.find();
{ "_id" : ObjectId("50b4fe1b15747af472f54831"), "id" : 1 }

      


--2.5 Start the previously failed node

 [mongo@redhatB data03]$ mongod -f /pgdata_xc/mongodb/data01/mongodb_27018.conf
forked process: 1692
all output going to: /pgdata_xc/mongodb/data01/mongo.log
child process started successfully, parent exiting
   Note: all three nodes run with journaling enabled; otherwise recovery would run into errors.

--2.6 Check the node status again

 rs0:PRIMARY> rs.status();
{
        "set" : "rs0",
        "date" : ISODate("2012-11-27T18:16:43Z"),
        "myState" : 1,
        "members" : [
                {
                        "_id" : 0,
                        "name" : "redhatB.example.com:27018",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 1321,
                        "optime" : Timestamp(1354038811000, 1),
                        "optimeDate" : ISODate("2012-11-27T17:53:31Z"),
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "redhatB.example.com:27019",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 1313,
                        "optime" : Timestamp(1354038811000, 1),
                        "optimeDate" : ISODate("2012-11-27T17:53:31Z"),
                        "lastHeartbeat" : ISODate("2012-11-27T18:16:42Z"),
                        "pingMs" : 1
                },
                {
                        "_id" : 2,
                        "name" : "redhatB.example.com:27020",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 1313,
                        "optime" : Timestamp(1354038811000, 1),
                        "optimeDate" : ISODate("2012-11-27T17:53:31Z"),
                        "lastHeartbeat" : ISODate("2012-11-27T18:16:42Z"),
                        "pingMs" : 0
                }
        ],
        "ok" : 1
}
   
  Note: "redhatB.example.com:27018" has won the election again, which matches its priority of 2, the highest in the set.
 
 
--2.7 Verify the data

 [mongo@redhatB ~]$ mongo 127.0.0.1:27018
MongoDB shell version: 2.2.1
connecting to: 127.0.0.1:27018/test
rs0:PRIMARY> show collections;
system.indexes
test_1
test_2
test_3
test_4
things

rs0:PRIMARY> db.test_4.find();
{ "_id" : ObjectId("50b4fe1b15747af472f54831"), "id" : 1 }

   Note: the recovered node returns the new data, which shows that it caught up automatically.
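
Besides querying the data, the `optimeDate` values in the status output give a rough hint of whether every member has caught up. The sketch below compares them; it is only a coarse sanity check (the field has one-second resolution), `laggingMembers` is a hypothetical helper name, and the sample documents use ISO strings in place of the shell's `ISODate(...)` values.

```javascript
// Hypothetical helper: returns the names of members whose optimeDate is
// older than the newest optimeDate in a document shaped like rs.status().
function laggingMembers(status) {
  var times = status.members.map(function (m) {
    return new Date(m.optimeDate).getTime();
  });
  var newest = Math.max.apply(null, times);
  return status.members
    .filter(function (m, i) { return times[i] < newest; })
    .map(function (m) { return m.name; });
}

// Values taken from the status output in step 2.6: all members agree.
var recovered = {
  members: [
    { name: "redhatB.example.com:27018", optimeDate: "2012-11-27T17:53:31Z" },
    { name: "redhatB.example.com:27019", optimeDate: "2012-11-27T17:53:31Z" },
    { name: "redhatB.example.com:27020", optimeDate: "2012-11-27T17:53:31Z" }
  ]
};

// Made-up variant for illustration: the first member is still behind.
var stillBehind = {
  members: [
    { name: "redhatB.example.com:27018", optimeDate: "2012-11-27T17:37:13Z" },
    { name: "redhatB.example.com:27019", optimeDate: "2012-11-27T17:53:31Z" }
  ]
};

console.log(laggingMembers(recovered));
console.log(laggingMembers(stillBehind));
```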


 

3. Summary

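      This post only demonstrates failover for a three-node Replica Set; other cases are not covered. The key to failover is the voting mechanism, which I still do not understand well. So far I only know how to set a member's priority and need to read further material. On the question of whether a SECONDARY can take over after the primary goes down in sets with an odd or an even number of nodes, see the following thread: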

http://www.itpub.net/thread-1740982-1-1.html
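
As background for the odd/even question: according to the MongoDB replication documentation, a new primary can only be elected while a majority of the set's voting members can reach each other. The arithmetic can be sketched as follows, assuming every member has exactly one vote; both function names are hypothetical.

```javascript
// Number of votes that make up a majority in a set of the given size.
function majorityNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// True when the reachable members are enough to elect a primary.
function canElectPrimary(votingMembers, reachableMembers) {
  return reachableMembers >= majorityNeeded(votingMembers);
}

// Three members, one down: the remaining two form a majority.
console.log(canElectPrimary(3, 2));
// Two members, one down: the survivor alone is not a majority.
console.log(canElectPrimary(2, 1));
// Four members, two down: two out of four is not a majority either.
console.log(canElectPrimary(4, 2));
```

Under this assumption a four-member set tolerates only one failure, the same as a three-member set, which is why an odd number of voting members is usually suggested.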

 

4. References
http://docs.mongodb.org/manual/administration/replica-sets/
http://docs.mongodb.org/manual/core/replication/#replica-set-failover
  
