[SERVER-4248] Replica Set gets out of sync Created: 10/Nov/11  Updated: 11/Jul/16  Resolved: 11/Nov/11

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.0.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Ben Wyrosdick Assignee: Kristina Chodorow (Inactive)
Resolution: Done Votes: 0
Labels: oplog, replication
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 11.04, AWS m2.2xlarge


Operating System: Linux
Participants:

 Description   

We have seen this issue a few times on the same replica set: the secondary is applying about 30-40k updates/sec and still reports the SECONDARY state, but it never catches up to the primary; it just stays the same amount behind (normally just a few minutes).

The primary is doing about 10 or fewer updates/sec, so it does not immediately look like a case of the primary producing more writes than the secondary can keep up with.

It has been fixed twice by stopping and starting the secondary; after a restart it normally does catch up.

Also, when it happens, db.printReplicationInfo() shows a very small oplog window (< 30 min). The weird thing is that if you keep checking, the window keeps growing but the first event time stays the same until the oplog grows large enough again. So it is almost as if the oplog had been reset.

PRIMARY> db.printReplicationInfo()
configured oplog size: 4096MB
log length start to end: 92276secs (25.63hrs)
oplog first event time: Wed Nov 09 2011 19:13:39 GMT+0000 (UTC)
oplog last event time: Thu Nov 10 2011 20:51:35 GMT+0000 (UTC)
now: Thu Nov 10 2011 20:51:35 GMT+0000 (UTC)

PRIMARY> db.printReplicationInfo()
configured oplog size: 4096MB
log length start to end: 92488secs (25.69hrs)
oplog first event time: Wed Nov 09 2011 19:13:39 GMT+0000 (UTC)
oplog last event time: Thu Nov 10 2011 20:55:07 GMT+0000 (UTC)
now: Thu Nov 10 2011 20:55:07 GMT+0000 (UTC)



 Comments   
Comment by Scott Hernandez (Inactive) [ 14/Nov/11 ]

That is expected. Every update is done by _id in the oplog and they are idempotent. You will get a fan-out like this for any multi-operation from the primary.
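
For illustration only (a minimal shell sketch, not from this ticket; the "mydb" database name is made up, the ObjectId is copied from the listing below): a single multi-update on the primary is rewritten as one "u" oplog entry per matched _id, which is the fan-out visible in the oplog listing below.

// hypothetical: one multi-update issued against the primary
db.users.update({}, { $pullAll : { unread_chats : [ ObjectId("4ebfa07565ee19070b007eb6") ] } }, false, true)
// ...is replicated as one "u" entry per matched document; inspect the tail of the oplog:
use local
db.oplog.rs.find({ "op" : "u", "ns" : "mydb.users" }).sort({ $natural : -1 }).limit(5)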

Comment by Ben Wyrosdick [ 14/Nov/11 ]

I think we found the op that causes it. It looks like a single update can be translated into thousands of ops in the oplog, which causes the secondary to fall behind. Is this expected?

== Ruby Snippet ==
User.collection.update({}, {'$pullAll' => {unread_chats: [id]}}, {multi: true})

== Oplog Afterwards ==
PRIMARY> db.oplog.rs.find()
{ "ts" :

{ "t" : 1321230472000, "i" : 28384 }

, "h" : NumberLong("7367950549734090104"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1e4") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28385 }

, "h" : NumberLong("-4941163361012609057"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1e5") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28386 }

, "h" : NumberLong("-5476149061517939211"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1e6") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28387 }

, "h" : NumberLong("2149060704232662888"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1e7") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28388 }

, "h" : NumberLong("-5960082855673500958"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1e8") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28389 }

, "h" : NumberLong("-5660342605049353919"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1e9") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28390 }

, "h" : NumberLong("-2221591118617756225"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1ea") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28391 }

, "h" : NumberLong("479178575086387690"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1eb") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28392 }

, "h" : NumberLong("1454718424901059596"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1ec") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28393 }

, "h" : NumberLong("-2733836735551696413"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1ed") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28394 }

, "h" : NumberLong("3406322233193789257"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1ee") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28395 }

, "h" : NumberLong("8502987010612628908"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1ef") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28396 }

, "h" : NumberLong("-4418903082715263754"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1f0") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28397 }

, "h" : NumberLong("-4547385937031701051"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1f1") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28398 }

, "h" : NumberLong("4469202679615264403"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1f2") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28399 }

, "h" : NumberLong("-4092244317294379346"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1f3") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28400 }

, "h" : NumberLong("3498665780605362592"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1f4") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28401 }

, "h" : NumberLong("-7208861788328173337"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1f5") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28402 }

, "h" : NumberLong("-1232670496862447715"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1f6") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }
{ "ts" :

{ "t" : 1321230472000, "i" : 28403 }

, "h" : NumberLong("7642858452721797360"), "op" : "u", "ns" : "xxxx_production.users", "o2" :

{ "_id" : ObjectId("4e9521f6722f7f0006e4e1f7") }

, "o" : { "$pullAll" :

{ "unread_chats" : [ ObjectId("4ebfa07565ee19070b007eb6") ] }

} }

Comment by Eliot Horowitz (Inactive) [ 11/Nov/11 ]

Resolving, as the main issue seems to be addressed.

Comment by Eliot Horowitz (Inactive) [ 11/Nov/11 ]

Yes

use local
db.oplog.rs.find()
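
A rough sketch of how one might look for large bursts from there (not from this ticket; the 100000/1000 limits are arbitrary, and it assumes the shell exposes the seconds component of the timestamp as ts.t): walk the most recent entries in reverse natural order and tally ops per second.

use local
// most recent entries, newest first
db.oplog.rs.find({}, { ts : 1, op : 1, ns : 1 }).sort({ $natural : -1 }).limit(20)
// tally entries per second over the last 100000 ops to spot write storms
var tally = {};
db.oplog.rs.find({}, { ts : 1 }).sort({ $natural : -1 }).limit(100000).forEach(function(doc) {
    var sec = doc.ts.t;
    tally[sec] = (tally[sec] || 0) + 1;
});
for (var sec in tally) { if (tally[sec] > 1000) print(sec + " -> " + tally[sec] + " ops"); }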

Comment by Ben Wyrosdick [ 11/Nov/11 ]

Can I easily query the oplog to see what the operations look like?

Comment by Ben Wyrosdick [ 11/Nov/11 ]

Yeah ... I agree after seeing it happen on staging. The thing that was throwing me off was that the oplog start time seemed to hang for hours, and I didn't realize it was simply that so many writes had landed at once that the start time didn't move.

I will increase the oplog size.
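
For reference, a sketch of what that involves (the 10240 MB value is only an example, not a recommendation): the size is set with the mongod --oplogSize startup option (in MB), and on an existing member the oplog collection has to be rebuilt (for instance by resyncing the member) before a new size takes effect.

// after restarting the member with a larger oplog, e.g. mongod --replSet xxxx --oplogSize 10240 ...
// confirm the configured size and the new window from the shell:
db.printReplicationInfo()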

Comment by Eliot Horowitz (Inactive) [ 11/Nov/11 ]

Sounds like:

  • write storm happens
  • time in oplog gets very low
  • secondary falls behind for a while
  • then catches up and oplog gets long again?

If that's the case, seems like pretty normal behavior.

Comment by Ben Wyrosdick [ 11/Nov/11 ]

Just to note ... the occurrence I posted about on staging acted a bit differently: the first event time in the oplog didn't stay fixed the way it had on the production box. But having run db.printReplicationInfo() again on production just now, the first event time seems to have moved only 4 seconds in the past few hours. This may still point to some sort of huge burst of oplog writes.

PRIMARY> db.printReplicationInfo()
configured oplog size: 4096MB
log length start to end: 107032secs (29.73hrs)
oplog first event time: Wed Nov 09 2011 19:13:43 GMT+0000 (UTC)
oplog last event time: Fri Nov 11 2011 00:57:35 GMT+0000 (UTC)
now: Fri Nov 11 2011 00:57:35 GMT+0000 (UTC)

Comment by Ben Wyrosdick [ 11/Nov/11 ]

It stays secondary ... it is actually acting up on one right now.

PRIMARY> db.printReplicationInfo()
configured oplog size: 4096MB
log length start to end: 1158secs (0.32hrs)
oplog first event time: Fri Nov 11 2011 00:04:33 GMT+0000 (UTC)
oplog last event time: Fri Nov 11 2011 00:23:51 GMT+0000 (UTC)
now: Fri Nov 11 2011 00:23:51 GMT+0000 (UTC)

SECONDARY> rs.status()
{
    "set" : "xxxx",
    "date" : ISODate("2011-11-11T00:19:25Z"),
    "myState" : 2,
    "syncingTo" : "xxxx.staging.member0.mongolayer.com:27017",
    "members" : [
        {
            "_id" : 3,
            "name" : "xxxx.staging.member0.mongolayer.com:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 1360494,
            "optime" : { "t" : 1320970851000, "i" : 19193 },
            "optimeDate" : ISODate("2011-11-11T00:20:51Z"),
            "lastHeartbeat" : ISODate("2011-11-11T00:19:24Z"),
            "pingMs" : 1
        },
        {
            "_id" : 4,
            "name" : "xxxx.staging.member1.mongolayer.com:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "optime" : { "t" : 1320970073000, "i" : 16237 },
            "optimeDate" : ISODate("2011-11-11T00:07:53Z"),
            "self" : true
        },
        {
            "_id" : 5,
            "name" : "xxxx.staging.arbiter.mongolayer.com:27017",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 1360495,
            "optime" : { "t" : 0, "i" : 0 },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2011-11-11T00:19:24Z"),
            "pingMs" : 0
        }
    ],
    "ok" : 1
}

mongostat on primary:
connected to: 127.0.0.1
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn set repl time
0 0 3 0 0 6 0 12.4g 12.9g 6.67g 1 98.1 0 0|0 0|2 1k 3k 54 xxxx M 00:22:57
0 0 0 0 1 3 0 12.4g 12.9g 6.67g 2 91.7 0 0|0 0|2 324b 4m 54 xxxx M 00:22:58
0 0 4 0 0 6 0 12.4g 12.9g 6.67g 1 102 0 0|0 0|2 1k 3k 54 xxxx M 00:22:59
0 0 2 0 1 3 0 12.4g 12.9g 6.66g 0 96.3 0 0|0 0|2 565b 4m 54 xxxx M 00:23:00
0 0 1 0 0 3 0 12.4g 12.9g 6.66g 0 99.7 0 0|0 0|2 277b 3k 54 xxxx M 00:23:01
0 0 0 0 1 3 0 12.4g 12.9g 6.66g 0 93.5 0 0|1 1|2 277b 3k 54 xxxx M 00:23:02
0 0 2 0 0 4 0 12.4g 12.9g 6.65g 2 0.5 0 0|2 1|3 400b 3k 54 xxxx M 00:23:03
2 2 1 0 0 8 0 12.4g 12.9g 6.65g 0 5.4 0 0|1 1|2 1k 2k 54 xxxx M 00:23:04
0 0 0 0 0 3 0 12.4g 12.9g 6.65g 0 6.9 0 0|1 1|2 277b 3k 54 xxxx M 00:23:05
0 0 0 0 0 3 0 12.4g 12.9g 6.65g 0 86.3 0 0|0 0|2 324b 4m 54 xxxx M 00:23:06

mongostat on secondary:
connected to: 127.0.0.1
insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn set repl time
*1 *0 *15565 *0 0 3|0 0 10.4g 10.9g 6.46g 0 97.8 0 0|0 0|0 277b 3k 31 xxxx SEC 00:20:40
*0 100 *8376 *0 0 3|0 0 10.4g 10.9g 6.45g 7 49.4 0 0|0 0|0 13k 33k 31 xxxx SEC 00:20:41
*2 *0 *14372 *0 0 3|0 0 10.4g 10.9g 6.46g 1 98 0 0|0 0|0 277b 3k 31 xxxx SEC 00:20:42
*0 *0 *12544 *0 0 3|0 0 10.4g 10.9g 6.46g 0 82.6 0 0|0 0|0 277b 3k 31 xxxx SEC 00:20:43
*0 100 *10449 *0 0 3|0 0 10.4g 10.9g 6.46g 7 61.5 0 0|0 0|0 13k 34k 31 xxxx SEC 00:20:44
*0 *0 *14510 *0 0 3|0 0 10.4g 10.9g 6.46g 0 97.9 0 0|0 0|0 277b 3k 31 xxxx SEC 00:20:45
*0 *0 *3186 *0 0 3|0 0 10.4g 10.9g 6.46g 0 22.1 0 0|0 0|0 277b 3k 31 xxxx SEC 00:20:46
*0 45 *11096 *0 0 3|0 0 10.4g 10.9g 6.46g 4 70.1 0 0|0 0|0 6k 17k 31 xxxx SEC 00:20:47
*0 42 *14068 *0 0 2|0 0 10.4g 10.9g 6.46g 2 91.2 0 0|0 0|0 5k 14k 31 xxxx SEC 00:20:48
*0 38 *2986 *0 0 3|0 0 10.4g 10.9g 6.46g 1 14.4 0 0|0 0|0 5k 15k 31 xxxx SEC 00:20:49

The client says there was nothing going on at the time ... this one is their staging box. It looks like it is resyncing, as if the oplog rolled over, but they claim they are not making that many changes, and you can see from the primary's mongostat that it isn't doing much. Can we easily inspect the oplog and look for large bursts of updates?

While I was typing this out it got too stale:

PRIMARY> rs.status()
{
    "set" : "xxxx",
    "date" : ISODate("2011-11-11T00:41:20Z"),
    "myState" : 1,
    "syncingTo" : "xxxx.staging.member1.mongolayer.com:27017",
    "members" : [
        {
            "_id" : 3,
            "name" : "xxxx.staging.member0.mongolayer.com:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "optime" : { "t" : 1320972080000, "i" : 5084 },
            "optimeDate" : ISODate("2011-11-11T00:41:20Z"),
            "self" : true
        },
        {
            "_id" : 4,
            "name" : "xxxx.staging.member1.mongolayer.com:27017",
            "health" : 1,
            "state" : 3,
            "stateStr" : "RECOVERING",
            "uptime" : 1362224,
            "optime" : { "t" : 1320970605000, "i" : 1154 },
            "optimeDate" : ISODate("2011-11-11T00:16:45Z"),
            "lastHeartbeat" : ISODate("2011-11-11T00:41:19Z"),
            "pingMs" : 0,
            "errmsg" : "error RS102 too stale to catch up"
        },
        {
            "_id" : 5,
            "name" : "xxxx.staging.arbiter.mongolayer.com:27017",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 1362224,
            "optime" : { "t" : 0, "i" : 0 },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2011-11-11T00:41:20Z"),
            "pingMs" : 0
        }
    ],
    "ok" : 1
}

So it looks like it might just be a case of the oplog rolling over ... even though the update rate looks low.
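
One way to keep an eye on this (a sketch, not from the ticket): compare each secondary's lag against the primary's oplog window; if the lag approaches the window during a burst, the member will go stale (RS102).

// on the primary: oplog window (configured size, first/last event time)
db.printReplicationInfo()
// on the primary: how far behind each secondary currently is
db.printSlaveReplicationInfo()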

Comment by Eliot Horowitz (Inactive) [ 10/Nov/11 ]

Does the secondary go into recovering or stay secondary throughout?
