Sorry for the delay in getting back to you. I've been looking into this, and I think there are a few potential issues here:
- I think the primary effect that we're seeing here is the same as SERVER-18908. At its core, that issue is that the workload on secondaries is less parallel than on the primary, so secondaries lag behind the primary under certain kinds of workloads, particularly very heavy insert workloads. In these situations, if replication on all members stops making progress, or simply falls behind the primary by more than your wtimeout setting, the operations will time out and report as "failed," even though they succeed on the primary and a "minority" of the set (see the example after this list). Assuming your oplog has sufficient capacity, the secondaries should be able to catch up once you stop the operations, without restarting, although restarting may have the same ultimate effect.
- Reviewing your replica set configuration, it looks like your set has 3 members with data, 1 arbiter, and 1 configured member that is offline. As a result, "majority" is 3 nodes, which means that all writes have to propagate to all data-bearing nodes; any delay in replication or any networking errors will therefore also cause write concerns to time out and fail.
- Any activity on the secondaries may exacerbate the underlying problem (i.e. the first point above), making it easier for replication to fall behind and lead to failed write concerns.
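To illustrate the failure mode, here is roughly what a write with your settings looks like from the mongo shell when the secondaries are lagging; the collection and document are made up, and the exact output will vary:

db.example.insert(
    { payload: "test" },
    { writeConcern: { w: "majority", wtimeout: 15000 } }
)
// With lagging secondaries, the shell returns something like:
//   WriteResult({
//       "nInserted" : 1,                      // the document WAS written on the primary
//       "writeConcernError" : {
//           "code" : 64,
//           "errInfo" : { "wtimeout" : true },
//           "errmsg" : "waiting for replication timed out"
//       }
//   })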
I believe that the combination of these issues explains the effect that you're seeing. I'm going to go ahead and close this issue, and encourage you to track the progress of SERVER-18908. In the meantime you might also consider:
- limiting the rate and size of insert operations when your workload is very heavy and you're doing data-loading operations.
- reconfiguring your replica set to either remove both the arbiter and the missing member, or replace the unreachable member of the set with an up-to-date member (see the sketch after this list).
- upgrading, if you have not already, to 3.0.6, which is the latest stable release in the 3.0 series. It includes a large number of fixes to WiredTiger and replication that may improve your experience with this issue.
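A rough sketch of what the reconfiguration might look like from the mongo shell on the primary; the offline member's hostname and the member index below are placeholders, so check rs.conf() first and substitute your own values:

// Option A: remove the arbiter and the unreachable member entirely.
rs.remove("mongodb-alpha-arbiter:27017")
rs.remove("mongodb-alpha-offline:27017")    // placeholder for the unreachable member's hostname

// Option B: point the unreachable member's slot at an up-to-date replacement.
cfg = rs.conf()
cfg.members[4].host = "mongodb-alpha-replacement:27017"    // index and hostname are placeholders
rs.reconfig(cfg)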
I hope this helps, and thanks again for your patience.
Regards,
sam
Hello Sam,
Sorry for the late reply.
- Could you also include another copy of the rs.status() output while you're experiencing this issue?
I have uploaded the rs.status() and db.currentOp() output to MongoDB Support File Upload. Each log folder within a server's directory corresponds to a single incident.
For example, mongodb-alpha-1/log1, mongodb-alpha-2/log1, and mongodb-alpha-3/log1 together make up one incident:
replica-lag
├── mongodb-alpha-1 (Secondary)
│   ├── log1
│   │   └── op.log
│   ├── log2
│   │   └── op.log
│   ├── log3
│   │   ├── op.log
│   │   └── rs.status.log
│   └── log4
│       ├── op2.log
│       ├── op.log
│       ├── rs2.status.log
│       └── rs.status.log
├── mongodb-alpha-2 (Secondary)
│   ├── log1
│   │   └── op.log
│   ├── log2
│   │   └── op.log
│   ├── log3
│   │   ├── op.log
│   │   └── rs.status.log
│   └── log4
│       ├── op2.log
│       ├── op.log
│       ├── rs2.status.log
│       └── rs.status.log
└── mongodb-alpha-3 (Primary)
    ├── log1
    │   ├── op.log
    │   └── rs.status.log
    ├── log2
    │   ├── op.log
    │   └── rs.status.log
    ├── log3
    │   ├── op.log
    │   └── rs.status.log
    └── log4
        ├── op.log
        └── rs.status.log
- I want to confirm that all members of this replica set are running MongoDB 3.0.5. Which storage engine are you using?
mongodb-alpha-1, mongodb-alpha-2, mongodb-alpha-3, and mongodb-alpha-arbiter are running MongoDB 3.0.5, and mongodb-alpha-backup is running MongoDB 3.0.4.
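For reference, this is how we can confirm the version and storage engine on each member from the shell (a generic check, not output from our deployment):

db.version()
db.serverStatus().storageEngine    // the "name" field is "wiredTiger" or "mmapv1" on 3.0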
- Which MongoDB Driver are you using, and at what version?
Node.js driver, version 1.4.38.
- Are you performing any operations directly on the secondary members of the set?
Yes, we do, but as you can see from the logs there was nothing running on the secondary servers at the time.
- I'm interested in any other potentially resource-intensive operations that may be running concurrently. For example, building indexes, aggregation and map-reduce jobs, and open cursors with the noTimeout option may all consume resources and interfere with write concern responses.
Yes, there are a few aggregation queries that run on the secondary instances.
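If it helps, this is the kind of check I can run on a secondary the next time it happens, to see whether those aggregations overlap with the lag (a sketch; the 5-second threshold is arbitrary):

db.currentOp({ "secs_running": { "$gt": 5 } })    // operations that have been running longer than 5 seconds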
- Are other operations slow while the write concern wait increases, or are only the operations waiting on write concern affected?
As far as I know, only the write concern wait increases.
- Do write concerns eventually return, or do some operations get stuck and remain even after they reach the timeout?
The write concerns do return after the timeout, but that creates a new problem: the application is prone to creating duplicate data.
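As I understand it, when the wtimeout hits, the result carries a writeConcernError but the document has still been written on the primary, so a blind retry is what creates the duplicates. A rough illustration from the shell (the collection, fields, and handling are made up):

var res = db.example.insert(
    { payload: "test", createdAt: new Date() },
    { writeConcern: { w: "majority", wtimeout: 15000 } }
)
if (res.hasWriteError()) {
    // the write failed on the primary -- retrying is safe
    print("write failed: " + res.getWriteError().errmsg)
} else if (res.hasWriteConcernError()) {
    // the write applied on the primary but was not majority-acknowledged in time;
    // retrying it as-is would insert a duplicate document
    print("write concern timed out: " + res.getWriteConcernError().errmsg)
}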
FYI, we were experiencing the same issue when we were running 3.0.4, but I didn't report it because I thought it was related to the following existing issues
Hello Sam,
Thank you for looking into this issue.
1. What write concern are you specifying on these operations? Could you provide an example operation with a write concern?
We use
{ w: "majority", wtimeout: 15000 }
write concerns for all of our operations, so it's not really possible to list all the commands.
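For context, here is roughly how we attach that write concern with the 1.4.x Node.js driver; the connection string, collection, and document below are made up for illustration:

var MongoClient = require('mongodb').MongoClient;

MongoClient.connect('mongodb://mongodb-alpha-3:27017/test', function (err, db) {
    if (err) throw err;
    db.collection('example').insert(
        { payload: 'test' },
        { w: 'majority', wtimeout: 15000 },    // the write concern we use everywhere
        function (err, result) {
            // with lagging secondaries, err is a write concern timeout here
            // even though the document was written on the primary
            console.log(err, result);
            db.close();
        }
    );
});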
2. Could you describe the configuration of your replica set? The output of rs.status() and rs.config() would be useful to this end. If your deployment is sharded, including the output of sh.status() would also be helpful here.
tawk-alpha:SECONDARY> rs.status()
{
    "set" : "tawk-alpha",
    "date" : ISODate("2015-07-31T10:03:13.795Z"),
    "myState" : 2,
    "syncingTo" : "mongodb-alpha-3:27017",
    "members" : [
        {
            "_id" : 3,
            "name" : "mongodb-alpha-backup:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 171,
            "optime" : Timestamp(1438336992, 220),
            "optimeDate" : ISODate("2015-07-31T10:03:12Z"),
            "lastHeartbeat" : ISODate("2015-07-31T10:03:12.331Z"),
            "lastHeartbeatRecv" : ISODate("2015-07-31T10:03:11.897Z"),
            "pingMs" : 0,
            "syncingTo" : "mongodb-alpha-3:27017",
            "configVersion" : 15
        },
        {
            "_id" : 4,
            "name" : "mongodb-alpha-arbiter:27017",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 9071,
            "lastHeartbeat" : ISODate("2015-07-31T10:03:12.271Z"),
            "lastHeartbeatRecv" : ISODate("2015-07-31T10:03:11.813Z"),
            "pingMs" : 0,
            "configVersion" : 15
        },
        {
            "_id" : 7,
            "name" : "mongodb-alpha-2:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 8963,
            "optime" : Timestamp(1438336993, 634),
            "optimeDate" : ISODate("2015-07-31T10:03:13Z"),
            "lastHeartbeat" : ISODate("2015-07-31T10:03:13.654Z"),
            "lastHeartbeatRecv" : ISODate("2015-07-31T10:03:13.286Z"),
            "pingMs" : 0,
            "syncingTo" : "mongodb-alpha-3:27017",
            "configVersion" : 15
        },
        {
            "_id" : 8,
            "name" : "mongodb-alpha-1:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 9071,
            "optime" : Timestamp(1438336993, 767),
            "optimeDate" : ISODate("2015-07-31T10:03:13Z"),
            "syncingTo" : "mongodb-alpha-3:27017",
            "configVersion" : 15,
            "self" : true
        },
        {
            "_id" : 9,
            "name" : "mongodb-alpha-3:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 9071,
            "optime" : Timestamp(1438336993, 631),
            "optimeDate" : ISODate("2015-07-31T10:03:13Z"),
            "lastHeartbeat" : ISODate("2015-07-31T10:03:13.651Z"),
            "lastHeartbeatRecv" : ISODate("2015-07-31T10:03:11.920Z"),
            "pingMs" : 0,
            "electionTime" : Timestamp(1438293267, 1),
            "electionDate" : ISODate("2015-07-30T21:54:27Z"),
            "configVersion" : 15
        }
    ],
    "ok" : 1
}
3. Does this affect all operations or only some operations? Can you give an example of another operation that doesn't wait for write concern?
It happens to all of the operations, but some are more prominent because of their high frequency.
4. Could you provide the full output of the currentOp operation? The additional context may be useful.
Sure, I will grab it the next time it happens and update this thread.
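I will most likely capture it from the shell like this, so nothing is truncated (a sketch):

printjson(db.currentOp(true))    // full output, including idle connections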
Shankar