- Type: Bug
- Resolution: Done
- Priority: Critical - P2
- Affects Version/s: 3.0.1
- Component/s: Replication, WiredTiger
- None
- Fully Compatible
- ALL
- RPL 1 04/03/15, RPL 2 04/24/15
ISSUE SUMMARY
During replication and/or initial sync, when using the WiredTiger storage engine, a replica set member may terminate with a fatal assertion about a WriteConflictException. This assertion shuts down the server, causing replication or initial sync to fail.
This error is uncommon, and how often it occurs may depend on the application's workload or on the user data.
USER IMPACT
A replica set member may terminate during replication and/or initial sync with a fatal assertion, after which the member must be restarted. In certain low-availability configurations, this issue may affect the replica set's ability to maintain a primary member that can accept writes.
WORKAROUNDS
N/A
AFFECTED VERSIONS
MongoDB 3.0.0 through 3.0.4.
FIX VERSION
The fix is included in the 3.0.5 production release.
RESOLUTION DETAILS
WriteConflictExceptions are now handled during the applyOps stage of replication, including during initial sync, so the system retries the operation until the WiredTiger operation completes successfully.
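The retry behavior described above can be sketched roughly as follows. This is a minimal illustration in Python, not the server's actual code: `WriteConflict`, `apply_with_retry`, and `make_flaky_apply` are hypothetical names invented for this sketch, and the real fix may differ in details such as backoff between attempts.

```python
class WriteConflict(Exception):
    """Hypothetical stand-in for a transient storage-engine write conflict."""


def apply_with_retry(apply_op, op):
    """Apply `op`, retrying for as long as a WriteConflict is raised.

    Returns the number of attempts that were needed.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            apply_op(op)
            return attempts
        except WriteConflict:
            # Transient conflict: nothing was committed, so try again
            # instead of treating the error as fatal.
            continue


def make_flaky_apply(conflicts, applied):
    """Build an apply function that conflicts `conflicts` times, then succeeds."""
    remaining = {"n": conflicts}

    def apply_op(op):
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise WriteConflict("simulated conflict")
        applied.append(op)

    return apply_op
```

For example, calling `apply_with_retry(make_flaky_apply(2, applied), op)` with an empty list `applied` makes three attempts and leaves `op` in the list exactly once. The point of the design is that a write conflict is treated as a retryable condition rather than as a reason to abort the process.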
Original description
After successfully replicating the collections and indexes from MMAPv1 to the WiredTiger storage engine on our replica set of 2 servers, the server crashes with the following output:
2015-03-23T12:49:36.500+0000 I INDEX [rsSync] build index done. scanned 31 total records. 0 secs
2015-03-23T12:49:36.502+0000 I REPL [rsSync] initial sync data copy, starting syncup
2015-03-23T12:49:36.527+0000 I REPL [rsSync] oplog sync 1 of 3
2015-03-23T12:49:37.106+0000 I REPL [ReplicationExecutor] syncing from: primary:27017
2015-03-23T12:49:37.111+0000 I REPL [SyncSourceFeedback] replset setting syncSourceFeedback to primary:27017
2015-03-23T12:50:10.804+0000 I REPL [repl writer worker 14] replication update of non-mod failed: { ts: Timestamp 1427114721000|133, h: -8520449917638273792, v: 2, op: "u", ns: "...REMOVED...", o2: { ...REMOVED... } }
2015-03-23T12:50:10.807+0000 I REPL [repl writer worker 14] replication info adding missing object
2015-03-23T12:50:10.882+0000 E REPL [repl writer worker 14] writer worker caught exception: :: caused by :: 112 WriteConflict on: { ts: Timestamp 1427114721000|133, h: -8520449917638273792, v: 2, op: "u", ns: "...REMOVED...", o2: { ...REMOVED... }
2015-03-23T12:50:10.883+0000 I - [repl writer worker 14] Fatal Assertion 16361
2015-03-23T12:50:10.883+0000 I - [repl writer worker 14] ***aborting after fassert() failure
- is duplicated by
  SERVER-19518 mongod crashes with WriteConflictException when writing oplog on secondaries (Closed)
- is related to
  SERVER-16994 Handle WriteConflictException when writing oplog on secondaries (Closed)