[SERVER-20731] Write commands can return writeConcernError even if nothing was written Created: 02/Oct/15  Updated: 02/Oct/15  Resolved: 02/Oct/15

Status: Closed
Project: Core Server
Component/s: Write Ops
Affects Version/s: 3.1.9
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Bernie Hackett Assignee: Bernie Hackett
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

This appears to be a regression in the latest nightly. I've reproduced it here using master at 4372e7b826bbaff94d995c0a37d9a8bf29378e48. The following Python REPL session shows the problem with a three-node replica set where the primary is running on port 27017:

>>> c = pymongo.MongoClient(port=27018)
>>> c.admin.command('configureFailPoint', 'rsSyncApplyStop', mode='alwaysOn')
{u'ok': 1.0}
>>> c = pymongo.MongoClient(port=27017)
>>> c.test.command('insert', 'test', documents=[{'_id': 1}], writeConcern={'w': 3, 'wtimeout': 1}, check=False)
{u'opTime': Timestamp(1443745018, 2), u'ok': 1, u'electionId': ObjectId('560dcb04ffffffffffffffff'), u'writeConcernError': {u'code': 64, u'errmsg': u'waiting for replication timed out', u'errInfo': {u'wtimeout': True}}, u'n': 1}
>>> c.test.command('insert', 'test', documents=[{'_id': 1}, {'_id': 2}], ordered=True, writeConcern={'w': 3, 'wtimeout': 1}, check=False)
{u'opTime': Timestamp(1443745018, 2), u'n': 0, u'electionId': ObjectId('560dcb04ffffffffffffffff'), u'writeErrors': [{u'index': 0, u'code': 11000, u'errmsg': u'E11000 duplicate key error collection: test.test index: _id_ dup key: { : 1 }'}], u'ok': 1, u'writeConcernError': {u'code': 64, u'errmsg': u'waiting for replication timed out', u'errInfo': {u'wtimeout': True}}}

Since the second insert command is ordered and the first document causes a duplicate key error, nothing is written. With nothing written, there should be no replication to time out.

Note that the replica set members must be started with --setParameter enableTestCommands=1 for the fail point command to be available.
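
For reference, the REPL session above condenses into the following script. This is only a reproduction sketch, assuming a local three-node replica set (primary on 27017, one secondary on 27018) whose members were started with enableTestCommands, and PyMongo 3.x:

# Reproduction sketch; ports and fail point usage mirror the REPL session above.
import pymongo

# Pause oplog application on the secondary so 'w': 3 can never be satisfied.
secondary = pymongo.MongoClient(port=27018)
secondary.admin.command('configureFailPoint', 'rsSyncApplyStop', mode='alwaysOn')

primary = pymongo.MongoClient(port=27017)

# First insert succeeds locally but times out waiting for 'w': 3.
print(primary.test.command('insert', 'test', documents=[{'_id': 1}],
                           writeConcern={'w': 3, 'wtimeout': 1}, check=False))

# Second insert is ordered and its first document is a duplicate, so
# nothing is written -- yet the reply still carries writeConcernError.
print(primary.test.command('insert', 'test',
                           documents=[{'_id': 1}, {'_id': 2}], ordered=True,
                           writeConcern={'w': 3, 'wtimeout': 1}, check=False))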



 Comments   
Comment by Bernie Hackett [ 02/Oct/15 ]

OK, after chatting with milkie I'm starting to understand why this behavior change makes sense. The second write operation has to read the _id index to decide whether there is a duplicate key error. In this case the _id that would be duplicated by the second write is still not fully committed, since we intentionally broke replication with the fail point, so the second write that relies on it must also return writeConcernError.

I'm going to adjust the test case in PyMongo. This new behavior is intentional.
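
A sketch of the kind of check the adjusted test might make (the helper name is illustrative, not the actual PyMongo test code):

# Illustrative helper, not the actual PyMongo test code.
def check_ordered_insert_reply(reply):
    # The ordered batch stopped at the duplicate key, so nothing was written.
    if 'writeErrors' in reply:
        assert reply['writeErrors'][0]['code'] == 11000  # duplicate key
    # The reply may still carry writeConcernError: the server waits for the
    # last successful write to replicate even when this batch wrote nothing.
    if 'writeConcernError' in reply:
        assert reply['writeConcernError']['code'] == 64  # wtimeout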

Comment by Bernie Hackett [ 02/Oct/15 ]

"In the case of an error, the write concern waits for the last successful write to replicate as specified."

That makes sense, but in this case both insert operations were done with 'w': 3. Why would the second command return writeConcernError when it writes no data and the first write already returned writeConcernError? Are you saying the second insert command returns writeConcernError again for the first insert command's write, because replication to both secondaries will never happen? If so, that seems wrong: if a replica set member gets into a bad state, every write command will start returning writeConcernError after the first failure, regardless of its w value.

Comment by Eric Milkie [ 02/Oct/15 ]

In order to ensure that w:majority writes can be followed by read concern majority reads that reflect the same data, we made a change to the write commands such that they always attempt to honor the write concern, even in the face of errors that result from pre-existing data (for example, duplicate key errors, but not parse errors).
In the case of an error, the write concern waits for the last successful write to replicate as specified. This guarantees that the data that triggered the error will have replicated and appeared in the read-committed snapshot, for writes specified with writeConcern:majority.
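
A sketch of the guarantee this provides, assuming PyMongo 3.2+ against a healthy replica set (client setup and names are illustrative):

# Sketch of the read-your-majority-writes guarantee described above.
from pymongo import MongoClient
from pymongo.errors import DuplicateKeyError
from pymongo.read_concern import ReadConcern
from pymongo.write_concern import WriteConcern

client = MongoClient(port=27017)
majority_writes = client.test.get_collection(
    'test', write_concern=WriteConcern(w='majority'))
try:
    majority_writes.insert_one({'_id': 1})
except DuplicateKeyError:
    # Even though the insert failed, the server waited for the last
    # successful write to replicate, so the pre-existing {'_id': 1} is
    # now in the majority-committed snapshot.
    pass

majority_reads = client.test.get_collection(
    'test', read_concern=ReadConcern('majority'))
print(majority_reads.find_one({'_id': 1}))  # sees the committed document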

Comment by Bernie Hackett [ 02/Oct/15 ]

Disabling the fail point appears to fix the problem, so the regression seems to be related to the fail point itself. Though it's also possible that with the fail point disabled, the no-op replication succeeds in time to avoid triggering writeConcernError.
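
For completeness, the fail point from the original session can be switched back off the same way it was enabled (mode='off' disables a fail point):

import pymongo

# Resume oplog application on the secondary used in the REPL session above.
secondary = pymongo.MongoClient(port=27018)
secondary.admin.command('configureFailPoint', 'rsSyncApplyStop', mode='off')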
