[SERVER-20731] Write commands can return writeConcernError even if nothing was written Created: 02/Oct/15 Updated: 02/Oct/15 Resolved: 02/Oct/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Write Ops |
| Affects Version/s: | 3.1.9 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Bernie Hackett | Assignee: | Bernie Hackett |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Operating System: | ALL |
| Participants: |
| Description |
|
This appears to be a regression in the latest nightly. I've reproduced it here using master at 4372e7b826bbaff94d995c0a37d9a8bf29378e48. The following python repl session shows the problem with a three node replica set where the primary is running on port 27017:
Since the second insert command is ordered, and the first document causes a duplicate key error, nothing is written. Since nothing is written there should be no replication to time out. Note that the replica set members must be started with --setParameter enableTestCommands=1 |
| Comments |
| Comment by Bernie Hackett [ 02/Oct/15 ] |
|
OK, after chatting with milkie I'm starting to understand why this behavior change makes sense. The second write operation has to read the _id index to make a decision about duplicate key error. In this case the _id that would be duplicated by the second write is still not fully committed, since we intentionally broke replication with the fail point, so the second write that relies on it also must return writeConcernError. I'm going to adjust the test case in PyMongo. This new behavior is intentional. |
| Comment by Bernie Hackett [ 02/Oct/15 ] |
That makes sense, but in this case both insert operations were done with 'w': 3. Why would each return writeConcernError when the second command writes no data and the first write already returned writeConcernError? Are you saying the second insert command is returning writeConcernError again for the first insert command because replication to both secondaries will never happen? If so, that seems wrong. If a replica set member gets into a bad state, every write command will start returning writeConcernError after the first failure, regardless of w value. |
| Comment by Eric Milkie [ 02/Oct/15 ] |
|
In order to ensure that w:majority writes can be followed by read concern majority reads that reflect the same data, we made a change to the write commands such that they always attempt to honor write concern, even in the face of errors that result from pre-existing data. (for example, that means errors like duplicate-key-error, and not like parse errors) |
| Comment by Bernie Hackett [ 02/Oct/15 ] |
|
Disabling the fail point appears to fix the problem, so it would appear the regression is related to the fail point itself. Though it's also possible that without the fail point enabled the no op replication is succeeding in time to not trigger writeConcernError. |