[SERVER-42630] Updates from user-executed "applyOps" can fail in initial sync
Created: 05/Aug/19 | Updated: 29/Oct/23 | Resolved: 13/Aug/19
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.3.1 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | A. Jesse Jiryu Davis | Assignee: | A. Jesse Jiryu Davis |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Repl 2019-08-12, Repl 2019-08-26 |
| Participants: | |
| Linked BF Score: | 0 |
| Description |
|
When a user executes applyOps with an update on the primary, the update is treated as an upsert by default. When the applyOps oplog entry is replayed on the secondary during initial sync, it should therefore also be treated as an upsert. However, it is not. Fetching missing documents masks this bug for now (see |
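The mismatch described above can be illustrated with a toy model. This is a minimal sketch, not server code: the dict-based "collection", the `apply_update` helper, and `MissingDocumentError` are hypothetical names introduced only for illustration, and the update entry is a simplified imitation of an oplog update.

```python
class MissingDocumentError(Exception):
    """Raised when a non-upsert update finds no matching document."""


def apply_update(collection, op, upsert):
    """Apply a simplified oplog-style update to a dict keyed by _id.

    Hypothetical helper: `op` imitates the shape
    {"op": "u", "o2": {"_id": ...}, "o": {"$set": {...}}}.
    """
    doc_id = op["o2"]["_id"]
    fields = op["o"]["$set"]
    if doc_id not in collection:
        if not upsert:
            raise MissingDocumentError(f"no document with _id={doc_id!r}")
        # Upsert: create the document before applying the modification.
        collection[doc_id] = {"_id": doc_id}
    collection[doc_id].update(fields)


update = {"op": "u", "ns": "test.coll", "o2": {"_id": 1}, "o": {"$set": {"x": 2}}}

# Primary: the document does not exist, but the update is an upsert by default.
primary = {}
apply_update(primary, update, upsert=True)

# Secondary in initial sync: the same update replayed without upsert semantics.
secondary = {}
try:
    apply_update(secondary, update, upsert=False)
    secondary_error = None
except MissingDocumentError as exc:
    secondary_error = str(exc)
```

In this model the primary ends up with the document while the replay on the secondary fails, which is the divergence the ticket describes.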
| Comments |
| Comment by A. Jesse Jiryu Davis [ 13/Aug/19 ] |

Change streams deliberately ignore user-initiated applyOps commands; they only generate events from transactions' applyOps oplog entries. (Change streams require a txnNumber and lsid in an applyOps oplog entry.) Change streams even ignore the individual oplog entries that are generated when a user-initiated applyOps is executed non-atomically: such entries have "fromMigrate: true", and change streams filter them out.
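The filtering rules in this comment can be sketched as a predicate over simplified oplog entries. This is an illustrative assumption, not the change stream implementation: `visible_to_change_stream` is a hypothetical name, and the entries below are trimmed-down imitations of oplog documents.

```python
def visible_to_change_stream(entry):
    """Toy predicate for the filtering rules described in the comment."""
    # Individual entries written by a non-atomic user applyOps
    # carry fromMigrate: true and are filtered out.
    if entry.get("fromMigrate") is True:
        return False
    is_apply_ops = entry.get("op") == "c" and "applyOps" in entry.get("o", {})
    if is_apply_ops:
        # Only transaction applyOps entries carry both txnNumber and lsid.
        return "txnNumber" in entry and "lsid" in entry
    return True


inner_update = {"op": "u", "ns": "test.coll", "o2": {"_id": 1}, "o": {"$set": {"x": 2}}}

user_apply_ops = {"op": "c", "o": {"applyOps": [inner_update]}}
txn_apply_ops = {
    "op": "c",
    "o": {"applyOps": [inner_update]},
    "txnNumber": 1,
    "lsid": {"id": "session-placeholder"},
}
non_atomic_entry = dict(inner_update, fromMigrate=True)
ordinary_insert = {"op": "i", "ns": "test.coll", "o": {"_id": 3}}

results = [
    visible_to_change_stream(e)
    for e in (user_apply_ops, txn_apply_ops, non_atomic_entry, ordinary_insert)
]
```

Under these assumptions only the transaction applyOps entry and the ordinary insert pass the filter.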
| Comment by Githook User [ 13/Aug/19 ] |

Author: {'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}
Message:
| Comment by Judah Schvimer [ 13/Aug/19 ] |

Another way we could fix the problem of user-initiated applyOps in change streams (if it is one) is to actually log the operations done by the applyOps in the oplog (as an atomic applyOps, but transformed to be accurate and not require upserting), like we do for transactions. I don't think this would have any upgrade/downgrade impact, since it would turn one valid oplog entry into another equivalent valid oplog entry.
| Comment by Githook User [ 13/Aug/19 ] |

Author: {'name': 'Esha Maharishi', 'email': 'esha.maharishi@mongodb.com', 'username': 'EshaMaharishi'}
Message:
| Comment by Githook User [ 13/Aug/19 ] |

Author: {'name': 'A. Jesse Jiryu Davis', 'email': 'jesse@mongodb.com', 'username': 'ajdavis'}
Message: During initial sync, a secondary replays the primary's oplog, including
| Comment by A. Jesse Jiryu Davis [ 13/Aug/19 ] |

I'll check whether we test user-initiated applyOps updates with change streams, or write such a test.
| Comment by Charlie Swanson [ 13/Aug/19 ] |

judah.schvimer I don't know off the top of my head what it would do here. Should be simple enough to test though!
| Comment by Judah Schvimer [ 13/Aug/19 ] |

charlie.swanson, do you know if change streams handle applyOps oplog entries correctly with respect to this bug? If not, it would be worth fixing that so downstream users don't hit this same problem.
| Comment by Siyuan Zhou [ 06/Aug/19 ] |

jesse, I tried out your script on master. After the "applyOps" command, I got an oplog entry for the applyOps command.
The secondary will call into the applyOps command's code path to parse and execute the oplog entry, which uses the same default specified in the IDL as the primary does. If allowAtomic is set to false or atomic application fails, the operations are applied individually, creating the right oplog entries.
It's not clear to me which behavior is a bug for initial sync.
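The two oplog shapes mentioned in this thread (a single applyOps command entry versus one entry per operation) can be sketched as follows. This is a simplified model built only from the comments here: `oplog_entries_for` is a hypothetical helper, the entries omit most real oplog fields, and the `fromMigrate` marker on individual entries follows the change streams comment elsewhere in this ticket.

```python
def oplog_entries_for(ops, applied_atomically):
    """Return toy oplog entries for a user-initiated applyOps.

    Hypothetical helper: atomic application logs one command entry
    wrapping all operations; individual application logs one entry
    per operation, marked fromMigrate: true.
    """
    if applied_atomically:
        return [{"op": "c", "o": {"applyOps": list(ops)}}]
    return [dict(op, fromMigrate=True) for op in ops]


ops = [
    {"op": "i", "ns": "test.coll", "o": {"_id": 1, "x": 1}},
    {"op": "u", "ns": "test.coll", "o2": {"_id": 1}, "o": {"$set": {"x": 2}}},
]

atomic_entries = oplog_entries_for(ops, applied_atomically=True)
individual_entries = oplog_entries_for(ops, applied_atomically=False)
```

In the atomic case the secondary has to interpret the wrapped update itself, which is where the upsert default matters.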
| Comment by A. Jesse Jiryu Davis [ 06/Aug/19 ] |

Thanks Eric. I have no intention of changing that default.
| Comment by Eric Milkie [ 06/Aug/19 ] |

I think they are upserts by default because replication application used to do all updates as upserts by default. We eventually changed the latter without changing the former.
| Comment by A. Jesse Jiryu Davis [ 06/Aug/19 ] |

I mean the latter: a user-initiated "applyOps" command. Interestingly, applyOps updates are upserts by default:
| Comment by Siyuan Zhou [ 06/Aug/19 ] |
|
By "applyOps" with upsert, do you mean an applied oplog entry with the "b" field or an "applyOps" command with some application mode that converts the update to an upsert? |