[SERVER-33879] config.transactions is not updated during startup replication recovery
Created: 14/Mar/18  Updated: 29/Oct/23  Resolved: 04/Apr/18

Status: Closed
Project: Core Server
Component/s: Replication, Sharding
Affects Version/s: 3.6.3, 3.7.3
Fix Version/s: 3.6.5, 3.7.4

Type: Bug
Priority: Major - P3
Reporter: Judah Schvimer
Assignee: Randolph Tan
Resolution: Fixed
Votes: 0
Labels: rollback-functional
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-33287 Create passthrough that kills the pri... Closed
Related
related to SERVER-35654 rollback_transaction_table.js fails e... Closed
related to SERVER-33884 Add retryable writes to kill secondar... Closed
is related to SERVER-32334 Update startup/rollback recovery to u... Closed
is related to SERVER-34291 Disable transactions in periodic kill... Closed
Backwards Compatibility: Fully Compatible
Backport Requested: v3.6
Sprint: Sharding 2018-04-09
Participants:
Linked BF Score: 56

 Description   

Startup replication recovery calls syncApply, which does not internally call Session::addOpsForReplicatingTxnTable. As a result, the transactions table (config.transactions) can be missing entries if the server crashes before applying an entire batch.
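
The failure mode can be illustrated with a toy model. This is only a sketch in Python, not the server's C++ code: `OplogEntry`, `replay`, and the field names below are hypothetical stand-ins, and the "transactions table" is a plain dict. It assumes that recovery replays oplog entries one at a time and compares a replay that only writes user data with one that also records the latest write per session.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OplogEntry:
    """Hypothetical, heavily simplified oplog entry."""
    ts: int                            # simplified optime
    doc_id: str                        # _id of the document being written
    value: int                         # new value for that document
    session_id: Optional[str] = None   # set only for retryable writes
    txn_number: Optional[int] = None


def replay(oplog, update_txn_table):
    """Replay entries one at a time, as a recovery pass might after a crash.

    When update_txn_table is False, only user data is written, which models
    a per-operation apply path that never touches the transactions table.
    """
    data, txn_table = {}, {}
    for entry in oplog:
        data[entry.doc_id] = entry.value
        if update_txn_table and entry.session_id is not None:
            current = txn_table.get(entry.session_id)
            candidate = (entry.txn_number, entry.ts)
            if current is None or candidate >= (current["txnNum"], current["lastWriteOpTime"]):
                txn_table[entry.session_id] = {
                    "txnNum": entry.txn_number,
                    "lastWriteOpTime": entry.ts,
                }
    return data, txn_table


oplog = [
    OplogEntry(ts=1, doc_id="a", value=1, session_id="s1", txn_number=5),
    OplogEntry(ts=2, doc_id="b", value=2),
    OplogEntry(ts=3, doc_id="a", value=3, session_id="s1", txn_number=6),
]

data_buggy, table_buggy = replay(oplog, update_txn_table=False)
data_fixed, table_fixed = replay(oplog, update_txn_table=True)
```

In this model both replays end with identical user data, but only the second one leaves a session record behind, so a retried write from session `s1` could not be recognized as already executed after the first kind of replay.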



 Comments   
Comment by Githook User [ 26/Apr/18 ]

Author: Randolph Tan (renctan) <randolph@10gen.com>

Message: SERVER-33879 config.transactions is not updated during startup replication recovery

(cherry picked from commit 3cd5682db00ced94fb79fcec0a9ceca22c48f4d9)
Branch: v3.6
https://github.com/mongodb/mongo/commit/d313bab7553af2ee64c4c66ef6e61ed4f3cbcf8b

Comment by Githook User [ 04/Apr/18 ]

Author: Randolph Tan (renctan) <randolph@10gen.com>

Message: SERVER-33879 config.transactions is not updated during startup replication recovery
Branch: master
https://github.com/mongodb/mongo/commit/3cd5682db00ced94fb79fcec0a9ceca22c48f4d9

Comment by Gregory McKeon (Inactive) [ 03/Apr/18 ]

renctan, can you make this your next work item? It's blocking repl from testing recoverable rollback's correctness.

Comment by Judah Schvimer [ 14/Mar/18 ]

That suite does not kill nodes so replication recovery does not come into play. I filed SERVER-33884 with jack.mulrow and max.hirschhorn's input.

Comment by Randolph Tan [ 14/Mar/18 ]

judah.schvimer, we have the retryable passthrough suite.

Comment by Judah Schvimer [ 14/Mar/18 ]

Do we have retryable writes/transactions testing in jstests/core/? I'm trying to think about how to improve our test coverage of this such that the periodic kill secondaries passthrough would catch this.

Comment by Randolph Tan [ 14/Mar/18 ]

kaloian.manassiev, I don't think so. The issue is that the code calls syncApply (which applies a single operation), while the logic for updating config.transactions, both before and after the SERVER-32445 change, lives in multiApply (which applies batches of operations from different namespaces).
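
To make the single-op versus batch distinction concrete, here is a rough Python sketch of a batch-level step that derives transactions-table updates from the operations in a batch. It is an assumption about the general shape of that logic, not the actual multiApply or Session::addOpsForReplicatingTxnTable code; `with_txn_table_updates` and the dict fields are hypothetical.

```python
from typing import Dict, List


def with_txn_table_updates(batch: List[dict]) -> List[dict]:
    """Return the batch plus one derived config.transactions update per session.

    Hypothetical model: each op is a dict with "ts" and "ns", and ops that
    belong to a session also carry "lsid" and "txnNumber".
    """
    latest: Dict[str, dict] = {}
    for op in batch:
        sid = op.get("lsid")
        if sid is None:
            continue  # not part of a session, nothing to record
        prev = latest.get(sid)
        if prev is None or (op["txnNumber"], op["ts"]) > (prev["txnNumber"], prev["ts"]):
            latest[sid] = op

    derived = [
        {
            "ns": "config.transactions",
            "op": "u",
            "o2": {"_id": sid},
            "o": {"_id": sid, "txnNum": op["txnNumber"], "lastWriteOpTime": op["ts"]},
        }
        for sid, op in latest.items()
    ]
    return list(batch) + derived


batch = [
    {"ts": 1, "ns": "test.a", "lsid": "s1", "txnNumber": 1},
    {"ts": 2, "ns": "test.b"},
    {"ts": 3, "ns": "test.a", "lsid": "s1", "txnNumber": 2},
    {"ts": 4, "ns": "test.c", "lsid": "s2", "txnNumber": 7},
]
expanded = with_txn_table_updates(batch)
```

A code path that applies each operation individually never runs a step like this, so in this model it would write the four data operations and none of the two derived session updates.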

Comment by Kaloian Manassiev [ 14/Mar/18 ]

renctan, could this have been introduced by your fix for SERVER-32445?

Comment by Judah Schvimer [ 14/Mar/18 ]

This was discovered while debugging recoverable rollback, which makes this far more likely to occur. This might be fixed by SERVER-32334, benety.goh.
