[SERVER-40321] Rolling back a prepared transaction on a capped collection leads to an invariant failure Created: 25/Mar/19 Updated: 29/Oct/23 Resolved: 17/Apr/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Concurrency, Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.1.11 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Max Hirschhorn | Assignee: | Dianna Hohensee (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | prepare_durability, rbfz, txn_storage |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Steps To Reproduce: | |
| Sprint: | Storage NYC 2019-04-08, Storage NYC 2019-04-22 |
| Participants: | |
| Linked BF Score: | 0 |
| Description |
|
Inserting a document into a capped collection acquires a RESOURCE_METADATA lock.
However, the invariant in LockerImpl::saveLockStateAndUnlock() claims that resource metadata locks never need to be saved.
|
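The mismatch described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyLocker`, `ResourceType`, `simulateInsert`, and `canYieldLocks` are hypothetical names invented for illustration and are not the real `LockerImpl` API. An exception stands in for the fatal invariant, and the sketch does not model rollback itself.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for illustration; NOT the real MongoDB types.
enum class ResourceType { Global, Database, Collection, Metadata };

struct HeldLock {
    ResourceType type;
};

// Toy locker whose saveLockStateAndUnlock() refuses to run while a
// metadata lock is held, mirroring the invariant described in the ticket.
class ToyLocker {
public:
    void lock(ResourceType type) { _held.push_back({type}); }

    // Releases all held locks and returns them so they can be restored later.
    // Throws (in place of a fatal invariant) if a metadata lock is held.
    std::vector<HeldLock> saveLockStateAndUnlock() {
        for (const HeldLock& held : _held) {
            if (held.type == ResourceType::Metadata) {
                throw std::logic_error("invariant: metadata locks are never saved");
            }
        }
        std::vector<HeldLock> saved;
        saved.swap(_held);
        return saved;
    }

    std::size_t heldCount() const { return _held.size(); }

private:
    std::vector<HeldLock> _held;
};

// Simulates the lock footprint of one insert. Assumption taken from the
// ticket: an insert into a capped collection also takes a metadata lock.
inline void simulateInsert(ToyLocker& locker, bool collectionIsCapped) {
    locker.lock(ResourceType::Global);
    locker.lock(ResourceType::Database);
    locker.lock(ResourceType::Collection);
    if (collectionIsCapped) {
        locker.lock(ResourceType::Metadata);
    }
}

// Returns true if the lock state could be saved, false if the invariant fired.
inline bool canYieldLocks(ToyLocker& locker) {
    try {
        locker.saveLockStateAndUnlock();
        return true;
    } catch (const std::logic_error&) {
        return false;
    }
}
```

In this model, calling `simulateInsert(locker, false)` followed by `canYieldLocks(locker)` succeeds and leaves no locks held, while the same sequence with `collectionIsCapped` set to `true` trips the stand-in invariant and leaves all four locks in place.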
| Comments |
| Comment by Githook User [ 17/Apr/19 ] |
|
Author: Dianna (username: DiannaHohensee, email: dianna.hohensee@10gen.com) Message: |
| Comment by Dianna Hohensee (Inactive) [ 05/Apr/19 ] |
|
Okay, checked with Eric for any perf concerns about doing this on the write path, and there are none. It shall be as requested. |
| Comment by Judah Schvimer [ 04/Apr/19 ] |
|
alyson.cabral would prefer that this check happen as early as possible. Collection::isCapped already exists and, unlike temp, cannot change. We also want this to be prohibited for all transactions on shard-servers, not just those that go through prepare. |
| Comment by Dianna Hohensee (Inactive) [ 04/Apr/19 ] |
|
geert.bosch, do we want to advocate for checking at prepare time, like we did in |
| Comment by Judah Schvimer [ 02/Apr/19 ] |
|
After discussing with alyson.cabral, tess.avitabile, and kaloian.manassiev, we will error on the first statement in a mongodb transaction that touches a capped collection on shard-servers only (so any transaction that could be prepared). Replica sets that aren't shard-servers will still be allowed to touch capped collections. |
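The rule agreed on in this comment can be sketched as a single early check. This is a minimal illustration, not the committed fix: `StatementContext`, `Status`, and `checkCappedCollectionInTransaction` are hypothetical names, and only the error name `OperationNotSupportedInTransaction` comes from this thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical error codes; only the name OperationNotSupportedInTransaction
// is taken from the ticket discussion.
enum class ErrorCode { OK, OperationNotSupportedInTransaction };

struct Status {
    ErrorCode code;
    std::string reason;
    bool isOK() const { return code == ErrorCode::OK; }
};

// Hypothetical per-statement facts the check needs.
struct StatementContext {
    bool inMultiDocumentTransaction;  // statement runs inside a transaction
    bool isShardServer;               // node is a shard server
    bool collectionIsCapped;          // target collection is capped
};

// Rejects a transaction statement that touches a capped collection, but only
// on shard servers (where any transaction could later be prepared). Replica
// sets that are not shard servers remain unrestricted in this sketch.
inline Status checkCappedCollectionInTransaction(const StatementContext& ctx) {
    if (ctx.inMultiDocumentTransaction && ctx.isShardServer && ctx.collectionIsCapped) {
        return {ErrorCode::OperationNotSupportedInTransaction,
                "capped collections are not allowed in transactions on shard servers"};
    }
    return {ErrorCode::OK, ""};
}
```

Running the check on every statement, rather than at prepare time, means the first statement that touches a capped collection fails, which matches the preference in this thread for reporting the error as early as possible.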
| Comment by Judah Schvimer [ 01/Apr/19 ] |
|
If alyson.cabral is ok with it, we'll make the behavior as follows: mongos does nothing; mongod returns an OperationNotSupportedInTransaction error when a statement touches a capped collection; and the replica set may implicitly abort the transaction, depending on whether the implementation requires it. This prohibition will apply alike to shardservers and to replica sets that are not part of a sharded cluster. This will be a slight behavior change from 4.0 single replica set transactions, which did allow transactions to touch capped collections. |
| Comment by Kaloian Manassiev [ 01/Apr/19 ] |
|
It is correct that with 2PC commit optimizations a transaction might not become 2PC until after it has accessed a capped collection (or a collection has become capped). I am fine with disabling transactions on capped collections and would rather do it not just for sharding, but for replica sets as well. |
| Comment by Andy Schwerin [ 01/Apr/19 ] |
|
I think it makes sense to fail transactions that touch capped collections. I'd be willing to do it universally, since as Geert points out, the sequential ordering requirement on capped collections makes using them with transactions hazardous. I'd also be willing to just do it in all transactions on sharded clusters – if a shard server detects a transaction accessing a capped collection, it could abort the transaction immediately. Not really worth waiting to see if it might become two-phase, and it's not detectable a priori. You sometimes don't know until commit, or at least until the second write, and that seems like a bad place to report the error. |
| Comment by Judah Schvimer [ 01/Apr/19 ] |
|
kaloian.manassiev, would the above be possible? We could alternatively fail any mongodb transaction that tries to touch a capped collection as soon as it attempts to do so, not just cross-shard ones. We could also do this only for single replica set transactions on shardservers, to leave non-sharded behavior unchanged. If we only want to prevent cross-shard transactions from touching capped collections, we cannot do that until prepare time on mongod (though we potentially could do so earlier on mongos). |
| Comment by Alyson Cabral (Inactive) [ 01/Apr/19 ] |
|
Yes, this is ok behavior. Is it possible to make this even more obvious to end users by not allowing them to start a transaction on a capped collection through a mongos? I try to avoid things that could work in dev and break as soon as someone pushes to production, just due to the placement of chunks. I prefer stricter failing if it is ultimately more obvious to catch. CC:kay.kim |
| Comment by Judah Schvimer [ 29/Mar/19 ] |
|
This lock yielding was added in
geert.bosch's third point makes it clear to me though that the above invariant relaxation won't be sufficient with the guarantees prepared transactions expect to provide. I think banning capped collections in prepared transactions similarly to |
| Comment by Geert Bosch [ 28/Mar/19 ] |
|
I think that cross-shard transactions involving capped collections (other than the oplog) are problematic. There are a few reasons:
Without a significant redesign of capped collections, we cannot support cross-shard operations on them. I think any effort to make them somewhat work should instead be spent on a workable replacement in the form of improvements to TTL collections. We should just add an extra check at prepare time to not allow capped collections. |