[SERVER-43198] Zombie writes from failing $merge should not be able to re-create a collection Created: 06/Sep/19  Updated: 08/Sep/23

Status: Backlog
Project: Core Server
Component/s: Aggregation Framework, Sharding
Affects Version/s: 4.2.0
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: Charlie Swanson Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File attempted_reproduction.js, File zombie_killer.patch
Issue Links:
Depends
depends on SERVER-44252 Delete implicit collection creation l... Closed
Related
related to SERVER-38852 Failing $merge can leave zombie write... Backlog
is related to SERVER-80853 $out on secondary node can produce in... In Code Review
is related to SERVER-42430 Create whitelist of namespaces that a... Closed
Assigned Teams:
Query Execution
Operating System: ALL
Sprint: Query 2019-09-23, Query 2019-10-07, Query 2019-10-21, Query 2019-11-04, QO 2022-08-22, QO 2022-09-05, QO 2022-09-19, QO 2022-10-03, QE 2022-10-31, QE 2022-11-14, QE 2022-11-28, QE 2022-12-12
Participants:

 Description   

SERVER-38852 describes a general problem of lingering operations, which is probably much harder to solve. While investigating it, we realized that $merge's writes allow implicit creation of a collection. This may have been intentional, to get the auto-create behavior for $merge, but it would be more robust to create the collection explicitly, and that should be achievable.
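
To make the distinction concrete, here is a minimal toy model of the difference between implicit and explicit creation. It is only a sketch: ToyCatalog, NamespaceNotFound, run_scenario and the allow_implicit_creation flag are hypothetical names for illustration and are not server code. The drop step stands in for whatever removes the target collection while a lingering ("zombie") write is still in flight.

```python
class NamespaceNotFound(Exception):
    """Raised when a write targets a collection that does not exist."""


class ToyCatalog:
    """In-memory stand-in for a database catalog (hypothetical, not MongoDB code)."""

    def __init__(self):
        self.collections = {}

    def create_collection(self, name):
        # Explicit creation: the writer deliberately brings the collection into existence.
        self.collections.setdefault(name, [])

    def drop_collection(self, name):
        self.collections.pop(name, None)

    def insert(self, name, doc, allow_implicit_creation=False):
        if name not in self.collections:
            if not allow_implicit_creation:
                # With implicit creation disabled, a late write cannot resurrect the collection.
                raise NamespaceNotFound(name)
            self.collections[name] = []
        self.collections[name].append(doc)


def run_scenario(allow_implicit_creation):
    """Return True if the target collection exists after the zombie write."""
    catalog = ToyCatalog()
    catalog.create_collection("target")  # target is created explicitly up front
    catalog.insert("target", {"_id": 1}, allow_implicit_creation)
    catalog.drop_collection("target")    # target is dropped while a write is still pending
    try:
        catalog.insert("target", {"_id": 2}, allow_implicit_creation)  # the zombie write
    except NamespaceNotFound:
        pass  # the zombie write is rejected instead of re-creating the collection
    return "target" in catalog.collections
```

In this model, run_scenario(True) leaves the dropped collection re-created by the late write, while run_scenario(False) leaves it dropped.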



 Comments   
Comment by Charlie Swanson [ 22/Dec/22 ]

I'm sending this back to the backlog after failing to make time to work on this for several sprints in a row. If this is important I think we should schedule it again. I don't see this as very important personally if it's not actively causing a BF and I can no longer reproduce it. Setting to P4 accordingly.

Comment by Charlie Swanson [ 18/Aug/22 ]

I can no longer use the attached script to reproduce this. I'm guessing it's still an issue, but will need a more sophisticated approach to fix it.

Comment by Kyle Suarez [ 21/Jun/22 ]

Is this unblocked now?

Comment by Charlie Swanson [ 08/Nov/19 ]

Similar to SERVER-43851, we're postponing this work and waiting for SERVER-44252 to land.

Comment by Charlie Swanson [ 06/Sep/19 ]

Thanks janna.golden

Comment by Janna Golden [ 06/Sep/19 ]

Yeah, we changed BatchedCommandRequest's _allowImplicitCollectionCreation to default to false as part of SERVER-42430, which would affect $merge since it uses the ClusterWriter. We didn't backport this, though.

Comment by Charlie Swanson [ 06/Sep/19 ]

Adding this to a sprint just so I don't forget about it. It's historically been a BF Friday sort of task but that may change.

Comment by Charlie Swanson [ 06/Sep/19 ]

Copied over from SERVER-38852 the patch that I think will fix this and the supposed reproducer, which I can no longer get to work (even on that version of the code base).

janna.golden, this is related to some of your recent work to disallow implicit collection creation, correct? Do you know which ticket that was? I recall that this is no longer really a bug, but it will incur some bad performance characteristics that can be avoided. This is probably still a bug on 4.2, so I'll add that as an affectedVersion.

Generated at Thu Feb 08 05:02:31 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.