[SERVER-39332] Prevent test from dropping collections directly against a shard Created: 01/Feb/19  Updated: 29/Oct/23  Resolved: 25/Oct/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.3.1, 4.2.2

Type: Bug
Priority: Major - P3
Reporter: Janna Golden
Assignee: Blake Oler
Resolution: Fixed
Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
Related
related to SERVER-41764 Shard can be removed while an ongoing... Closed
is related to SERVER-44886 Remove and re-add shard test wait tim... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.2
Sprint: Sharding 2019-10-21, Sharding 2019-11-04
Participants:
Linked BF Score: 19

 Description   

Problem Statement

Dropping a collection directly on a shard in a sharded cluster is not a supported action. If a collection is dropped directly on a shard while a migration for that collection is in progress, issues such as the following can arise:

If a collection or database is dropped from the recipient shard after it finishes cloning from the donor shard, the migration may still commit. The config server will then still record the recipient shard as the owner of the migrated chunk, but the shard itself will not know about the collection at all.

Fix

Change the linked test to drop the collection against the mongos instead of the shard.
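
A minimal sketch of what the fix amounts to, assuming the mongo shell test environment. The helper name dropCollectionViaMongos is hypothetical and not taken from the actual test; getDB, getCollection, and drop are the standard shell calls. The only point is that the drop is issued through the mongos connection (st.s in a ShardingTest) rather than through a direct shard connection.

```javascript
// Hypothetical helper (not part of the actual test): issue the drop through
// the mongos router so the cluster's metadata is updated along with the data.
function dropCollectionViaMongos(mongosConn, dbName, collName) {
    // mongosConn is assumed to be a connection to mongos, e.g. st.s in a
    // ShardingTest, never a direct shard connection such as st.shard0.
    return mongosConn.getDB(dbName).getCollection(collName).drop();
}
```

In a test this might be called as dropCollectionViaMongos(st.s, "test", "foo"), where the database and collection names are placeholders.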



 Comments   
Comment by Githook User [ 03/Dec/19 ]

Author:

{'email': 'william.schultz@mongodb.com', 'name': 'William Schultz', 'username': 'will62794'}

Message: SERVER-39332 Increase awaitReplicaSetMonitorTimeout sleep to 60 seconds
Branch: master
https://github.com/mongodb/mongo/commit/d3b08f2dc93636a04f92d5448fdacfd447704607

Comment by Githook User [ 28/Oct/19 ]

Author:

{'username': 'BlakeIsBlake', 'email': 'blake.oler@mongodb.com', 'name': 'Blake Oler'}

Message: SERVER-39332 Change reliance on balancer to use manual migrations in remove2.js (and make other touchups)

(cherry picked from commit 06e28da905b16eac09fa62b098d910a8f623b9ba)
Branch: v4.2
https://github.com/mongodb/mongo/commit/6202336a3a861960ba68e0fa64430aacaf65a27c

Comment by Githook User [ 25/Oct/19 ]

Author:

{'username': 'BlakeIsBlake', 'email': 'blake.oler@mongodb.com', 'name': 'Blake Oler'}

Message: SERVER-39332 Change reliance on balancer to use manual migrations in remove2.js (and make other touchups)
Branch: master
https://github.com/mongodb/mongo/commit/06e28da905b16eac09fa62b098d910a8f623b9ba

Comment by Blake Oler [ 29/Jul/19 ]

kaloian.manassiev The situation is outlined by janna.golden in the comments for BF-11779. The ability to remove a shard while a migration is happening is what allows this to happen.

Comment by Kaloian Manassiev [ 29/Jul/19 ]

From reading the test, I believe that the balancer is on so that chunks can leave the shard that is being removed. I am having difficulty seeing where SERVER-41764 manifests here, so my first thought is that disabling the balancer and doing manual migrations will not help.

Comment by Blake Oler [ 01/Jul/19 ]

It seems like the issue here is actually the undefined interaction between removeShard and migrations – the related ticket SERVER-41764 supports this theory. Apart from that, is it necessary to have the balancer on during this test? If we did manual migrations, that would fix the issue here. It doesn't seem like the use of the balancer is adding any true value. Thoughts kaloian.manassiev?
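
For illustration only, the manual-migration idea in the comment above could be sketched as follows, assuming the mongo shell test environment. The helper name moveChunkManually and the namespace, key, and shard used in the usage note are hypothetical; moveChunk with find and to is the standard admin command issued through mongos.

```javascript
// Hypothetical helper: move a single chunk explicitly instead of waiting for
// the balancer to pick it up. Returns the raw command response.
function moveChunkManually(mongosConn, ns, findDoc, toShardName) {
    // moveChunk is an admin command and is sent through mongos.
    // findDoc selects the chunk that contains the matching shard key value.
    return mongosConn.adminCommand({moveChunk: ns, find: findDoc, to: toShardName});
}
```

With the balancer stopped (st.stopBalancer()), a test might call assert.commandWorked(moveChunkManually(st.s, "test.foo", {_id: 0}, st.shard1.shardName)), where the namespace and key are placeholders.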

Comment by Kaloian Manassiev [ 15/Feb/19 ]

Whoever picks this up: this test is performing the unsupported action of dropping collections directly against the shard, so the test should be fixed instead.

Generated at Thu Feb 08 04:51:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.