[SERVER-44051] getShardDistribution() does not report "Collection XYZ is not sharded" on dropped but previously sharded collections Created: 16/Oct/19  Updated: 29/Oct/23  Resolved: 14/Apr/20

Status: Closed
Project: Core Server
Component/s: Shell
Affects Version/s: None
Fix Version/s: 4.0.20, 4.2.8, 4.4.0-rc10, 4.2.9, 4.7.0

Type: Bug Priority: Major - P3
Reporter: James Kovacs Assignee: Kevin Pulo
Resolution: Fixed Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Problem/Incident
Related
related to SERVER-43689 getShardDistribution() incorrectly sh... Closed
related to SERVER-44891 collStats will fail if resulting BSON... Closed
related to SERVER-44892 getShardDistribution should use $coll... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.4, v4.2, v4.0
Sprint: Sharding 2019-10-21, Sharding 2019-11-04, Sharding 2019-11-18, Sharding 2020-04-20
Participants:
Case:
Linked BF Score: 0

 Description   

Once a sharded collection has been successfully dropped, operations on it should behave exactly as if it had never existed, or as if it were about to be implicitly re-created as a new unsharded collection. However, after a sharded collection has been dropped, db.coll.getShardDistribution() fails with the error "Unable to retrieve storageStats in $collStats stage: Collection [test.test] not found." instead of printing the expected message "Collection test.test is not sharded." (the message printed when the collection never existed, or existed only as an unsharded collection).


Original summary: getShardDistribution() erroneously reports "Collection XYZ is not sharded"
Original description:
If the collStats result exceeds the maxBSONSize of 16MB because of a large amount of shard metadata, getShardDistribution() erroneously reports that the collection is not sharded. This happens because the function does not check for an error in the result of the collStats command:

> db.COLL.getShardDistribution
function () {
 
    var stats = this.stats();
 
    if (!stats.sharded) {
        print("Collection " + this + " is not sharded.");
        return;
    }
 
    ... logic to print shard stats ...

The stats.sharded property doesn't exist if stats contains an error message, which is why getShardDistribution() reports a collection as unsharded if its shard metadata causes the collStats command to fail.
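
A minimal sketch of the missing check, written as plain JavaScript so that it can be run outside the shell. The helper name classifyCollStats and its return shape are hypothetical and are not part of the shell or of the eventual fix; the sketch only assumes that a failed command reply carries "ok" : 0 and an "errmsg" field, as in the failure output quoted elsewhere in this ticket.

```javascript
// Hypothetical helper (not shell code): classify a collStats-style reply.
// `stats` is the plain object returned by the command.
function classifyCollStats(stats) {
    // A failed reply has ok: 0 and no `sharded` field, so the error must be
    // ruled out before `sharded` is inspected.
    if (!stats || stats.ok !== 1) {
        return {state: "error", message: (stats && stats.errmsg) || "unknown error"};
    }
    // Only a successful reply can be trusted to say whether the collection is sharded.
    return {state: stats.sharded ? "sharded" : "unsharded"};
}

// Illustrative inputs only; the errmsg text here is made up.
const failedReply = {ok: 0, errmsg: "example failure"};
const unshardedReply = {ok: 1, sharded: false};
const shardedReply = {ok: 1, sharded: true};

console.log(classifyCollStats(failedReply).state);
console.log(classifyCollStats(unshardedReply).state);
console.log(classifyCollStats(shardedReply).state);
```

With a check of this kind, a failed collStats would surface as an error rather than being reported as "Collection ... is not sharded".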



 Comments   
Comment by Githook User [ 11/Jun/20 ]

Author:

{'name': 'Kevin Pulo', 'email': 'kevin.pulo@mongodb.com', 'username': 'devkev'}

Message: SERVER-44051 ensure getShardDistribution correctly checks for sharded collections

(cherry picked from commit f32f2f906f8c37145ed2bf64cd8db99d35671a41)
Branch: v4.2
https://github.com/mongodb/mongo/commit/bc1969c4befb4b695bd6e5ce61736c77e3bf3f71

Comment by Githook User [ 11/Jun/20 ]

Author:

{'name': 'Kevin Pulo', 'email': 'kevin.pulo@mongodb.com', 'username': 'devkev'}

Message: SERVER-44051 ensure getShardDistribution correctly checks for sharded collections

(cherry picked from commit f32f2f906f8c37145ed2bf64cd8db99d35671a41)
Branch: v4.0
https://github.com/mongodb/mongo/commit/930dff22da4d089866502ca0337dbe356ec96b3e

Comment by Githook User [ 11/Jun/20 ]

Author:

{'name': 'Kevin Pulo', 'email': 'kevin.pulo@mongodb.com', 'username': 'devkev'}

Message: SERVER-44051 ensure getShardDistribution correctly checks for sharded collections

(cherry picked from commit f32f2f906f8c37145ed2bf64cd8db99d35671a41)
Branch: v4.4
https://github.com/mongodb/mongo/commit/3abfe3f8aed2b6e15ecb7c80bc77db0449d5bea5

Comment by Githook User [ 14/Apr/20 ]

Author:

{'name': 'Kevin Pulo', 'email': 'kevin.pulo@mongodb.com', 'username': 'devkev'}

Message: SERVER-44051 ensure getShardDistribution correctly checks for sharded collections
Branch: master
https://github.com/mongodb/mongo/commit/f32f2f906f8c37145ed2bf64cd8db99d35671a41

Comment by Kevin Pulo [ 09/Apr/20 ]

This was almost completely handled by SERVER-44892. That change has a small bug that causes getShardDistribution() to throw an error (instead of reporting "Collection is not sharded") on dropped sharded collections. (By contrast, dropped unsharded collections continue to correctly report "Collection is not sharded".) This is because _isSharded() checks only that an entry exists in config.collections, without also checking that the entry's dropped field is not true. So I will repurpose this ticket to fix that bug.

> sh.enableSharding("test")
{
        "ok" : 1,
        "operationTime" : Timestamp(1586409582, 6),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1586409582, 6),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
> db.test._isSharded()
false
> db.test2._isSharded()
false
> db.createCollection("test")
{
        "ok" : 1,
        "operationTime" : Timestamp(1586409606, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1586409606, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
> db.createCollection("test2")
{
        "ok" : 1,
        "operationTime" : Timestamp(1586409608, 1),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1586409608, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
> db.test._isSharded()
false
> db.test2._isSharded()
false
> db.test.getShardDistribution()
Collection test.test is not sharded.
> db.test2.getShardDistribution()
Collection test.test2 is not sharded.
> sh.shardCollection("test.test", {_id: 1})
{
        "collectionsharded" : "test.test",
        "collectionUUID" : UUID("59f3eb78-7e2c-4ed1-a9cd-0c039b47132b"),
        "ok" : 1,
        "operationTime" : Timestamp(1586409631, 8),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1586409631, 8),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
}
> db.test._isSharded()
true
> db.test.drop()
true
> db.test._isSharded()
true
> db.test2.drop()
true
> db.test2._isSharded()
false
> db.test.getShardDistribution()
uncaught exception: Error: command failed: {
        "ok" : 0,
        "errmsg" : "Unable to retrieve storageStats in $collStats stage: Collection [test.test] not found.",
        "code" : 40280,
        "codeName" : "Location40280",
        "operationTime" : Timestamp(1586409676, 3),
        "$clusterTime" : {
                "clusterTime" : Timestamp(1586409690, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        }
} : aggregate failed :
_getErrorWithCode@src/mongo/shell/utils.js:25:13
doassert@src/mongo/shell/assert.js:18:14
_assertCommandWorked@src/mongo/shell/assert.js:644:17
assert.commandWorked@src/mongo/shell/assert.js:734:16
DB.prototype._runAggregate@src/mongo/shell/db.js:266:5
DBCollection.prototype.aggregate@src/mongo/shell/collection.js:1012:12
DBCollection.prototype.getShardDistribution@src/mongo/shell/collection.js:1096:21
@(shell):1:1
> db.test2.getShardDistribution()
Collection test.test2 is not sharded.
> 
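
The session above shows db.test._isSharded() still returning true after the drop. A minimal sketch of the stricter check described in this comment, written as plain JavaScript: the function name isShardedEntry is hypothetical, and the sample documents are simplified stand-ins for config.collections entries (only the _id and dropped fields are modelled). This is not the committed fix, just an illustration of the condition.

```javascript
// Hypothetical helper (not shell code): decide whether a config.collections
// entry represents a currently sharded collection.
// `entry` is the matching document, or null when no entry exists.
function isShardedEntry(entry) {
    // No entry at all: the collection was never sharded.
    if (entry === null || entry === undefined) {
        return false;
    }
    // An entry marked dropped: true belongs to a dropped sharded collection
    // and must not count as sharded.
    return entry.dropped !== true;
}

// Simplified stand-ins for config.collections documents.
const neverSharded = null;
const liveEntry = {_id: "test.test", dropped: false};
const droppedEntry = {_id: "test.test", dropped: true};

console.log(isShardedEntry(neverSharded));
console.log(isShardedEntry(liveEntry));
console.log(isShardedEntry(droppedEntry));
```

In the shell, the same condition could presumably be expressed as a query filter on config.collections such as {_id: "test.test", dropped: {$ne: true}}; whether the committed change takes that form is not stated in this ticket.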

Comment by Randolph Tan [ 22/Oct/19 ]

The collStats command is versioned, so a shard should throw an error if it knows that the collection is in fact sharded when mongos thinks otherwise. However, because of SERVER-32198, the shard can mistakenly believe that a collection is unsharded.
