[SERVER-33276] Creation of already existing collection on a sharded cluster should be an error Created: 12/Feb/18  Updated: 27/Oct/23  Resolved: 12/Feb/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: Jeffrey Yemin Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-60064 Make create command idempotent on mongod Closed
is related to PYTHON-1936 Remove listCollections check from Dat... Closed
Assigned Teams:
Sharding
Operating System: ALL
Participants:

 Description   

For all cluster types (standalone, repl, sharded), attempting to create an already-existing collection should result in an error, as with this test against a 3.6 server:

mongos> db.runCommand({create : "foo"})
{
	"ok" : 1,
	...
}
mongos> db.runCommand({create : "foo"})
{
	"ok" : 0,
	"errmsg" : "a collection 'test.foo' already exists",
	"code" : 48,
	"codeName" : "NamespaceExists",
	...
}

But with a sharded cluster running on the nightly build it no longer does:

mongos> db.runCommand({create : "foo"})
{
	"ok" : 1,
	...
}
mongos> db.runCommand({create : "foo"})
{
	"ok" : 1,
	...
}

The behavioral difference was first detected in this driver regression test. The server build in that test run is 3.7.1-280-g43fbd6a. The previous successful run of that test used server build 3.7.1-253-g2e1f172bc1.
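
The check that the driver test performs can be sketched as a small shell-style helper. This is only a sketch under assumptions: `secondCreateFails` is a hypothetical name, and `db` is assumed to be any object whose `runCommand` returns reply documents shaped like the ones shown above.

```javascript
// Hypothetical helper: run `create` twice and report whether the second
// attempt is rejected with NamespaceExists (code 48), as in the 3.6 output.
function secondCreateFails(db, collName) {
    var first = db.runCommand({create: collName});
    if (first.ok !== 1) {
        throw new Error("first create failed: " + first.errmsg);
    }
    var second = db.runCommand({create: collName});
    return second.ok === 0 && second.code === 48;
}
```

Against the 3.6 behavior shown above the helper would return true; against the nightly sharded cluster it would return false.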



 Comments   
Comment by Shane Harvey [ 05/Mar/20 ]

I see that the second create will succeed even if the collection is not empty. This could cause issues for applications that may be expecting collection creation to fail for a non-empty collection.

I also would worry about bugs caused by this subtle BC break. A successful create command used to mean that the collection was created and empty. It could be surprising for users to discover that the collection might not be empty on sharded clusters.

once a request in mongos loses the connection to config server or shard, it is ambiguous whether the collection that exists after it re-establishes connection was created by it or someone else. In other words, it can happen both ways: if we preserve the old behavior, it can return "collection already exists" error even though the request created it originally.

Could mongos create the collection in a transaction to fix this issue (and regain NamespaceExists error consistency with replica sets)?
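
An application that relies on the old guarantee could approximate it on the client side by checking for the collection before creating it. A minimal sketch, assuming a shell-like `db` that provides `getCollectionInfos` and `runCommand`; `createIfAbsent` is a hypothetical name, and the error reply is synthesized by the client rather than returned by the server. The check is not atomic, so another client can still create the collection between the two calls.

```javascript
// Hypothetical client-side guard: refuse to create a collection that is
// already listed. Not atomic: a concurrent create can still race with it.
function createIfAbsent(db, collName) {
    var existing = db.getCollectionInfos({name: collName});
    if (existing.length > 0) {
        // Client-built reply that mimics the shape of NamespaceExists.
        return {
            ok: 0,
            code: 48,
            codeName: "NamespaceExists",
            errmsg: "collection already exists: " + collName
        };
    }
    return db.runCommand({create: collName});
}
```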

Comment by Jeffrey Yemin [ 13/Feb/18 ]

Added the last comment as the description of SERVER-33297

Comment by Jeffrey Yemin [ 13/Feb/18 ]

Looks like there's more going on here than I thought. After disabling the first test, the next test suite caught another behavioral change. Here's a shell repro:

MongoDB Enterprise > db.runCommand({create : "cappedColl", capped : true})
{
	"ok" : 1,
	"$clusterTime" : {
		"clusterTime" : Timestamp(1518546909, 1),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	},
	"operationTime" : Timestamp(1518546901, 2)
}
MongoDB Enterprise > db.getCollectionInfos()
[
	{
		"name" : "cappedColl",
		"type" : "collection",
		"options" : {
			"capped" : true,
			"size" : 0
		},
		"info" : {
			"readOnly" : false,
			"uuid" : UUID("ec90615e-a802-4529-a83f-e1e8163503a6")
		},
		"idIndex" : {
			"v" : 2,
			"key" : {
				"_id" : 1
			},
			"name" : "_id_",
			"ns" : "JavaDriverTest.cappedColl"
		}
	}
]

The current behavior in 3.6 (and the behavior of replica sets on master) is that creating a capped collection without a size specified is an error:

MongoDB Enterprise > db.runCommand({create : "cappedColl", capped : true})
{
	"ok" : 0,
	"errmsg" : "specify size:<n> when capped is true",
	"code" : 14832,
	"codeName" : "Location14832"
}
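
The rule that 3.6 enforces can be mirrored in a client-side check. This is a sketch only, not the server's validation: `checkCappedOptions` is a hypothetical name, the reply it builds simply copies the fields from the 3.6 output above, and it assumes `size` is passed as a plain JavaScript number.

```javascript
// Hypothetical pre-flight check: returns an error-shaped document when
// `capped: true` is given without a numeric `size`, otherwise null.
function checkCappedOptions(cmd) {
    if (cmd.capped === true && typeof cmd.size !== "number") {
        return {
            ok: 0,
            errmsg: "specify size:<n> when capped is true",
            code: 14832,
            codeName: "Location14832"
        };
    }
    return null;
}
```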

Comment by Randolph Tan [ 13/Feb/18 ]

jeff.yemin, once a request in mongos loses the connection to config server or shard, it is ambiguous whether the collection that exists after it re-establishes connection was created by it or someone else. In other words, it can happen both ways: if we preserve the old behavior, it can return "collection already exists" error even though the request created it originally.

alyson.cabral What do you think about this?

Comment by Jeffrey Yemin [ 12/Feb/18 ]

One more thought about this. I see that the second create will succeed even if the collection is not empty. This could cause issues for applications that may be expecting collection creation to fail for a non-empty collection.

Comment by Jeffrey Yemin [ 12/Feb/18 ]

Feel free to close this then, and I'll disable the driver test that caught the change.

Comment by Randolph Tan [ 12/Feb/18 ]

This is by design: it allows mongos to retry the create command when a primary steps down, without returning an error to the user if the create had already succeeded. It should still return an error, though, if you try to create a namespace with options that differ from those of the existing collection.
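
The semantics described here can be modeled roughly as follows. This is an illustrative sketch, not the server's implementation: `catalog` is a plain object standing in for the cluster's collection metadata, `idempotentCreate` is a hypothetical name, and reusing code 48 for the options mismatch is an assumption.

```javascript
// Illustrative model: creating an existing collection with the same options
// succeeds (so a retry is harmless); different options produce an error.
function idempotentCreate(catalog, collName, options) {
    // Serialized options are compared as strings, so key order matters here.
    var requested = JSON.stringify(options || {});
    var exists = Object.prototype.hasOwnProperty.call(catalog, collName);
    if (!exists) {
        catalog[collName] = requested;
        return {ok: 1};
    }
    if (catalog[collName] === requested) {
        return {ok: 1};
    }
    return {
        ok: 0,
        code: 48, // assumed: same code as the 3.6 NamespaceExists reply
        codeName: "NamespaceExists",
        errmsg: "collection exists with different options: " + collName
    };
}
```

Under this model a retried create after a step-down returns the same reply as the original attempt, which is the property the retry logic needs.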
