[SERVER-60064] Make create command idempotent on mongod Created: 17/Sep/21  Updated: 29/Oct/23  Resolved: 16/Feb/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.0.0-rc0

Type: Bug Priority: Major - P3
Reporter: Shane Harvey Assignee: Kaitlin Mahar
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File SERVER-60064 Sharded.txt     Text File SERVER-60064 Standalone.txt    
Issue Links:
Backports
Depends
is depended on by SERVER-73967 Update handling of create command in ... Blocked
is depended on by PYTHON-1936 Remove listCollections check from Dat... Closed
is depended on by COMPASS-6526 Investigate changes in SERVER-60064: ... Closed
Documented
is documented by DOCS-15906 [SERVER] Create on an existing collec... Closed
Duplicate
is duplicated by SERVER-60933 Make the sharded cluster's 'create' c... Closed
Problem/Incident
causes SERVER-74330 Prevent nullptr access in checkCollec... Closed
Related
related to SERVER-32550 Drop of a non-existing collection on ... Closed
related to SERVER-73934 Remove commandWorkedOrFailedWithCode(... Closed
related to SERVER-33276 Creation of already existing collecti... Closed
is related to SERVER-76547 Create command on a time-series colle... Closed
is related to SERVER-74062 Remove test checks for DB version >= ... Closed
is related to SERVER-82074 Refactor collection creation idempote... Backlog
Assigned Teams:
Storage Execution
Backwards Compatibility: Minor Change
Backport Requested:
v5.0
Sprint: Execution Team 2021-11-01, Execution Team 2021-11-15, Sharding EMEA 2021-12-13, Sharding EMEA 2021-12-27, Sharding EMEA 2022-01-10, Execution Team 2023-02-20
Participants:
Linked BF Score: 155

 Description   

Previously, mongod would return an error if you attempted to create a collection or view that already existed. However, mongos would report success if the collection/view already existed with the same exact options you were attempting to create it with. The mongod behavior now matches the mongos behavior, i.e. on both mongod and mongos the create command is idempotent and will report success if an identical collection/view already exists, meaning it is safe to re-run the command even if it may have succeeded previously.

Original ticket description:

Related to SERVER-33276 and PYTHON-1936. Starting in MongoDB 4.0, the create command does not return an error when the collection already exists on sharded clusters:

>>> client.server_info()['version']
'4.2.3'
>>> client.is_mongos
True
>>> client.test.command('create', 'test', check=False)
{'ok': 1.0}
>>> client.test.command('create', 'test', check=False)
{'ok': 1.0}

On replica sets and standalones the second create fails with error code 48:

>>> client.server_info()['version']
'4.2.3'
>>> client.is_mongos
False
>>> client.test.command('create', 'test', check=False)
{'ok': 1.0}
>>> client.test.command('create', 'test', check=False)
{'ok': 0.0, 'errmsg': "a collection 'test.test' already exists", 'code': 48, 'codeName': 'NamespaceExists'}

We should make mongod have the same behavior as mongos for consistency across deployments.

Another benefit of this change is that it would allow drivers to retry the create command like mongos does.
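
Until mongod behaves like mongos, a client that wants the same outcome on both deployments has to absorb the difference itself. The following is a minimal sketch, not driver code: `ensure_collection` and its `run_command` parameter are hypothetical names, and `run_command` is assumed to return the raw reply document, as `client.test.command(..., check=False)` does in the sessions above. Note that code 48 by itself does not say whether the existing collection has the same options.

```python
NAMESPACE_EXISTS = 48  # error code shown in the mongod reply above


def ensure_collection(run_command, name):
    """Run `create` and tolerate an "already exists" reply.

    `run_command` is a hypothetical callable taking (command_name, value)
    and returning the raw server reply as a dict.

    Returns True if the server replied ok, False if it replied with
    NamespaceExists. Any other failure raises RuntimeError.
    """
    reply = run_command('create', name)
    if reply.get('ok') == 1.0:
        # Created just now, or the server treats create as idempotent.
        return True
    if reply.get('code') == NAMESPACE_EXISTS:
        # Older mongod behavior: the collection was already there.
        return False
    raise RuntimeError(reply.get('errmsg', 'create failed'))
```

With PyMongo, the callable could be something like `lambda cmd, value: client.test.command(cmd, value, check=False)`; that wiring is an assumption for illustration only.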



 Comments   
Comment by Dianna Hohensee (Inactive) [ 26/Apr/23 ]

I've filed SERVER-76547, to make create command on time-series collections idempotent, as a result of investigating SERVER-73967 to remove NamespaceExists error handling in our test infrastructure.

I expect that the Drivers must be able to handle both NamespaceExists and the lack thereof from create commands, since they must be backwards compatible. So I don't think there's any further work for them because of this.

Comment by Eric Milkie [ 05/Apr/23 ]

Since this isn't particularly visible, I've copied the Downstream Changes field here in a comment:

The create command will now report success on mongod if a collection/view with an identical namespace and options already exists. Note that this was already the behavior on mongos as of MongoDB 4.0.

Comment by Githook User [ 14/Mar/23 ]

Author:

{'name': 'Kyle Kloberdanz', 'email': 'kyle.kloberdanz@mongodb.com', 'username': 'kkloberdanz'}

Message: Server update caused a break in waterfall (#934)

Recently, the server changed behavior where creating a collection that
already exists will not trigger an exception. Instead, replace this test
with a check that the collection was indeed created. See Jira ticket
below.

Jira: https://jira.mongodb.org/browse/SERVER-60064
Branch: releases/stable
https://github.com/mongodb/mongo-cxx-driver/commit/c2e6eb42c626620b43ea308db2e21e6888bfa94e

Comment by Githook User [ 07/Mar/23 ]

Author:

{'name': 'Kyle Kloberdanz', 'email': 'kyle.kloberdanz@mongodb.com', 'username': 'kkloberdanz'}

Message: Server update caused a break in waterfall (#934)

Recently, the server changed behavior where creating a collection that
already exists will not trigger an exception. Instead, replace this test
with a check that the collection was indeed created. See Jira ticket
below.

Jira: https://jira.mongodb.org/browse/SERVER-60064
Branch: releases/v3.7
https://github.com/mongodb/mongo-cxx-driver/commit/c2e6eb42c626620b43ea308db2e21e6888bfa94e

Comment by Githook User [ 22/Feb/23 ]

Author:

{'name': 'Kyle Kloberdanz', 'email': 'kyle.kloberdanz@mongodb.com', 'username': 'kkloberdanz'}

Message: Server update caused a break in waterfall (#934)

Recently, the server changed behavior where creating a collection that
already exists will not trigger an exception. Instead, replace this test
with a check that the collection was indeed created. See Jira ticket
below.

Jira: https://jira.mongodb.org/browse/SERVER-60064
Branch: master
https://github.com/mongodb/mongo-cxx-driver/commit/c1de68ec55722b8c32bb95fa0baafa68e4b49042

Comment by Githook User [ 16/Feb/23 ]

Author:

{'name': 'Kaitlin Mahar', 'email': 'kaitlin.mahar@mongodb.com', 'username': 'kmahar'}

Message: SERVER-60064 Make create command idempotent on mongod
Branch: master
https://github.com/mongodb/mongo/commit/ddf2bfb6a3e3b1a17723e77166690eb6f00ad36d

Comment by Tommaso Tocci [ 22/Jul/22 ]

michael.gargiulo@mongodb.com we have a long chain of tickets that are blocked on this one. Is there any chance we can speed this up?

Comment by Cris Insignares Cuello [ 26/May/22 ]

michael.gargiulo@mongodb.com Do you have any updates on this ticket?

Comment by Kaloian Manassiev [ 14/Jan/22 ]

I agree with antonio.fuschetto's reasoning that we should make the create's behaviour in standalone/replicaset match that of sharding and not fail if the collection already exists (with the same options).

Since the change is on the standalone/replicaset side, I am moving this ticket to the Storage Execution's backlog.

Comment by Antonio Fuschetto [ 12/Jan/22 ]

I conducted some experiments to figure out what the current situation is, using different product versions and deployments. You can see the detailed results in the attached SERVER-60064 Standalone.txt and SERVER-60064 Sharded.txt files, but the current behavior can be summarized as follows:

| Use case | Standalone / Replica set | Sharded cluster |
| Recreation of a collection that already exists | { "ok" : 0, "errmsg" : "Collection already exists. NS: mydb.coll1", ... } | { "ok" : 1, ... } |
| Recreation of an index that already exists | { "ok" : 1, "note" : "all indexes already exist", ... } | { "ok" : 1, "raw.note" : "all indexes already exist", ... } |

Additional notes:

  • The behavior of the command does not change whether the existing collection is empty or not.
  • The command fails (as expected) if the existing collection has been created with different options (i.e. {"ok" : 0, "errmsg" : "ns: db1.coll1 already exists with different options: {}", …}).
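
The sharded-cluster behavior summarized above (success when the options match, failure when they differ) can be modeled in a few lines. This is only an illustrative sketch: `create` and `catalog` are hypothetical names, the in-memory dict stands in for the server's catalog, and the use of code 48 (NamespaceExists) for the mismatch case is an assumption, since the output quoted above elides the code.

```python
NAMESPACE_EXISTS = 48  # assumed code for the "different options" failure


def create(catalog, ns, options=None):
    """Toy model of an idempotent `create`.

    `catalog` is a plain dict mapping namespace -> options dict.
    """
    options = options or {}
    if ns not in catalog:
        # First creation: record the namespace with its options.
        catalog[ns] = dict(options)
        return {'ok': 1.0}
    if catalog[ns] == options:
        # An identical collection already exists: succeed, change nothing.
        return {'ok': 1.0}
    # Same namespace, different options: this is a genuine conflict.
    return {
        'ok': 0.0,
        'code': NAMESPACE_EXISTS,
        'codeName': 'NamespaceExists',
        'errmsg': 'ns: %s already exists with different options: %r'
                  % (ns, catalog[ns]),
    }
```

The model does not track documents, which matches the first note: whether the existing collection is empty plays no part in the decision.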

From my point of view, every command should be idempotent, meaning that it can be run several times without changing the final result, both in terms of changed data and results for the caller.

If the user has proper permissions, the parameters are syntactically and semantically correct, etc., the command should succeed even if the data is already as requested. This implies that the command should return "ok: 1" and, if really necessary, include a "note" field such as "Collection already exists" (as happens today when recreating existing indexes).

That is, the behavior of mongos (starting with version 4.0) seems reasonable to me, as it is in line with what I said above. I believe we should instead change the behavior of mongod, so that users have the same experience on standalone and replica set deployments.

Technical note: Idempotence is particularly relevant in distributed systems, because a call to an API may fail or, worse, time out without the service even sending an error response. In these cases, if the service is idempotent, the client (user application or driver) can simply call it again any number of times without fear that the repeated calls will have negative effects.
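
As a rough illustration of that point, a client-side retry loop might look like the sketch below. `retry_idempotent` is a hypothetical helper, not part of any driver, and the built-in `TimeoutError` stands in for whatever timeout exception a real client would raise.

```python
import time


def retry_idempotent(operation, attempts=3, delay_seconds=0.0):
    """Call `operation` until it returns, retrying on TimeoutError.

    Only safe when `operation` is idempotent: an attempt that timed out
    may still have been applied by the server.
    """
    if attempts < 1:
        raise ValueError('attempts must be at least 1')
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except TimeoutError as exc:
            last_error = exc
            if attempt + 1 < attempts:
                # Wait before the next attempt (no wait after the last one).
                time.sleep(delay_seconds)
    raise last_error
```

If the first attempt creates the collection but its reply is lost, the second attempt only succeeds when create is idempotent; with the old mongod behavior it would come back as NamespaceExists instead.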

Comment by Pavithra Vetriselvan [ 01/Nov/21 ]

Got it, then I'm going to assign this back to the Sharding Team since we don't want to change the default behavior in mongod.

max.hirschhorn Do you have any opinions on existing collection errors being guarded by a flag?

Comment by Shane Harvey [ 29/Oct/21 ]

If the collection is non-empty, does the create command still succeed for sharded clusters?

Yes. I agree that returning an error makes the most sense. The drivers team was confused when the sharded behavior changed back in 4.0 (see the discussion in SERVER-33276).

Comment by Pavithra Vetriselvan [ 29/Oct/21 ]

Execution believes that it is correct for mongod to return an error if the collection already exists.

If the collection is non-empty, does the create command still succeed for sharded clusters? That seems a little confusing to me, since I would expect the collection to be empty upon an explicit create command (vs implicit creation with an insert or something).

The flag doesn't seem like a bad option if it means that Drivers can opt into consistent behavior across sharded and unsharded clusters.

Comment by Shane Harvey [ 25/Oct/21 ]

The ask is to make the behavior consistent between mongod and mongos. In 3.6 and earlier, I believe mongos and mongod both returned an error on existing collections. Starting in 4.0 (SERVER-33276) mongos was changed to not return an error.

Comment by Pavithra Vetriselvan [ 22/Oct/21 ]

Is the ask here to make mongod not error if there is an existing collection with the same options? It feels like someone from Product should weigh in on this decision since I imagine it'd be a significant backwards breaking change. CC: michael.gargiulo Not sure if you have context here or know someone else who does?

Also, the create command is a part of API Version 1. Wouldn't we need to break API to introduce a different command response in either case (i.e. changing the mongos response vs changing the mongod response)? Even if we guarded this change with a flag like allowExisting, that flag would have to be unstable and can't be used with apiStrict: true.

Comment by Max Hirschhorn [ 08/Oct/21 ]

The Sharding NYC team believes it is useful for mongos to not return an error. Sending this over to the Storage Execution team to see if we'd want to change the mongod behavior instead.

Generated at Thu Feb 08 05:48:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.