[SERVER-76547] Create command on a time-series collection is not idempotent Created: 26/Apr/23  Updated: 29/Oct/23  Resolved: 24/Aug/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.1.0-rc0

Type: Bug Priority: Major - P3
Reporter: Dianna Hohensee (Inactive) Assignee: Gregory Noma
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-73967 Update handling of create command in ... Blocked
is depended on by COMPASS-7143 Investigate changes in SERVER-76547: ... Needs Triage
Documented
is documented by DOCS-16345 [SERVER] Investigate changes in SERVE... Closed
Duplicate
is duplicated by SERVER-80107 Failovers can break time-series creation Closed
Problem/Incident
causes SERVER-80776 create fails on already-sharded time-... Closed
Related
related to SERVER-60064 Make create command idempotent on mongod Closed
is related to SERVER-80362 Always test for idempotency in time-s... Blocked
Assigned Teams:
Storage Execution
Backwards Compatibility: Minor Change
Sprint: Execution NAMR Team 2023-09-04
Participants:

 Description   

This came out of investigating SERVER-73967, while trying to remove the NamespaceExists retry handling in the test infrastructure. The create command was made mostly idempotent in SERVER-60064, but that change did not cover time-series collections.

--------------------------------------------------------------------------

Relevant test failure details follow. Note that the original command is a create for a time-series collection, and that the error is raised because the namespace already exists as a time-series view on a buckets collection. The error handling does not seem to recognize that the existing namespace is a time-series collection rather than an ordinary view.

[js_test:timeseries_metric_index_compound] 2023-04-21T18:16:15.043Z assert: command failed: {
[js_test:timeseries_metric_index_compound] 	"ok" : 0,
[js_test:timeseries_metric_index_compound] 	"errmsg" : "namespace test.timeseries_metric_index_compound already exists, but is a view on test.system.buckets.timeseries_metric_index_compound rather than test",
[js_test:timeseries_metric_index_compound] 	"code" : 48,
[js_test:timeseries_metric_index_compound] 	"codeName" : "NamespaceExists",
[js_test:timeseries_metric_index_compound] 	"$clusterTime" : {
[js_test:timeseries_metric_index_compound] 		"clusterTime" : Timestamp(1682100974, 67),
[js_test:timeseries_metric_index_compound] 		"signature" : {
[js_test:timeseries_metric_index_compound] 			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
[js_test:timeseries_metric_index_compound] 			"keyId" : NumberLong(0)
[js_test:timeseries_metric_index_compound] 		}
[js_test:timeseries_metric_index_compound] 	},
[js_test:timeseries_metric_index_compound] 	"operationTime" : Timestamp(1682100974, 67)
[js_test:timeseries_metric_index_compound] } with original command request: {
[js_test:timeseries_metric_index_compound] 	"create" : "timeseries_metric_index_compound",
[js_test:timeseries_metric_index_compound] 	"timeseries" : {
[js_test:timeseries_metric_index_compound] 		"timeField" : "tm",
[js_test:timeseries_metric_index_compound] 		"metaField" : "mm"
[js_test:timeseries_metric_index_compound] 	},
[js_test:timeseries_metric_index_compound] 	"lsid" : {
[js_test:timeseries_metric_index_compound] 		"id" : UUID("4748b47a-6455-4017-a402-59816d1b6ffc")
[js_test:timeseries_metric_index_compound] 	},
[js_test:timeseries_metric_index_compound] 	"$clusterTime" : {
[js_test:timeseries_metric_index_compound] 		"clusterTime" : Timestamp(1682100974, 62),
[js_test:timeseries_metric_index_compound] 		"signature" : {
[js_test:timeseries_metric_index_compound] 			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
[js_test:timeseries_metric_index_compound] 			"keyId" : NumberLong(0)
[js_test:timeseries_metric_index_compound] 		}
[js_test:timeseries_metric_index_compound] 	},
[js_test:timeseries_metric_index_compound] 	"writeConcern" : {
[js_test:timeseries_metric_index_compound] 		"w" : "majority",
[js_test:timeseries_metric_index_compound] 		"wtimeout" : 300321
[js_test:timeseries_metric_index_compound] 	}
[js_test:timeseries_metric_index_compound] } on connection: connection to localhost:21000
[js_test:timeseries_metric_index_compound] _getErrorWithCode@src/mongo/shell/utils.js:24:13
[js_test:timeseries_metric_index_compound] doassert@src/mongo/shell/assert.js:18:14
[js_test:timeseries_metric_index_compound] _assertCommandWorked@src/mongo/shell/assert.js:766:25
[js_test:timeseries_metric_index_compound] assert.commandWorked@src/mongo/shell/assert.js:860:16
[js_test:timeseries_metric_index_compound] testBadIndex@jstests/core/timeseries/timeseries_metric_index_compound.js:179:16
[js_test:timeseries_metric_index_compound] @jstests/core/timeseries/timeseries_metric_index_compound.js:185:17
[js_test:timeseries_metric_index_compound] run@jstests/core/timeseries/libs/timeseries.js:203:15
[js_test:timeseries_metric_index_compound] @jstests/core/timeseries/timeseries_metric_index_compound.js:23:16
[js_test:timeseries_metric_index_compound] @jstests/core/timeseries/timeseries_metric_index_compound.js:206:2

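This test failure is possible because the test ran in replica_sets_terminate_primary_jscore_passthrough, where the primary was stepped down, causing an InterruptedDueToReplStateChange error, and the create command was then retried. The first attempt had already taken effect (the response below carries "ok" : 1 alongside the write concern error), so the retry ran against a namespace that already existed.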

[js_test:timeseries_metric_index_compound] =-=-=-= Retrying write concern error response with retryable code :: create, CommandID: 472, error: {  "writeConcernError" : {  "code" : 11602,  "codeName" : "InterruptedDueToReplStateChange",  "errmsg" : "operation was interrupted",  "errInfo" : {  "writeConcern" : {  "w" : "majority",  "wtimeout" : 300321,  "provenance" : "clientSupplied" } } },  "ok" : 1,  "topologyVersion" : {  "processId" : ObjectId("6442d2d839878a7c9b4114a1"),  "counter" : NumberLong(7) },  "$clusterTime" : {  "clusterTime" : Timestamp(1682100974, 65),  "signature" : {  "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),  "keyId" : NumberLong(0) } },  "operationTime" : Timestamp(1682100974, 65) }, command: {  "create" : "timeseries_metric_index_compound",  "timeseries" : {  "timeField" : "tm",  "metaField" : "mm" },  "lsid" : {  "id" : UUID("4748b47a-6455-4017-a402-59816d1b6ffc") },  "$clusterTime" : {  "clusterTime" : Timestamp(1682100974, 62),  "signature" : {  "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),  "keyId" : NumberLong(0) } },  "writeConcern" : {  "w" : "majority",  "wtimeout" : 300321 } }
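The retry sequence above can be reproduced with a toy in-memory model. This is a minimal sketch, not server code: `ToyCatalog`, `createNaive`, and the catalog entry layout are hypothetical names invented for illustration. The only details taken from the ticket are the `system.buckets.<name>` naming, error code 48 (`NamespaceExists`), and the fact that the first attempt took effect before the client saw a retryable error.

```javascript
// Toy model (hypothetical, not MongoDB server code) of why a retried
// time-series create fails when the idempotency check ignores time-series.
class ToyCatalog {
  constructor() {
    this.entries = new Map(); // namespace name -> catalog entry
  }

  // Naive create: idempotent only for plain collections with equal options.
  createNaive(name, options = {}) {
    const existing = this.entries.get(name);
    if (existing) {
      const sameOptions =
        JSON.stringify(existing.options) === JSON.stringify(options);
      if (existing.kind === "collection" && sameOptions) {
        return { ok: 1 }; // identical request: treated as a no-op
      }
      return { ok: 0, code: 48, codeName: "NamespaceExists" };
    }
    if (options.timeseries) {
      // A time-series collection is modeled as a buckets collection plus a view.
      const buckets = `system.buckets.${name}`;
      this.entries.set(buckets, { kind: "collection", options });
      this.entries.set(name, { kind: "view", viewOn: buckets, options: {} });
    } else {
      this.entries.set(name, { kind: "collection", options });
    }
    return { ok: 1 };
  }
}

const catalog = new ToyCatalog();
const request = { timeseries: { timeField: "tm", metaField: "mm" } };

// First attempt takes effect; assume the client only sees a retryable error.
const firstAttempt = catalog.createNaive("timeseries_metric_index_compound", request);
// The retry finds the view left behind by the first attempt and is rejected.
const retry = catalog.createNaive("timeseries_metric_index_compound", request);
console.log(firstAttempt.ok, retry.codeName);
```

In this model a repeated create of a plain collection with the same options is a no-op, while the repeated time-series create is rejected because the existing entry is a view, which mirrors the asymmetry the ticket reports.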

This is the test failure line. And this is the create command error.



 Comments   
Comment by Githook User [ 24/Aug/23 ]

Author: Gregory Noma (username: gregorynoma, email: gregory.noma@gmail.com)

Message: SERVER-76547 Make time-series collection creation idempotent
Branch: master
https://github.com/mongodb/mongo/commit/6c5c6d426b87588ce46f6e536b7e5238e4372b22

Comment by Gregory Noma [ 21/Jun/23 ]

The error currently returned in this case comes from here. We'll need to update this function to handle the time-series case.
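One way to picture the time-series case mentioned in this comment is sketched below. It is an assumption-laden illustration, not the function linked in the comment and not the committed fix: `checkTimeseriesCreateRetry`, the `Map`-based catalog, and the decision to compare only `timeField` and `metaField` are all hypothetical. The idea is that when the existing namespace is a view on its own `system.buckets.<name>` collection and the request carries `timeseries` options, the check compares the stored options with the requested ones instead of failing immediately.

```javascript
// Hypothetical sketch of a time-series-aware idempotency check.
// Plain collections and ordinary views are out of scope here and simply
// fall through to NamespaceExists.
function checkTimeseriesCreateRetry(catalog, name, requested) {
  const existing = catalog.get(name);
  if (!existing) return { action: "create" };

  const bucketsName = `system.buckets.${name}`;
  const isTimeseriesView =
    existing.kind === "view" && existing.viewOn === bucketsName;

  if (isTimeseriesView && requested.timeseries) {
    const buckets = catalog.get(bucketsName);
    const stored = buckets && buckets.timeseries;
    // Assumption: only these two options are compared in this sketch.
    const same =
      Boolean(stored) &&
      stored.timeField === requested.timeseries.timeField &&
      stored.metaField === requested.timeseries.metaField;
    if (same) return { action: "noop", ok: 1 };
  }
  return { action: "fail", ok: 0, code: 48, codeName: "NamespaceExists" };
}

// Usage with a hand-built catalog (namespace names are illustrative).
const tsCatalog = new Map([
  ["system.buckets.weather",
    { kind: "collection", timeseries: { timeField: "tm", metaField: "mm" } }],
  ["weather", { kind: "view", viewOn: "system.buckets.weather" }],
]);
const sameRequest = checkTimeseriesCreateRetry(
  tsCatalog, "weather", { timeseries: { timeField: "tm", metaField: "mm" } });
const differentRequest = checkTimeseriesCreateRetry(
  tsCatalog, "weather", { timeseries: { timeField: "tm", metaField: "other" } });
console.log(sameRequest.action, differentRequest.action);
```

A retry with identical options becomes a no-op, while a request with different options still reports `NamespaceExists`, so a genuine conflict is not hidden.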

Generated at Thu Feb 08 06:32:58 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.