[COMPASS-7143] Investigate changes in SERVER-76547: Create command on a time-series collection is not idempotent Created: 24/Aug/23  Updated: 24/Aug/23

Status: Needs Triage
Project: Compass
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Investigation Priority: Minor - P4
Reporter: Backlog - Core Eng Program Management Team Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-76547 Create command on a time-series colle... Closed
Epic Link: STAR-4197

 Description   
Original Downstream Change Summary

We previously did SERVER-60064, which made collection creation idempotent on mongod. That is, a create command reports success if a collection/view with an identical namespace and options already exists. (It was already idempotent on mongos even before that ticket.)

However, that ticket missed the case of creating a time-series collection. This ticket (SERVER-76547) fixes that bug so that creating a time-series collection is now also idempotent, just like a non-time-series collection.
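As a rough illustration of the idempotency rule described above, here is a minimal sketch of a toy in-memory catalog. It is not the server implementation: ToyCatalog, its create method, and the "test.ts" namespace are hypothetical names invented for this example. Only the NamespaceExists code (48) and the timeseries options are taken from the log in this ticket.

```python
from dataclasses import dataclass, field

NAMESPACE_EXISTS = 48  # error code shown in the test log in this ticket


@dataclass
class ToyCatalog:
    """Hypothetical in-memory stand-in for a collection catalog."""

    collections: dict = field(default_factory=dict)

    def create(self, ns, options=None):
        options = dict(options or {})
        existing = self.collections.get(ns)
        if existing is None:
            # First creation: record the namespace with its options.
            self.collections[ns] = options
            return {"ok": 1}
        if existing == options:
            # Same namespace and identical options: report success again.
            return {"ok": 1}
        # Same namespace but different options: a genuine conflict.
        return {"ok": 0, "code": NAMESPACE_EXISTS, "codeName": "NamespaceExists"}


catalog = ToyCatalog()
ts_options = {"timeseries": {"timeField": "tm", "metaField": "mm"}}

first = catalog.create("test.ts", ts_options)
retry = catalog.create("test.ts", ts_options)
conflict = catalog.create("test.ts", {"timeseries": {"timeField": "other"}})
```

In this sketch the repeated create with identical options succeeds, while a create with different options on the same namespace still fails with NamespaceExists.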

Description of Linked Ticket

This came out of investigating SERVER-73967 while trying to remove NamespaceExists retry handling in the test infrastructure. The create command was made mostly idempotent in SERVER-60064, but that ticket missed time-series handling.

--------------------------------------------------------------------------

Relevant test failure details. Note that the original command is a create for a time-series collection, and the error occurs because a time-series view already exists on a buckets collection; the error handling doesn't seem to realize the namespace is a time-series collection rather than a plain view.

[js_test:timeseries_metric_index_compound] 2023-04-21T18:16:15.043Z assert: command failed: {
[js_test:timeseries_metric_index_compound] 	"ok" : 0,
[js_test:timeseries_metric_index_compound] 	"errmsg" : "namespace test.timeseries_metric_index_compound already exists, but is a view on test.system.buckets.timeseries_metric_index_compound rather than test",
[js_test:timeseries_metric_index_compound] 	"code" : 48,
[js_test:timeseries_metric_index_compound] 	"codeName" : "NamespaceExists",
[js_test:timeseries_metric_index_compound] 	"$clusterTime" : {
[js_test:timeseries_metric_index_compound] 		"clusterTime" : Timestamp(1682100974, 67),
[js_test:timeseries_metric_index_compound] 		"signature" : {
[js_test:timeseries_metric_index_compound] 			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
[js_test:timeseries_metric_index_compound] 			"keyId" : NumberLong(0)
[js_test:timeseries_metric_index_compound] 		}
[js_test:timeseries_metric_index_compound] 	},
[js_test:timeseries_metric_index_compound] 	"operationTime" : Timestamp(1682100974, 67)
[js_test:timeseries_metric_index_compound] } with original command request: {
[js_test:timeseries_metric_index_compound] 	"create" : "timeseries_metric_index_compound",
[js_test:timeseries_metric_index_compound] 	"timeseries" : {
[js_test:timeseries_metric_index_compound] 		"timeField" : "tm",
[js_test:timeseries_metric_index_compound] 		"metaField" : "mm"
[js_test:timeseries_metric_index_compound] 	},
[js_test:timeseries_metric_index_compound] 	"lsid" : {
[js_test:timeseries_metric_index_compound] 		"id" : UUID("4748b47a-6455-4017-a402-59816d1b6ffc")
[js_test:timeseries_metric_index_compound] 	},
[js_test:timeseries_metric_index_compound] 	"$clusterTime" : {
[js_test:timeseries_metric_index_compound] 		"clusterTime" : Timestamp(1682100974, 62),
[js_test:timeseries_metric_index_compound] 		"signature" : {
[js_test:timeseries_metric_index_compound] 			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
[js_test:timeseries_metric_index_compound] 			"keyId" : NumberLong(0)
[js_test:timeseries_metric_index_compound] 		}
[js_test:timeseries_metric_index_compound] 	},
[js_test:timeseries_metric_index_compound] 	"writeConcern" : {
[js_test:timeseries_metric_index_compound] 		"w" : "majority",
[js_test:timeseries_metric_index_compound] 		"wtimeout" : 300321
[js_test:timeseries_metric_index_compound] 	}
[js_test:timeseries_metric_index_compound] } on connection: connection to localhost:21000
[js_test:timeseries_metric_index_compound] _getErrorWithCode@src/mongo/shell/utils.js:24:13
[js_test:timeseries_metric_index_compound] doassert@src/mongo/shell/assert.js:18:14
[js_test:timeseries_metric_index_compound] _assertCommandWorked@src/mongo/shell/assert.js:766:25
[js_test:timeseries_metric_index_compound] assert.commandWorked@src/mongo/shell/assert.js:860:16
[js_test:timeseries_metric_index_compound] testBadIndex@jstests/core/timeseries/timeseries_metric_index_compound.js:179:16
[js_test:timeseries_metric_index_compound] @jstests/core/timeseries/timeseries_metric_index_compound.js:185:17
[js_test:timeseries_metric_index_compound] run@jstests/core/timeseries/libs/timeseries.js:203:15
[js_test:timeseries_metric_index_compound] @jstests/core/timeseries/timeseries_metric_index_compound.js:23:16
[js_test:timeseries_metric_index_compound] @jstests/core/timeseries/timeseries_metric_index_compound.js:206:2

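This test failure is possible because the test ran in replica_sets_terminate_primary_jscore_passthrough, where the primary was stepped down, causing an InterruptedDueToReplStateChange error, and the create command was retried. The first attempt returned "ok" : 1 together with a write concern error (see the retry log line below), so the collection apparently already existed by the time the retry ran.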

[js_test:timeseries_metric_index_compound] =-=-=-= Retrying write concern error response with retryable code :: create, CommandID: 472, error: {  "writeConcernError" : {  "code" : 11602,  "codeName" : "InterruptedDueToReplStateChange",  "errmsg" : "operation was interrupted",  "errInfo" : {  "writeConcern" : {  "w" : "majority",  "wtimeout" : 300321,  "provenance" : "clientSupplied" } } },  "ok" : 1,  "topologyVersion" : {  "processId" : ObjectId("6442d2d839878a7c9b4114a1"),  "counter" : NumberLong(7) },  "$clusterTime" : {  "clusterTime" : Timestamp(1682100974, 65),  "signature" : {  "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),  "keyId" : NumberLong(0) } },  "operationTime" : Timestamp(1682100974, 65) }, command: {  "create" : "timeseries_metric_index_compound",  "timeseries" : {  "timeField" : "tm",  "metaField" : "mm" },  "lsid" : {  "id" : UUID("4748b47a-6455-4017-a402-59816d1b6ffc") },  "$clusterTime" : {  "clusterTime" : Timestamp(1682100974, 62),  "signature" : {  "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),  "keyId" : NumberLong(0) } },  "writeConcern" : {  "w" : "majority",  "wtimeout" : 300321 } }

This is the test failure line. And this is the create command error.
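The retry sequence described above can be sketched as a small simulation. This is an assumption-laden toy, not the test infrastructure's actual retry logic: ToyServer, create_with_retry, and the idempotent_create flag are hypothetical. The error codes 11602 and 48 are taken from the log lines in this ticket.

```python
INTERRUPTED = 11602      # InterruptedDueToReplStateChange, from the retry log line
NAMESPACE_EXISTS = 48    # NamespaceExists, from the failure log


class ToyServer:
    """Hypothetical server whose first create is interrupted by a stepdown."""

    def __init__(self, idempotent_create):
        self.idempotent_create = idempotent_create
        self.collections = {}
        self.interrupt_next = True

    def create(self, name, options):
        if name in self.collections:
            if self.idempotent_create and self.collections[name] == options:
                return {"ok": 1}
            return {"ok": 0, "code": NAMESPACE_EXISTS, "codeName": "NamespaceExists"}
        # The collection is created even though the reply may carry an error.
        self.collections[name] = options
        if self.interrupt_next:
            self.interrupt_next = False
            return {"ok": 1, "writeConcernError": {"code": INTERRUPTED}}
        return {"ok": 1}


def create_with_retry(server, name, options):
    """Retry once when the reply carries a retryable write concern error."""
    reply = server.create(name, options)
    wce = reply.get("writeConcernError")
    if wce and wce["code"] == INTERRUPTED:
        reply = server.create(name, options)
    return reply


options = {"timeseries": {"timeField": "tm", "metaField": "mm"}}
before_fix = create_with_retry(ToyServer(idempotent_create=False), "ts", options)
after_fix = create_with_retry(ToyServer(idempotent_create=True), "ts", options)
```

Under these assumptions, the non-idempotent server fails the retry with NamespaceExists, which mirrors the failure in the log, while the idempotent server reports success on the retry.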



 Comments   
Comment by PM Bot [ 24/Aug/23 ]

Fix Version updated for upstream SERVER-76547:
7.1.0-rc0

Generated at Wed Feb 07 22:45:31 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.