[SERVER-36786] mongod and mongos result format for collMod differs Created: 21/Aug/18  Updated: 27/Oct/23  Resolved: 14/Sep/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.0.1
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Derick Rethans
Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Works as Designed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to DOCS-12094 Update documentation for commands wit... Closed
is related to PHPLIB-384 modifyCollection return format with s... Closed
Assigned Teams:
Sharding
Operating System: ALL
Participants:

 Description   

The documentation at https://docs.mongodb.com/manual/reference/command/collMod/#change-expiration-value-for-indexes shows the result document as:

{ "expireAfterSeconds_old" : 1800, "expireAfterSeconds_new" : 3600, "ok" : 1 }

We expect this format for the https://docs.mongodb.com/php-library/master/reference/method/MongoDBDatabase-modifyCollection/ method. However, we are adding additional topologies for our tests, and noticed that with a sharded cluster, the output format changes to:

{
	"raw" : {
		"localhost:4100" : {
			"ok" : 0,
			"errmsg" : "ns does not exist",
			"code" : 26,
			"codeName" : "NamespaceNotFound"
		},
		"localhost:4200" : {
			"expireAfterSeconds_old" : NumberLong(500),
			"expireAfterSeconds_new" : 700,
			"ok" : 1
		}
	},
	"ok" : 1,
	"operationTime" : Timestamp(1534859095, 4),
	"$clusterTime" : {
		"clusterTime" : Timestamp(1534859095, 4),
		"signature" : {
			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
			"keyId" : NumberLong(0)
		}
	}
}

This makes our test fail.

I would argue that the outputs should always be the same, or documented in some form.
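
As an illustration of how a test could tolerate both shapes, here is a minimal sketch in Python. It assumes the reply has already been decoded into a plain dict; the helper names `per_shard_replies` and `new_ttl_values` and the placeholder key "standalone" are hypothetical and not part of any driver API.

```python
from collections.abc import Mapping

def per_shard_replies(reply, single_key="standalone"):
    """Return {host: reply_document} for either reply shape.

    A mongos reply carries a "raw" sub-document keyed by shard host;
    a mongod reply is wrapped under a placeholder key (an assumption
    made here purely for illustration).
    """
    raw = reply.get("raw")
    if isinstance(raw, Mapping):
        return dict(raw)
    return {single_key: reply}

def new_ttl_values(reply):
    """Collect expireAfterSeconds_new from every entry that reports ok: 1."""
    return {
        host: doc["expireAfterSeconds_new"]
        for host, doc in per_shard_replies(reply).items()
        if doc.get("ok") == 1 and "expireAfterSeconds_new" in doc
    }

# The two replies quoted in this ticket, with shell types reduced to plain ints.
mongod_reply = {"expireAfterSeconds_old": 1800, "expireAfterSeconds_new": 3600, "ok": 1}
mongos_reply = {
    "raw": {
        "localhost:4100": {"ok": 0, "errmsg": "ns does not exist",
                           "code": 26, "codeName": "NamespaceNotFound"},
        "localhost:4200": {"expireAfterSeconds_old": 500,
                           "expireAfterSeconds_new": 700, "ok": 1},
    },
    "ok": 1,
}

print(new_ttl_values(mongod_reply))
print(new_ttl_values(mongos_reply))
```

A test written against `new_ttl_values` can then assert on the set of reported values rather than on the top-level document layout.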



 Comments   
Comment by Esha Maharishi (Inactive) [ 28/Sep/18 ]

ravind.kumar, yes, and it is also the format of any command that uses the 'appendRawResponses()' helper in the C++ code.

A quick grep on any branch shows the commands that use it on that version.

On master, it looks like it's these:

$ git grep appendRawResponses src/mongo
src/mongo/s/cluster_commands_helpers.cpp:bool appendRawResponses(OperationContext* opCtx,
src/mongo/s/cluster_commands_helpers.h:bool appendRawResponses(OperationContext* opCtx,
src/mongo/s/commands/cluster_abort_transaction_cmd.cpp:        return appendRawResponses(opCtx, &errMsg, &result, response);
src/mongo/s/commands/cluster_collection_mod_cmd.cpp:        return appendRawResponses(
src/mongo/s/commands/cluster_create_indexes_cmd.cpp:        return appendRawResponses(opCtx,
src/mongo/s/commands/cluster_db_stats_cmd.cpp:        if (!appendRawResponses(opCtx, &errmsg, &output, shardResponses)) {
src/mongo/s/commands/cluster_drop_indexes_cmd.cpp:        return appendRawResponses(opCtx,
src/mongo/s/commands/cluster_restart_catalog_command.cpp:        return appendRawResponses(opCtx, &errmsg, &result, shardResponses);

So the commands are:

  • abortTransaction
  • collMod
  • createIndexes
  • dbStats
  • dropIndexes
  • restartCatalog
Comment by Ravind Kumar (Inactive) [ 28/Sep/18 ]

Small clarification esha.maharishi - is this generally the output format for collMod in a sharded cluster? e.g.:

{
	"raw" : {
		"localhost:4100" : {
			"ok" : 0,
			"errmsg" : "ns does not exist",
			"code" : 26,
			"codeName" : "NamespaceNotFound"
		},
		"localhost:4200" : {
			"expireAfterSeconds_old" : NumberLong(500),
			"expireAfterSeconds_new" : 700,
			"ok" : 1
		}
	},
       ...
}

Where the raw document has one document per shard, containing the status and response of the collMod operation on that shard?
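
Read that way, the raw document can be split into shards that applied the collMod and shards that did not. The following Python sketch assumes the reply has been decoded into plain dicts; `split_by_status` is a hypothetical helper name, and the sample data is the raw document quoted above.

```python
def split_by_status(raw):
    """Partition a "raw" document into (succeeded, failed).

    succeeded maps host -> full per-shard reply;
    failed maps host -> codeName (or "unknown" if the shard sent none).
    """
    succeeded, failed = {}, {}
    for host, doc in raw.items():
        if doc.get("ok") == 1:
            succeeded[host] = doc
        else:
            failed[host] = doc.get("codeName", "unknown")
    return succeeded, failed

raw = {
    "localhost:4100": {"ok": 0, "errmsg": "ns does not exist",
                       "code": 26, "codeName": "NamespaceNotFound"},
    "localhost:4200": {"expireAfterSeconds_old": 500,
                       "expireAfterSeconds_new": 700, "ok": 1},
}

succeeded, failed = split_by_status(raw)
print(sorted(succeeded))
print(failed)
```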

Comment by Ian Whalen (Inactive) [ 24/Aug/18 ]

Sending to Sharding under the logic that catalog operations on sharded clusters are typically owned by the sharding team.

CC milkie too.

Comment by Esha Maharishi (Inactive) [ 23/Aug/18 ]

Hi derick,

There is a bit of a fundamental issue in reporting the output of collMod in a cluster (as opposed to a single replica set), which is that collMod is not guaranteed to succeed or fail atomically across a cluster.

Consider the scenario:

1) User sends collMod with "expireAfterSeconds: 5". It succeeds on shardA and shardB, but fails on shardC due to a network error between mongos and shardC. The collMod command reports failure.

2) User ignores the first collMod's failure, and runs another collMod with "expireAfterSeconds: 10". It succeeds on shardA and shardC, but fails on shardB due to a network error between mongos and shardB.

Now the collection in the cluster looks like:

shardA: expireAfterSeconds=10

shardB: expireAfterSeconds=5

shardC: expireAfterSeconds=10

collMod in a cluster reports each shard's individual response so that the cluster's behavior after the collMod makes more sense. Further, collMod in a cluster cannot report the same thing as collMod in a single replica set because, as you can see above, no single "expireAfterSeconds" value is guaranteed to reflect the state of every shard in the cluster.
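
The scenario above can be replayed with a toy model in Python. This is only a sketch under stated assumptions: shards are modelled as dict entries, a network error is modelled by skipping the shard, the starting value of 3600 and the "network error" message are made up for illustration, and `run_coll_mod` is a hypothetical function, not mongos code.

```python
def run_coll_mod(state, new_value, unreachable):
    """Apply a TTL change shard by shard, with no atomicity across shards.

    state maps shard name -> current expireAfterSeconds and is updated in place.
    Returns a per-shard "raw"-style document.
    """
    raw = {}
    for shard in state:
        if shard in unreachable:
            # Simplified stand-in for a failed mongos-to-shard request.
            raw[shard] = {"ok": 0, "errmsg": "network error"}
            continue
        raw[shard] = {
            "expireAfterSeconds_old": state[shard],
            "expireAfterSeconds_new": new_value,
            "ok": 1,
        }
        state[shard] = new_value
    return raw

# Hypothetical starting point: every shard agrees on 3600.
state = {"shardA": 3600, "shardB": 3600, "shardC": 3600}

first = run_coll_mod(state, 5, unreachable={"shardC"})    # step 1
second = run_coll_mod(state, 10, unreachable={"shardB"})  # step 2

print(state)
print(sorted(set(state.values())))
```

After both calls the model holds two distinct values across the three shards, which is why a single top-level "expireAfterSeconds_new" would be misleading.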

To avoid this anomalous behavior, it would be nice to make collMod a distributed transaction, which we hope to do at some point.

Hope this helps explain the behavior,
Esha
