[SERVER-45325] mergeByPBRT may cause incompatible error Created: 31/Dec/19  Updated: 15/Nov/21  Resolved: 26/Feb/20

Status: Closed
Project: Core Server
Component/s: Aggregation Framework
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Zhang Youdong
Assignee: Bernard Gorman
Resolution: Won't Fix
Votes: 0
Labels: qexec-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Sprint: Query 2020-01-27, Query 2020-02-10, Query 2020-03-09
Participants:

 Description   
  1. MongoDB 4.0.7 adds a mergeByPBRT option to the commands exchanged between mongos and mongod.
  2. If mongos has been upgraded to the new version, a normal aggregation command sent to a mongod below 4.0.7 fails with an error.

mongos> db.test.aggregate([{"$match":{"c32":17}}, {"$group":{"_id":null, "total":{"$sum":"$float"}}}])
2019-12-30T19:59:16.398+0800 E QUERY    [js] Error: command failed: {
    "ok" : 0,
    "errmsg" : "unrecognized field 'mergeByPBRT'",
    "code" : 9,
    "codeName" : "FailedToParse",
    "operationTime" : Timestamp(1577707153, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1577707155, 3),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}

 

Workaround: only add the mergeByPBRT option to change stream aggregation commands.

diff --git a/src/mongo/s/commands/cluster_aggregate.cpp b/src/mongo/s/commands/cluster_aggregate.cpp
index d58ce76220..a2c308a353 100644
--- a/src/mongo/s/commands/cluster_aggregate.cpp
+++ b/src/mongo/s/commands/cluster_aggregate.cpp
@@ -215,7 +215,9 @@ BSONObj createCommandForTargetedShards(
             targetedCmd[AggregationRequest::kNeedsMergeName] = Value(true);
             // If this is a change stream, set the 'mergeByPBRT' flag on the command. This notifies
             // the shards that the mongoS is capable of merging streams based on resume token.
-            targetedCmd[AggregationRequest::kMergeByPBRTName] = Value(litePipe.hasChangeStream());
+            if (litePipe.hasChangeStream()) {
+                targetedCmd[AggregationRequest::kMergeByPBRTName] = Value(litePipe.hasChangeStream());
+            }
             targetedCmd[AggregationRequest::kCursorName] =
                 Value(DOC(AggregationRequest::kBatchSizeName << 0));
         }
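
The effect of the patch can be modeled outside the server. The following is a minimal sketch, not the server's implementation: only the field name mergeByPBRT and the error text are taken from this ticket, while buildTargetedCommand, parseOnOldShard, the needsMerge and cursor field names, and the known-field list are hypothetical stand-ins. It assumes a shard older than 4.0.7 rejects any top-level field it does not recognize.

```javascript
// Hypothetical model of the targeted command a mongoS sends to a shard.
// Only 'mergeByPBRT' and the error text come from the ticket; the rest is illustrative.

// Does any stage in the pipeline open a change stream?
function hasChangeStream(pipeline) {
  return pipeline.some((stage) => Object.keys(stage)[0] === "$changeStream");
}

// Build the per-shard command. With onlyForChangeStreams = false the flag is
// always present (the line removed by the diff); with true it is added only
// for $changeStream pipelines (the lines added by the diff).
function buildTargetedCommand(pipeline, { onlyForChangeStreams }) {
  const cmd = { aggregate: "test", pipeline, needsMerge: true, cursor: { batchSize: 0 } };
  const isChangeStream = hasChangeStream(pipeline);
  if (!onlyForChangeStreams || isChangeStream) {
    cmd.mergeByPBRT = isChangeStream;
  }
  return cmd;
}

// Assumed field list for a shard older than 4.0.7 (hypothetical, not from the server source).
const KNOWN_FIELDS_PRE_407 = new Set(["aggregate", "pipeline", "needsMerge", "cursor"]);

// Stand-in for a strict parser: reject the first unknown top-level field.
function parseOnOldShard(cmd) {
  for (const field of Object.keys(cmd)) {
    if (!KNOWN_FIELDS_PRE_407.has(field)) {
      return { ok: 0, errmsg: `unrecognized field '${field}'`, code: 9, codeName: "FailedToParse" };
    }
  }
  return { ok: 1 };
}

const pipeline = [{ $match: { c32: 17 } }, { $group: { _id: null, total: { $sum: "$float" } } }];
const shipped = parseOnOldShard(buildTargetedCommand(pipeline, { onlyForChangeStreams: false }));
const patched = parseOnOldShard(buildTargetedCommand(pipeline, { onlyForChangeStreams: true }));
console.log(shipped);
console.log(patched);
```

In this model the unpatched command is rejected by the old shard and the patched one is accepted, while a $changeStream pipeline still carries the flag in both cases.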



 Comments   
Comment by Bernard Gorman [ 26/Feb/20 ]

Hi zyd_com!

Many thanks for bringing this issue to our attention.

The first point to note is that the only supported path for upgrading a sharded cluster, whether for a minor-version upgrade (e.g. from 4.0.6 to 4.0.7) or a major-version upgrade (e.g. from 4.0 to 4.2) is to upgrade the config servers first, then the shards, and finally the mongoS. When we are writing the code necessary to transition smoothly between versions, we base our decisions on the knowledge that a mongoS should never need to communicate with shards which have not yet been upgraded. In fact, if you attempt to upgrade the mongoS first when upgrading between major releases, it will actually refuse to start until the config servers and all shards are upgraded first.

We aren't quite that strict when upgrading between minor versions, but it is still the case that your cluster should never be running a mongoS with a later version than your shards. In this particular case, we backported a major change stream feature from 4.2 to 4.0.7, which fundamentally changes the way that the mongoS merges events across the cluster. Because the mongoS should always be upgraded last, it sends the mergeByPBRT flag to the shards - which should all be on 4.0.7 at this point - to let them know that it is safe to switch over to the new format. You are correct that applying your proposed patch would stop this exception from being thrown for any non-$changeStream aggregation, but all this would do is hide the fact that the cluster is running in an invalid configuration from the user.
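
The rule that a mongoS should never be newer than the shards it talks to can be expressed as a simple check. This is a hedged sketch rather than anything the server ships: compareVersions and shardsBehindMongos are hypothetical helpers, and the version strings would have to be collected separately, for example by running db.version() against each node.

```javascript
// Parse a version string such as "4.0.7" into [4, 0, 7].
function parseVersion(v) {
  return v.split(".").map((part) => Number.parseInt(part, 10));
}

// Negative if a < b, zero if equal, positive if a > b.
function compareVersions(a, b) {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Return the names of shards whose version is older than the mongoS.
// A non-empty result means the cluster is in the unsupported state described above.
function shardsBehindMongos(mongosVersion, shardVersions) {
  return Object.entries(shardVersions)
    .filter(([, version]) => compareVersions(version, mongosVersion) < 0)
    .map(([name]) => name);
}

// Hypothetical cluster: the mongoS is on 4.0.7 but one shard is still on 4.0.6.
const behind = shardsBehindMongos("4.0.7", { shard0: "4.0.7", shard1: "4.0.6" });
console.log(behind);
```

An empty result means every shard is at least as new as the mongoS, which is the state the upgrade order is meant to guarantee.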

As such, we are going to leave this as-is, and I will close this ticket as Won't Fix.

Thanks again for taking the time to file this report!

Best regards,
Bernard

Comment by Danny Hatcher (Inactive) [ 02/Jan/20 ]

I'll forward this along to the proper team.

Comment by Zhang Youdong [ 02/Jan/20 ]

There is no problem if mongos and mongod are running the same version, but I think compatibility should be considered during upgrades, especially minor-version upgrades. In general, there shouldn't be compatibility issues between 4.0.x and 4.0.y.

Comment by Dmitry Agranat [ 31/Dec/19 ]

zyd_com, what happens when both your mongoD and mongoS are running the same MongoDB version? Does that solve the issue?
