[SERVER-69517] moveChunk command fails with "Transaction numbers are only allowed on a replica set member or mongos" on Mongo 4.4 mongos without replica sets
Created: 08/Sep/22  Updated: 12/Sep/22  Resolved: 12/Sep/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.4.16
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Ian Springer
Assignee: Chris Kelly
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to DOCS-15625 Clarify that shards should be deploye... Backlog
is related to SERVER-30765 Don't allow txnNumbers in commands fo... Closed
is related to SERVER-41531 Support transactions on standalone in... Closed
Operating System: ALL
Steps To Reproduce:

mongos> db.adminCommand({ setFeatureCompatibilityVersion: '4.4' })
{
    "ok" : 1,
    "operationTime" : Timestamp(1662643030, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1662643030, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
mongos> db.adminCommand({"moveChunk": "localtest.manual_User", "find": {"_id": "$minKey"}, "to": "shard0001"})
{
    "ok" : 0,
    "errmsg" : "Transaction numbers are only allowed on a replica set member or mongos",
    "code" : 20,
    "codeName" : "IllegalOperation",
    "operationTime" : Timestamp(1662643375, 6),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1662643375, 6),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
mongos> db.adminCommand({ setFeatureCompatibilityVersion: '4.2' })
{
    "ok" : 1,
    "operationTime" : Timestamp(1662643386, 1049),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1662643386, 1049),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
mongos> db.adminCommand({"moveChunk": "localtest.manual_User", "find": {"_id": "$minKey"}, "to": "shard0001"})
{
    "millis" : 143,
    "ok" : 1,
    "operationTime" : Timestamp(1662643390, 6),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1662643390, 6),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
} 

Participants:

 Description   

We are in the process of upgrading our sharded MongoDB cluster from 4.2 to 4.4. In production we use replica sets, but in dev/test we do not. In dev/test, after upgrading featureCompatibilityVersion to 4.4, a moveChunk command executed from mongos fails with "Transaction numbers are only allowed on a replica set member or mongos". After downgrading to FCV 4.2, the moveChunk command succeeds. I encounter the same error from both the Java driver and the mongo shell. In both cases retryable reads and retryable writes are disabled; we have had them disabled since we upgraded to 4.0, because with them enabled we encountered similar errors about transactions not being supported without replica sets. The error message implies transaction numbers are allowed on mongos, which is where I am executing the command from. It also is unclear why there is a transaction number in the command in the first place. When I set breakpoints in the Java driver and inspect the moveChunk command at various points prior to it being sent to the server, there is no transaction number.
 
Thanks!



 Comments   
Comment by Chris Kelly [ 12/Sep/22 ]

Thank you for linking the documentation that was confusing on this. I will look into how we can improve the clarity there.

I'll go ahead and close this ticket for now, but if your issues persist, feel free to @mention me with additional information, or open a separate ticket.

Have a great day!
Christopher

Comment by Ian Springer [ 12/Sep/22 ]

Thanks, Chris.

We'll try the 1-node replica sets.

Note, I think the documentation could use some improvement. I didn't come across anything that said sharded clusters must use replica sets for features X, Y, and Z to function. For example, https://www.mongodb.com/docs/v4.4/sharding/ and https://www.mongodb.com/docs/v4.4/core/sharded-cluster-components/ both suggest replica sets are optional for sharded clusters and only needed for HA.

Comment by Chris Kelly [ 12/Sep/22 ]

Hi Ian,

Thanks for your report. To answer your question:

"It also is unclear why there is a transaction number in the command in the first place. When I set breakpoints in the Java driver and inspect the moveChunk command at various points prior to it being sent to the server, there is no transaction number."

This is because features introduced in MongoDB 3.6 and later assume that shards run as replica sets. Per SERVER-41531, transactions on standalone mongods are unsupported, and per SERVER-30765, we disallow transaction numbers in commands on standalone mongods. Support for standalone mongod shards in sharded clusters is officially removed in MongoDB 5.1 and later.

Specifically, the transaction number comes from changes in 4.4 that make chunk migrations and orphaned document cleanup more resilient during failover; it is not something your driver sends. You can read more about this in: Chunk Migration Failover Resiliency Improvements in 4.4

In your case, we recommend initiating your shards as 1-node replica sets to avoid this going forward (even in your dev/test environment).
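
A minimal sketch of what a 1-node replica set configuration could look like in the mongo shell. The replica set name `shard0001rs` and the address `localhost:27018` are placeholders, and `singleNodeReplSetConfig` and `isReplicaSetShardHost` are hypothetical helpers written for illustration, not MongoDB APIs. The sketch assumes the mongod was started with `--shardsvr --replSet shard0001rs`.

```javascript
// Hypothetical helper (not part of MongoDB): builds the configuration
// document for a replica set that has exactly one member.
function singleNodeReplSetConfig(name, host) {
  return { _id: name, members: [{ _id: 0, host: host }] };
}

// Hypothetical helper: entries in config.shards use "rsName/host:port,..."
// for replica set shards and a plain "host:port" for standalone shards,
// so a "/" in the host string indicates a replica set shard.
function isReplicaSetShardHost(host) {
  return host.indexOf("/") > 0;
}

// Placeholder values; substitute your own replica set name and address.
var cfg = singleNodeReplSetConfig("shard0001rs", "localhost:27018");

// In a mongo shell connected directly to that mongod (not to mongos),
// the configuration would be applied with:
//   rs.initiate(cfg)
```

This only covers initiating the set on a single mongod. A standalone shard that is already registered with the cluster would need further steps to be recognized as a replica set shard, which are not shown here.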

Let me know if this resolves your issue!

Regards,
Christopher
