[SERVER-57136] Incompatible wire version error on secondary shutdown in sharded cluster
Created: 21/May/21  Updated: 29/Oct/23  Resolved: 09/Jun/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.4.6
Fix Version/s: 4.4.7, 5.0.0-rc1, 5.1.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Eric Adum (Inactive)
Assignee: Max Hirschhorn
Resolution: Fixed
Votes: 1
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by NODE-3136 "Incompatible wire version" error wit... Closed
Related
is related to SERVER-50170 Fix server selection failure on mongos Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.0, v4.4
Steps To Reproduce:

1. Start a sharded cluster, e.g. with mlaunch init --replicaset --sharded 1 --nodes 3 --mongos 1
2. Kill one of the secondary nodes, e.g. use rs.status() to identify a secondary, then ps to find its PID and kill -9 to terminate it
3. Run the provided reproduction script for Node and/or Python

This should produce the following error from the server on the next find operation:
Encountered non-retryable error during query :: caused by :: Incompatible wire version

Node repro script

'use strict';

const MongoClient = require('mongodb').MongoClient;

// Secondary reads combined with maxStalenessSeconds are what trigger the error.
const uri = 'mongodb://localhost:27017/?readPreference=secondary&maxStalenessSeconds=90';
const client = new MongoClient(uri);

async function run() {
  try {
    await client.connect();
    const database = client.db('test');
    const collection = database.collection('bar');
    // Alternate inserts and reads; the error is expected on a findOne.
    for (let i = 0; i < 10000; i++) {
      const doc = { a: 42 + i };
      console.log(await collection.insertOne(doc));
      console.log(await collection.findOne(doc));
    }
  } catch (err) {
    console.error(err);
  } finally {
    await client.close();
  }
}
run().catch(console.dir);

Python repro script

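from pymongo import MongoClient
import pprint

# Secondary reads combined with maxStalenessSeconds are what trigger the error.
client = MongoClient('mongodb://localhost:27017/?readPreference=secondary&maxStalenessSeconds=90')
db = client.test
# Use the 'bar' collection, matching the Node script
# (db.collection.bar would address a collection named 'collection.bar').
coll = db.bar
# Alternate inserts and reads; the error is expected on a find_one.
for i in range(10000):
    doc = {'a': 42 + i}
    pprint.pprint(coll.insert_one(doc))
    pprint.pprint(coll.find_one(doc))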

Sprint: Sharding 2021-06-14
Participants:
Case:

 Description   

On a sharded cluster running MongoDB v4.4.1 or later, taking down a secondary causes certain operations that should succeed to fail with a server error if the connection string includes readPreference=secondary&maxStalenessSeconds=90.

The error looks to be a result of SERVER-50170, specifically this new check.
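
To illustrate the suspected cause, here is a minimal Python sketch of a maxStaleness wire version check and of the fix described in the commit message ("Skip maxStaleness wire version check when server is down"). It is not the server's actual code: the names ServerDescription, verify_max_staleness_wire_versions, and skip_unknown are hypothetical. The threshold of 5 is assumed from the driver server selection rules, which require wire version 5 (MongoDB 3.4) for maxStalenessSeconds, and the sketch assumes an unreachable server is represented as type "Unknown" with wire version 0.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: maxStalenessSeconds requires wire version 5 (MongoDB 3.4) or later.
MIN_WIRE_VERSION_FOR_MAX_STALENESS = 5


class IncompatibleWireVersion(Exception):
    """Hypothetical stand-in for the server's 'Incompatible wire version' error."""


@dataclass
class ServerDescription:
    """Hypothetical, simplified view of one monitored server."""
    address: str
    server_type: str           # e.g. "RSPrimary", "RSSecondary", "Unknown"
    max_wire_version: int = 0  # assumed to be 0 when the server is unreachable


def verify_max_staleness_wire_versions(
    servers: List[ServerDescription],
    max_staleness_seconds: Optional[int],
    skip_unknown: bool,
) -> None:
    """Raise if any considered server is too old for maxStalenessSeconds.

    skip_unknown=False models the behavior reported in this ticket;
    skip_unknown=True models the fix, which ignores servers that are down.
    """
    if max_staleness_seconds is None:
        return  # The check only applies when maxStalenessSeconds is set.
    for server in servers:
        if skip_unknown and server.server_type == "Unknown":
            continue  # A down server has no meaningful wire version.
        if server.max_wire_version < MIN_WIRE_VERSION_FOR_MAX_STALENESS:
            raise IncompatibleWireVersion(
                f"Incompatible wire version: {server.address} reports "
                f"{server.max_wire_version}"
            )


if __name__ == "__main__":
    # Three-node shard after one secondary has been killed (ports are made up).
    topology = [
        ServerDescription("localhost:27018", "RSPrimary", 9),
        ServerDescription("localhost:27019", "RSSecondary", 9),
        ServerDescription("localhost:27020", "Unknown"),
    ]
    try:
        verify_max_staleness_wire_versions(topology, 90, skip_unknown=False)
    except IncompatibleWireVersion as exc:
        print("before fix:", exc)
    verify_max_staleness_wire_versions(topology, 90, skip_unknown=True)
    print("after fix: check passes")
```

In this model the killed secondary is the only server that fails the check, which matches the observation that the error appears only after a node goes down.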



 Comments   
Comment by Githook User [ 09/Jun/21 ]

Author: Max Hirschhorn (max.hirschhorn@mongodb.com, username: visemet)

Message: SERVER-57136 Skip maxStaleness wire version check when server is down.

(cherry picked from commit d7bd51c05a7ac49afd7ed3b394fde123383af6de)
Branch: v5.0
https://github.com/mongodb/mongo/commit/f9947fe6a9f8066fff5dc4bc646683e15ea0121a

Comment by Githook User [ 08/Jun/21 ]

Author: Max Hirschhorn (max.hirschhorn@mongodb.com, username: visemet)

Message: SERVER-57136 Skip maxStaleness wire version check when server is down.

(cherry picked from commit d7bd51c05a7ac49afd7ed3b394fde123383af6de)
Branch: v4.4
https://github.com/mongodb/mongo/commit/e92664940640287874bf6d08940df26fee7bc2b8

Comment by Githook User [ 08/Jun/21 ]

Author: Max Hirschhorn (max.hirschhorn@mongodb.com, username: visemet)

Message: SERVER-57136 Skip maxStaleness wire version check when server is down.
Branch: master
https://github.com/mongodb/mongo/commit/d7bd51c05a7ac49afd7ed3b394fde123383af6de
