[SERVER-2842] Various errors seen when one node dies in a shard. Created: 26/Mar/11  Updated: 08/Mar/13  Resolved: 18/Sep/12

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 1.8.0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Bernie Hackett Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Linux x86_64


Issue Links:
Depends
Related
is related to PYTHON-212 pymongo does not recover after stale... Closed
Operating System: ALL
Participants:

 Description   

Repro:

1. Set up two shards, each with one primary, one secondary, and one arbiter.
2. Enable sharding on a database.
3. Shard a collection in the database.
4. Insert enough documents to distribute them between the shards.
5. Run a test that constantly runs various queries against the collection.
6. While the test is running, kill a random mongod, then restart it.
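
Step 5 could be approximated with a loop like the one below. This is only a sketch, not the test actually used for this report: the function name run_query_loop, its parameters, and the query shape {"i": i} are all hypothetical. It assumes a collection object exposing a pymongo-style find_one method, and it tallies every error message it sees so the output can be compared against the errors listed in this issue.

```python
import collections
import time


def run_query_loop(collection, iterations, pause=0.0, errors=(Exception,)):
    """Query `collection` repeatedly and count the errors raised, keyed by message.

    `collection` is assumed to expose a pymongo-style find_one(filter) method.
    `errors` is the tuple of exception types to record; anything else propagates.
    """
    seen = collections.Counter()
    for i in range(iterations):
        try:
            # Hypothetical query shape; the real test ran "various queries".
            collection.find_one({"i": i})
        except errors as exc:
            # Record the message instead of aborting, so the loop keeps
            # running while a mongod is killed and restarted.
            seen[str(exc)] += 1
        if pause:
            time.sleep(pause)
    return seen
```

With pymongo, the collection would be obtained through a connection to mongos, and the returned counter shows which error messages reached the client while a mongod was down.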

After killing the mongod, you may or may not see errors like these:

database error: DBClientBase::findOne: transport error: ...
database error: socket exception
database error: dbclient error communicating with server: ...

These errors are passed back to the client application by mongos. Is this the expected behavior?

This was found while trying to reproduce PYTHON-212.



 Comments   
Comment by Bernie Hackett [ 26/Mar/11 ]

One more: database error: not master and slaveok=false

Comment by Bernie Hackett [ 26/Mar/11 ]

Another error (as reported by pymongo): db assertion failure, assertion: 'setShardVersion failed! { "errmsg" : "not master", "ok" : 0 }', assertionCode: 10429
