[SERVER-7220] need better error messages when a shard is down Created: 01/Oct/12  Updated: 06/Dec/22  Resolved: 21/Mar/18

Status: Closed
Project: Core Server
Component/s: Usability
Affects Version/s: 3.0.0, 3.2.0, 3.4.0
Fix Version/s: None

Type: Improvement
Priority: Minor - P4
Reporter: Dwight Merriman
Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Done
Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Sharding
Participants:

 Description   

Here we should show the hostname, not the IP address. Sometimes we do show the hostname. The shard was configured with a logical name, not by IP. Also, it would be nice if we told you which shard # this is.

mongos> db.foo.ensureIndex({yy:1})
socket exception [SEND_ERROR] for 10.4.1.53:27001
mongos>

Also, in the output below it would be nice to say something like:

"dm_hp:27001" : "Unreachable"

rather than { }. If multiple shards are down (imagine I have 300), that would be helpful.

mongos> db.runCommand("dbstats")
{
        "raw" : {
                "s0/dm_hp:27000,dm_hp:27018" : {
                        "db" : "test",
                        "collections" : 3,
                        "objects" : 11,
                        "avgObjSize" : 13073.454545454546,
                        "dataSize" : 143808,
                        "storageSize" : 2310144,
                        "numExtents" : 4,
                        "indexes" : 3,
                        "indexSize" : 24528,
                        "fileSize" : 201326592,
                        "nsSizeMB" : 16,
                        "ok" : 1
                },
                "dm_hp:27001" : {
 
                }
        },
        "ok" : 0,
        "errmsg" : "{ dm_hp:27001: \"result without error message returned : {}\" }"
}
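
The output requested above can be sketched client-side. This is a minimal illustration in plain JavaScript, not the mongos implementation: markUnreachable is a hypothetical helper name, and treating an empty per-shard document as "Unreachable" is an assumption taken from the suggestion in this description (an empty reply could have other causes).

```javascript
// Hypothetical helper, not part of mongos or the shell.
// Takes a command result shaped like the dbstats output above and returns
// a copy of "raw" in which every empty per-shard document is replaced by
// the string "Unreachable" (an assumption: empty reply == shard down).
function markUnreachable(result) {
    var raw = result.raw || {};
    var summary = {};
    Object.keys(raw).forEach(function (shard) {
        var reply = raw[shard];
        var isEmpty = reply === null || typeof reply !== "object" ||
                      Object.keys(reply).length === 0;
        summary[shard] = isEmpty ? "Unreachable" : reply;
    });
    return summary;
}

// Trimmed-down version of the result shown in this description.
var sample = {
    raw: {
        "s0/dm_hp:27000,dm_hp:27018": { db: "test", collections: 3, ok: 1 },
        "dm_hp:27001": {}
    },
    ok: 0
};

var marked = markUnreachable(sample);
console.log(JSON.stringify(marked, null, 2));
```

The healthy shard's document passes through unchanged, so only the failing entries change shape.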



 Comments   
Comment by Gregory McKeon (Inactive) [ 21/Mar/18 ]

We've since rewritten this code path, and the error message has improved.

Comment by Kaloian Manassiev [ 07/Nov/16 ]

With the changes to move mongos towards commands in version 3.2 and higher, the socket errors around createIndex and other administrative commands have been improved so they will at least show the offending host for ease of investigation:

mongos> db.coll.ensureIndex({ Key: 1 });
{
        "raw" : {
                "kaloianmdesktop:20000" : {
 
                }
        },
        "ok" : 0,
        "errmsg" : "{ kaloianmdesktop:20000: \"result without error message returned : {}\" }"
}

I am leaving this ticket open for us to improve the messaging and get rid of the confusing "result without error message returned".

Comment by Kevin J. Rice [ 15/Mar/13 ]

We run with ~50 shards. Compact/helpful err msgs would be nice, esp. if message is not 'unreachable'.
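
One way a compact message for many shards might look is to group the failing shards by reason instead of printing one entry per shard. The sketch below is plain JavaScript and purely illustrative: summarizeShardErrors is a hypothetical name, the sample hosts and the "not master" message are made up for the example, and labelling an empty reply as "unreachable" is an assumption.

```javascript
// Hypothetical helper, not an existing mongos or shell function.
// Groups failing shards by reason so that many shards with the same
// problem collapse into a single clause of the message.
function summarizeShardErrors(raw) {
    var byReason = {};
    Object.keys(raw).forEach(function (shard) {
        var reply = raw[shard] || {};
        if (reply.ok === 1) {
            return; // healthy shard, nothing to report
        }
        // Assumption: a reply without errmsg means the shard was unreachable.
        var reason = reply.errmsg || "unreachable (empty reply)";
        if (!byReason[reason]) {
            byReason[reason] = [];
        }
        byReason[reason].push(shard);
    });
    return Object.keys(byReason).map(function (reason) {
        var shards = byReason[reason];
        return shards.length + " shard(s) " + reason + ": " + shards.join(", ");
    }).join("; ");
}

// Made-up sample: one healthy shard, two empty replies, one explicit error.
var sampleRaw = {
    "hostA:27000": { ok: 1 },
    "hostB:27001": {},
    "hostC:27002": {},
    "hostD:27003": { ok: 0, errmsg: "not master" }
};

var summaryLine = summarizeShardErrors(sampleRaw);
console.log(summaryLine);
```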

Comment by auto [ 04/Oct/12 ]

Author: Tad Marshall (tad@10gen.com)
Date: 2012-10-04T13:08:31-07:00

Message: SERVER-7220 fix Linux compile
Branch: master
https://github.com/mongodb/mongo/commit/0ad42973ebe9b3b184e7c51fa753a4e45e2dbb55

Comment by auto [ 04/Oct/12 ]

Author: Dwight (dwight@10gen.com)
Date: 2012-10-04T10:31:40-07:00

Message: SERVER-7220 only an incremental improvement so keep ticket open. add to
$err object a shard: field in certain cases to help user know where the
problem context is. q: Is it ok to add such a field to the $err object?
Branch: master
https://github.com/mongodb/mongo/commit/6e692daaa3b6a37aae62d8d959a8efd1ecb96a94
