SERVER-7008

socket exception [SEND_ERROR] on Mongo Sharding

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker - P1
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: Networking, Sharding
    • Labels:
      None
    • Environment:
      Debian Squeeze / Mongo 2.2.0
    • Operating System:
      ALL

      Description

      Hi, I sometimes get the following error, which blocks the whole system. After I restart the mongod service it works well again. This is not the first time I have seen this error.

      uncaught exception: getlasterror failed: {
      	"shards" : [
      		"set01/mongo07.luan.local:10011,mongo08.luan.local:10011"
      	],
      	"ok" : 0,
      	"errmsg" : "could not get last error from a shard set01/mongo07.luan.local:10011,mongo08.luan.local:10011 :: caused by :: socket exception [SEND_ERROR] for 172.16.18.7:10011"
      }

          Activity

          apiggott@ikanow.com Alex Piggott added a comment - - edited

          In case anyone was wondering, since I applied SERVER-9022 to my servers this problem has completely disappeared and I have seen no other ill effects. Thanks!

          apiggott@ikanow.com Alex Piggott added a comment -

          Hmm, I spoke too soon. I applied this fix at the start of February, and from early last week (i.e. mid-March) the same problem started recurring, spreading across more and more nodes over a few days, as before (32 nodes running mongos 2.4.9, 20 of which also run mongod across 10 shards).

          As before, a mass mongos restart fixed everything.

          It looks like you missed a less frequent case of the same sort of problem (before the fix, it happened every 1-2 weeks)?

          Should this issue be reopened, or SERVER-9022, or should I create a new issue?

          eliot Eliot Horowitz added a comment -

          Alex - can you open a new ticket (and link here) just to make sure we don't confuse two issues?

          apiggott@ikanow.com Alex Piggott added a comment -

          Created SERVER-13352

          vishal.katikineni@valuelabs.com Vishal Katikineni (Inactive) added a comment -

          We have a similar issue too even after setting the parameter releaseConnectionsAfterResponse=true.
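
          For anyone unfamiliar with the parameter named above, here is a minimal sketch of enabling it at runtime from a mongo shell connected to a mongos. It assumes a mongos build that supports releaseConnectionsAfterResponse; the variable name cmd is only illustrative. Outside the mongo shell, where db is not defined, the snippet only builds the command document and does nothing else.

```javascript
// Command document that asks the server to enable the parameter.
// Assumption: the connected mongos supports releaseConnectionsAfterResponse.
var cmd = { setParameter: 1, releaseConnectionsAfterResponse: true };

// `db` exists only inside the mongo shell; skip the call anywhere else.
if (typeof db !== "undefined") {
  // adminCommand runs the command against the admin database.
  printjson(db.adminCommand(cmd));
}
```

          The same setting can usually be given at startup with mongos --setParameter releaseConnectionsAfterResponse=true. As the comment above notes, enabling it did not remove the error in every deployment.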

