Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-14993

Mongos does not abort non-responsive update operations on primary change

    • Type: Icon: Bug Bug
    • Resolution: Done
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: 2.6.4
    • Component/s: Sharding
    • Labels:
      None
    • Sharding
    • ALL

      Hi
      During some lab testing to measuring the failover response time and behaviour, we created diferent scenarios:
      1) Graceful failover (replSetStepDown)
      2) Process aborted (killed externally in this simulation by task manager)
      3) System outage (removing network cable)

      We were using a 2 node replica set with an extra arbitrer and a mongos on top of them (servers were started with --shardsvr and --replset). Mongos was running in a linux box, while the mongod were running in windows, all using 2.6.4.

      In scenarios 1 and 2, updates were aborted with errors, and started working once the new primary was elected.

      In the 3rd scenario the client gets stuck in the update command (using c# driver latest stable version). we could overcome the issue by setting the socketTimeoutMS option in the connection string, but in principle the preferred behaviour is the one shown in scenarios 1 and 2.

      Test has been repeted several times to avoid mistakes.

      As a side note times were: 3, 16 and 23 seconds, for the three scenarios.

            Assignee:
            backlog-server-sharding [DO NOT USE] Backlog - Sharding Team
            Reporter:
            jlpedrosa@gmail.com Jose Luis Pedrosa
            Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

              Created:
              Updated:
              Resolved: