  1. Node.js Driver
  2. NODE-2148

Parallel/Concurrent Calls to ReplicaSet fail when the replica-set primary node steps down

    • Type: Bug
    • Resolution: Gone away
    • Priority: Major - P3
    • Affects Version/s: 3.2.7
    • Component/s: None
    • Environment:
      Mongoose v5.6.12
      Node.js v8 and v12

      Currently, I am connected to a 3-node replica set and am performing a large number of read/insert operations on the replica set in parallel. These calls can be performed in two possible ways:

      1. Sequential: The second call waits for the first call to finish.
      2. Parallel: All the calls are independent of each other, so they execute in parallel and are finally awaited together via Promise.all.
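
      To make the two modes concrete, here is a minimal sketch. `fakeCall` is a hypothetical stand-in for a single find/insert against the replica set, not a driver or Mongoose API.

```javascript
// Hypothetical stand-in for one read/insert call; resolves after a short delay.
function fakeCall(id) {
  return new Promise((resolve) => setTimeout(() => resolve(id), 5));
}

// Sequential: each call starts only after the previous one has finished.
async function runSequential(ids) {
  const results = [];
  for (const id of ids) {
    results.push(await fakeCall(id));
  }
  return results;
}

// Parallel: all calls are started up front, then awaited together.
function runParallel(ids) {
  return Promise.all(ids.map((id) => fakeCall(id)));
}
```

      In the parallel form every call is already in flight when Promise.all starts waiting, which is the situation this report is about.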

      The aim of the task is to implement retry strategies that handle the errors which occur while the primary node of the replica set steps down, as described at https://docs.mongodb.com/manual/reference/method/rs.stepDown/#writes-during-stepdown

       

      The strategy is based on the retry strategies described in the article at https://emptysqua.re/blog/how-to-write-resilient-mongodb-applications/
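
      A minimal sketch of one such strategy: retry an operation once when it fails with an error deemed retryable. The `isRetryable` predicate and the conditions it checks are assumptions for illustration; the exact errors raised during a stepdown depend on the driver version.

```javascript
// Assumption: which errors count as retryable; adjust to what your driver raises.
function isRetryable(err) {
  return Boolean(err) &&
    (err.name === 'MongoNetworkError' || /not master/i.test(err.message || ''));
}

// Run `operation` (a function returning a promise) and retry it once
// if the first attempt fails with a retryable error.
async function retryOnce(operation) {
  try {
    return await operation();
  } catch (err) {
    if (!isRetryable(err)) throw err;
    return operation(); // second and final attempt; its error propagates
  }
}
```

      A blind retry like this is only safe when the operation is idempotent, since the first attempt may have been applied before the error was reported.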

       

      The issue comes up once the client is connected and all the requests are fired in parallel, say about 1000 requests. If the primary node is brought down with rs.stepDown before the calls are sent, or while they are in flight, and all the calls are being awaited in Promise.all([call1, call2, call3]) with errors handled according to the retry strategy, then only the first call returns an error that can be handled. None of the remaining calls throw an error; they stay stuck at the original call statement.
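
      Part of this is how Promise.all behaves in general: it rejects as soon as one input rejects and never reports on the other inputs, so a call that never settles stays invisible. The sketch below reproduces that with plain promises; `failingCall`, `hungCall` and `stateOf` are hypothetical helpers, not driver calls.

```javascript
// Hypothetical stand-ins: one call rejects, the other never settles.
const failingCall = () => Promise.reject(new Error('not master'));
const hungCall = () => new Promise(() => {}); // neither resolves nor rejects

// Resolves to 'pending' if `promise` has not settled within 10 ms.
function stateOf(promise) {
  const pending = new Promise((resolve) => setTimeout(resolve, 10, 'pending'));
  const settled = promise.then(() => 'fulfilled', () => 'rejected');
  return Promise.race([settled, pending]);
}

async function demo() {
  const hung = hungCall();
  // Promise.all surfaces only the first rejection.
  const firstError = await Promise.all([failingCall(), hung]).catch((err) => err.message);
  // The other call is still pending and was never reported.
  return { firstError, hungState: await stateOf(hung) };
}
```

      Here `demo()` resolves with the first error message while the second call is still pending, which matches the symptom described above.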

       

      The issue was originally reported on the Mongoose GitHub page at https://github.com/Automattic/mongoose/issues/8127 but, on further debugging and reading the source, I arrived at the following statement:

       

      pool.write(message, commandOptions, commandResponseHandler);
      

      in node_modules/mongodb-core/lib/wireprotocol/command.js, line 94

      This statement neither throws an error nor calls the responseHandler in the following case:

      • You perform about 500 find/insert calls in parallel and wait for them to resolve via Promise.all.
      • While the calls are being performed, you ask the primary node of the replica set to step down.
      • The code then comes to a standstill and the entire execution halts, since none of the calls except the very first one throws an error.
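
      As a defensive measure on the caller's side, each operation can be bounded by a timer so that a callback that never fires surfaces as an error the retry logic can handle. This is a sketch of a workaround under stated assumptions, not a fix for the driver: `withTimeout`, `silentWrite` and `guardedWrite` are hypothetical names, and `silentWrite` merely imitates the behaviour described above.

```javascript
// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error('operation timed out after ' + ms + ' ms')),
      ms
    );
  });
  return Promise.race([promise, timeout]).then(
    (value) => { clearTimeout(timer); return value; },
    (error) => { clearTimeout(timer); throw error; }
  );
}

// Hypothetical imitation of a write that neither throws nor invokes its callback.
function silentWrite(message, callback) {
  // intentionally does nothing
}

// Promise wrapper around the callback-style call, guarded by the timeout.
function guardedWrite(message, ms) {
  const pending = new Promise((resolve, reject) => {
    silentWrite(message, (err, result) => (err ? reject(err) : resolve(result)));
  });
  return withTimeout(pending, ms);
}
```

      With such a guard, every stuck call rejects individually after `ms` milliseconds instead of hanging forever. The timed-out operation may still have reached the server, so the same idempotency caveat applies before retrying.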

       

            Assignee: Matt Broadstone (matt.broadstone@mongodb.com)
            Reporter: Manish Demblani (manish@wakecap.com)
            Votes: 0
            Watchers: 2
