Uploaded image for project: 'Node.js Driver'
  1. Node.js Driver
  2. NODE-2271

watch API changeStream not notified of closed connection

    • Type: Icon: Bug Bug
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • 3.5.0
    • Affects Version/s: 3.3.2
    • Component/s: None
    • Environment:
      Ubuntu 18.04.3 LTS

      I have been monitoring a peek-a-boo issue on servers that are relying on `db.watch` API. After an arbitrary period of time the `changeStream` will stop receiving events and none of the following events are fired [ever]:

      const changeStream = db.watch()
      ...
      changeStream.on('error', event => ...)
      changeStream.on('end', event => ...)
      changeStream.on('close', event => ...)
      

      Until recently, I did not have any logs to correlate the reason for the changeStream to stop receiving events so I attempted a band-aid fix arbitrary "disconnects" with:

      setInterval(() => {
        if (changeStream.isClosed())
          throw new Error('changeStream is closed')
      }, 60000)
      

      When the changeStream has stopped receiving events, the above interval call to changeStream.isClosed() is still returning false (thinking the changeStream is still open).

      I discovered (through the help of Sentry error logging) that another server (API web-service) has received this error roughly at the time of the last event received on the above changeStream:

      MongoTimeoutError: Server selection timed out after 30000 ms
       File "/srv/ws-core/node_modules/mongodb/lib/core/sdam/topology.js", line 798, col 7, in selectServers
       new MongoTimeoutError(`Server selection timed out after ${timeout} ms`),
       File "/srv/ws-core/node_modules/mongodb/lib/core/sdam/topology.js", line 872, col 7, in NativeTopology.descriptionChangedHandler
       selectServers(topology, selector, timeout, start, callback);
       File "events.js", line 277, col 13, in Object.onceWrapper
       File "events.js", line 194, col 15, in NativeTopology.emit
       File "domain.js", line 441, col 20, in NativeTopology.EventEmitter.emit
       File "/srv/ws-core/node_modules/mongodb/lib/core/sdam/topology.js", line 524, col 10, in NativeTopology.serverUpdateHandler
       this.emit(
       File "events.js", line 189, col 13, in Server.emit
       File "domain.js", line 441, col 20, in Server.EventEmitter.emit
       File "/srv/ws-core/node_modules/mongodb/lib/core/sdam/monitoring.js", line 182, col 12, in successHandler
       server.emit('descriptionReceived', new ServerDescription(server.description.address, isMaster));
       File "/srv/ws-core/node_modules/mongodb/lib/core/sdam/monitoring.js", line 195, col 7, in checkServer
       successHandler(isMaster);
      

      Because of the business critical logic, this stoppage in changeStream events is recognized within a few hours triggering investigation into why - it has always resulted in the requirement to restart the Node.js app that is creating the changeStream.

      Expected Behavior

      1. I expect an event (`error`, `close`, `timeout`, etc) to be fired when a changeStream is no longer "connected".
      2. I expect `changeStream.isClosed()` to be accurate every time it is called.

            Assignee:
            matt.broadstone@mongodb.com Matt Broadstone
            Reporter:
            cory@therms.io cory robinson
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated:
              Resolved: