  1. Core Server
  2. SERVER-41196

Mongos invariant failure crash with change streams startAfter

    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v4.2, v4.0
    • Sprint: Query 2019-06-17, Query 2019-07-01, Query 2019-07-15

      Mongos crashes with this invariant failure:

      2019-05-16T16:29:55.522-0700 F  -        [TaskExecutorPool-0] Invariant failure compareSortKeys(newMinSortKey, *oldMinSortKey, *_params.getSort()) >= 0 src/mongo/s/query/async_results_merger.cpp 525
      2019-05-16T16:29:55.522-0700 F  -        [TaskExecutorPool-0] 
      
      ***aborting after invariant() failure
      
      
      2019-05-16T16:29:55.542-0700 F  -        [TaskExecutorPool-0] Got signal: 6 (Abort trap: 6).
       0x1027b0eb9 0x1027b076d 0x7fff5be3cf5a 0x7fff5bcd83c6 0x7fff5bbda1ae 0x1027a46b2 0x101c8624c 0x101c81ece 0x101c8696b 0x101c8645a 0x101c8905a 0x101b2daf0 0x101df7af4 0x101df63b9 0x101df83ec 0x101df9f2e 0x101df9c09 0x101dfa299 0x101df5e35 0x101df7869 0x101e123a4 0x101e12838 0x101b21b6a 0x101e11e1c 0x101b21b6a 0x101e14cb2 0x101e149c0 0x101e15088 0x101b21b6a 0x101e0d3f1 0x101b21b6a 0x101e63186 0x101b21b6a 0x101e62c81 0x101b21b6a 0x101e62480 0x101b21b6a 0x101e61d71 0x101b21b6a 0x101e52381 0x101b21b6a 0x101e40640 0x101b21b6a 0x101e3ff0a 0x101e4161a 0x101e414c9 0x10215c421 0x102151d32 0x102151ba9 0x101e57a74 0x101e07398 0x101e0db1f 0x7fff5be46661 0x7fff5be4650d 0x7fff5be45bf9
      ----- BEGIN BACKTRACE -----
      ...
      

      The full mongos log is attached as mongos-python-1720-crash.log.
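
      For readers unfamiliar with the failing check: the expression in the log reads as asserting that a newly reported minimum sort key never compares lower than the one recorded before it. The sketch below is a toy model of that kind of monotonicity check, not the code in async_results_merger.cpp. The class MinSortKeyTracker, the simplified two-argument compareSortKeys, and the numeric keys are all hypothetical stand-ins.

```javascript
// Toy model only: names and numeric keys are hypothetical, not mongos internals.
function compareSortKeys(a, b) {
  // Negative if a sorts before b, zero if equal, positive if a sorts after b.
  return a < b ? -1 : a > b ? 1 : 0;
}

class MinSortKeyTracker {
  constructor() {
    this.oldMinSortKey = null; // nothing recorded yet
  }

  // Record a new minimum sort key, failing if it moves backwards.
  update(newMinSortKey) {
    if (this.oldMinSortKey !== null &&
        compareSortKeys(newMinSortKey, this.oldMinSortKey) < 0) {
      throw new Error(
        `Invariant failure: ${newMinSortKey} sorts before ${this.oldMinSortKey}`);
    }
    this.oldMinSortKey = newMinSortKey;
  }
}

const tracker = new MinSortKeyTracker();
tracker.update(10); // first key: accepted
tracker.update(12); // moves forward: accepted

let failure = null;
try {
  tracker.update(11); // moves backwards: trips the check
} catch (e) {
  failure = e.message;
}
console.log(failure);
```

      In this model a regressing key raises an ordinary exception; in the report above the equivalent check is an invariant(), so the whole mongos process aborts.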

      The reproduction is fairly simple:

      1. Start a change stream on collection "x", drop "x", and obtain the invalidate change stream document's resume token
      2. Start a new change stream passing the previous resume token to startAfter.
      3. Run a getMore on the new change stream; mongos crashes.

      In code:

      MongoDB Enterprise mongos> var cs = db.test.watch([{'$match': {'operationType': 'invalidate'}}]);
      MongoDB Enterprise mongos> db.test.insertOne({});
      {
      	"acknowledged" : true,
      	"insertedId" : ObjectId("5cddf66ab38812eee363f373")
      }
      MongoDB Enterprise mongos> db.test.drop();
      true
      MongoDB Enterprise mongos>
      MongoDB Enterprise mongos> var doc = cs.next();
      MongoDB Enterprise mongos> var resume_token = doc['_id'];
      MongoDB Enterprise mongos>
      MongoDB Enterprise mongos> var cs = db.test.watch([], {startAfter: resume_token});
      MongoDB Enterprise mongos> cs.next();
      2019-05-16T16:46:51.624-0700 E  QUERY    [js] uncaught exception: Error: error doing query: failed: network error while attempting to run command 'getMore' on host '127.0.0.1:27018'  :
      DB.prototype.runCommand@src/mongo/shell/db.js:170:23
      DBCommandCursor.prototype._runGetMoreCommand@src/mongo/shell/query.js:803:18
      DBCommandCursor.prototype._hasNextUsingCommands@src/mongo/shell/query.js:836:9
      DBCommandCursor.prototype.hasNext@src/mongo/shell/query.js:844:16
      DBCommandCursor.prototype.next@src/mongo/shell/query.js:863:14
      @(shell):1:1
      2019-05-16T16:46:51.626-0700 I  NETWORK  [js] trying reconnect to 127.0.0.1:27018 failed
      2019-05-16T16:46:51.626-0700 I  NETWORK  [js] reconnect 127.0.0.1:27018 failed failed
      2019-05-16T16:46:51.629-0700 I  NETWORK  [js] trying reconnect to 127.0.0.1:27018 failed
      2019-05-16T16:46:51.630-0700 I  NETWORK  [js] reconnect 127.0.0.1:27018 failed failed
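
      The interactive session above can also be written as a script. This is a sketch, assuming the same shell cursor methods used in the session (watch, hasNext, next); the function name reproduceStartAfterCrash and the polling limit maxPolls are hypothetical. Pass it the shell's db object while connected to a mongos running an affected build.

```javascript
// Sketch of the reproduction as a function; the name and maxPolls are hypothetical.
function reproduceStartAfterCrash(db, maxPolls = 10) {
  // Step 1: watch the collection, keeping only invalidate events.
  const first = db.test.watch([{$match: {operationType: 'invalidate'}}]);
  db.test.insertOne({});
  db.test.drop(); // dropping the collection invalidates the stream

  // Poll until the invalidate event arrives, then keep its resume token.
  let resumeToken = null;
  for (let i = 0; i < maxPolls && resumeToken === null; i++) {
    if (first.hasNext()) {
      resumeToken = first.next()._id;
    }
  }
  if (resumeToken === null) {
    throw new Error('no invalidate event observed');
  }

  // Step 2: open a new stream that starts after the invalidate event.
  const second = db.test.watch([], {startAfter: resumeToken});

  // Step 3: hasNext() issues the getMore that crashed mongos in this report.
  return second.hasNext();
}
```

      On an affected mongos the final call is expected to fail with the network error shown above rather than return.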
      

      I'm using version:

      mongos version v4.1.11-61-g7e1682c
      git version: 7e1682c579f0b719fd4988e04b9b63eea0ebd03c
      allocator: system
      modules: enterprise
      build environment:
          distarch: x86_64
          target_arch: x86_64
      

            Assignee: Bernard Gorman (bernard.gorman@mongodb.com)
            Reporter: Shane Harvey (shane.harvey@mongodb.com)