Very slow change stream pipeline when using large `ns` filters

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: 7.0.26, 8.0.16
    • Component/s: None

      Full steps here: https://github.com/powersync-ja/powersync-service/pull/417

      Essentially:

      1. Get a resumeToken.
      2. Generate a large set of updates in some database (anything other than `test`).
      3. Test a change stream on the `test` db using this command in mongosh:

      db.runCommand({
        aggregate: 1,
        pipeline: [
          {
            $changeStream: {
              // Use the resume token from step 1 here
              resumeAfter: { _data: "826928363C000000022B0429296E1404" }
            }
          },
          {
            $match: {
              'ns.db': 'test',
              'ns.coll': {
                $in: [
                  1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
                  18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
                  33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
                  48, 49, 50
                ].map(i => `test${i}`)
              }
            }
          }
        ],
        cursor: {}
      });


      For my use case, I need to open a single change stream for many individual collections within a cluster - can be as many as 50-100. These collections may or may not be within the same database.

      The obvious solution is to use a filter in the style `{$match: {ns: {$in: [{db: 'db1', coll: 'a'}, {db: 'db1', coll: 'b'}, ...]}}}`. However, with a large number of collections, this significantly slows down change stream processing on the server, which is not expected.
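      Building this form of filter programmatically might look like the following sketch; the database and collection names are illustrative, not from my actual setup:

```javascript
// Hypothetical target collections spanning multiple databases.
const targets = [
  { db: 'db1', coll: 'a' },
  { db: 'db1', coll: 'b' },
  { db: 'db2', coll: 'c' },
];

// The "ns $in" form: one $in over full namespace documents.
// With 50-100 entries, this is the form that slows down the server.
const nsInMatch = { $match: { ns: { $in: targets } } };
```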

      A workaround for a single database is to change the filter to the style `{$match: {'ns.coll': {$in: ['a', 'b', ...]}}}`, and there is likely an equivalent for multiple databases, but ideally the `ns $in` filter should achieve the same level of performance. In my tests with 50 collections, the workaround version scans through the oplog 10x faster than the `ns $in` version.
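      One plausible multi-database equivalent (an assumption on my part, not verified against the server's internal rewrite) is to group the targets by database and `$or` together one `ns.coll $in` clause per database:

```javascript
// Hypothetical target collections; names are illustrative.
const targets = [
  { db: 'db1', coll: 'a' },
  { db: 'db1', coll: 'b' },
  { db: 'db2', coll: 'c' },
];

// Group collection names by database.
const byDb = new Map();
for (const { db, coll } of targets) {
  if (!byDb.has(db)) byDb.set(db, []);
  byDb.get(db).push(coll);
}

// One {'ns.db': ..., 'ns.coll': {$in: [...]}} branch per database,
// mirroring the fast single-database workaround.
const matchStage = {
  $match: {
    $or: [...byDb].map(([db, colls]) => ({
      'ns.db': db,
      'ns.coll': { $in: colls },
    })),
  },
};
```

The number of `$or` branches then scales with the number of databases rather than the number of collections.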

      A specific issue that comes up is that the processing is so slow that in some cases the aggregation command takes longer than my socket timeout of 60s, resulting in a state that is essentially irrecoverable without restarting the change stream from a new point. Since no response is returned when this happens, I don't even get a newer resumeToken to start the next request from.

      The issue appears to be in how the underlying pipeline on the oplog is constructed. With the `ns $in` form, filtering on 50 collections results in an `$or` clause with 350+ checks that are all evaluated on every single oplog entry. The `ns.coll $in` form results in a pipeline that is much more efficient.

      Although there are some differences in the oplog pipeline, the same issue is present on both MongoDB 7.0 and 8.0.

      I wrote up more details here, including the use case, how to trigger the really slow cases, and `explain` output: https://github.com/powersync-ja/powersync-service/pull/417

       

            Assignee:
            Unassigned
            Reporter:
            Ralf Kistner
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue
