Very slow change stream pipeline when using large `ns` filters

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: 7.0.26, 8.0.16
    • Component/s: None

      Full steps here: https://github.com/powersync-ja/powersync-service/pull/417

      Essentially:

      1. Get a resumeToken.
      2. Generate a large set of updates in some database (anything other than `test`).
      3. Test a change stream on the `test` db using this command in mongosh:

      db.runCommand({
        aggregate: 1,
        pipeline: [
          {
            $changeStream: {
              // Use the resume token from step 1 here
              resumeAfter: { _data: "826928363C000000022B0429296E1404" }
            }
          },
          {
            $match: {
              'ns.db': 'test',
              'ns.coll': {
                $in: [
                  1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
                  18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
                  33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
                  48, 49, 50
                ].map(i => `test${i}`)
              }
            }
          }
        ],
        cursor: {}
      });


      For my use case, I need to open a single change stream for many individual collections within a cluster - can be as many as 50-100. These collections may or may not be within the same database.

      The obvious solution is to use a filter in the style `{$match: {ns: {$in: [{db: 'db1', coll: 'a'}, {db: 'db1', coll: 'b'}, ...]}}}`. However, with a large number of collections, this significantly slows down change stream processing on the server, which is not expected.
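      Building this form of filter programmatically might look like the following sketch; the database and collection names are illustrative, not from my actual setup:

```javascript
// Hypothetical target collections spanning multiple databases.
const targets = [
  { db: 'db1', coll: 'a' },
  { db: 'db1', coll: 'b' },
  { db: 'db2', coll: 'c' },
];

// The "ns $in" form: one $in over full namespace documents.
// With 50-100 entries, this is the form that slows down the server.
const nsInMatch = { $match: { ns: { $in: targets } } };
```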

      A workaround for a single database is to change the filter to the style `{$match: {'ns.coll': {$in: ['a', 'b', ...]}}}`, and there is likely an equivalent for multiple databases, but ideally the `ns $in` filter should achieve the same level of performance. In my tests with 50 collections, the workaround version scans through the oplog 10x faster than the `ns $in` version.
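      One plausible multi-database equivalent (an assumption on my part, not verified against the server's internal rewrite) is to group the targets by database and `$or` together one `ns.coll $in` clause per database:

```javascript
// Hypothetical target collections; names are illustrative.
const targets = [
  { db: 'db1', coll: 'a' },
  { db: 'db1', coll: 'b' },
  { db: 'db2', coll: 'c' },
];

// Group collection names by database.
const byDb = new Map();
for (const { db, coll } of targets) {
  if (!byDb.has(db)) byDb.set(db, []);
  byDb.get(db).push(coll);
}

// One {'ns.db': ..., 'ns.coll': {$in: [...]}} branch per database,
// mirroring the fast single-database workaround.
const matchStage = {
  $match: {
    $or: [...byDb].map(([db, colls]) => ({
      'ns.db': db,
      'ns.coll': { $in: colls },
    })),
  },
};
```

The number of `$or` branches then scales with the number of databases rather than the number of collections.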

      A specific issue that comes up is that the processing is so slow that in some cases the aggregation command takes longer than my socket timeout of 60s, resulting in a state that is essentially irrecoverable without restarting the change stream from a new point. Since no response is returned when this happens, I don't even get a newer resumeToken to start the next request from.

      The issue appears to be in how the underlying pipeline on the oplog is constructed. With the `ns $in` form, filtering on 50 collections results in an `$or` clause with 350+ checks that are all evaluated on every single oplog entry. The `ns.coll $in` form results in a pipeline that is much more efficient.

      Although there are some differences in the oplog pipeline, the same issue is present on both MongoDB 7.0 and 8.0.

      I wrote up more details here, including the use case, how to trigger the really slow cases, and `explain` output: https://github.com/powersync-ja/powersync-service/pull/417

       

            Assignee:
            Unassigned
            Reporter:
            Ralf Kistner
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue
