- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: 7.0.26, 8.0.16
- Component/s: None
- Operating System: ALL
For my use case, I need to open a single change stream covering many individual collections within a cluster, potentially as many as 50-100. These collections may or may not be in the same database.
The obvious solution is a filter of the form `{$match: {ns: {$in: [{db: 'db1', coll: 'a'}, {db: 'db1', coll: 'b'}, ...]}}}`. However, with a large number of collections, this significantly slows down change stream processing on the server, which is not expected.
A workaround for a single database is to change the filter to the form `{$match: {'ns.coll': {$in: ['a', 'b', ...]}}}`, and there is likely an equivalent for multiple databases, but ideally the `ns $in` filter should achieve the same level of performance. In my tests with 50 collections, the `ns.coll $in` version scans through the oplog 10x faster than the `ns $in` version.
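One possible multi-database equivalent of the workaround (an untested sketch, grouping collections per database and combining the branches with `$or`; names are placeholders):

```python
from collections import defaultdict

targets = [("db1", "a"), ("db1", "b"), ("db2", "c")]  # placeholder names

by_db = defaultdict(list)
for db, coll in targets:
    by_db[db].append(coll)

# One branch per database: match the db exactly, and its collections via $in
# on the `ns.coll` field rather than whole-document equality on `ns`.
ns_coll_match = {
    "$match": {
        "$or": [
            {"ns.db": db, "ns.coll": {"$in": colls}}
            for db, colls in by_db.items()
        ]
    }
}
```

This keeps the number of `$or` branches equal to the number of databases rather than the number of collections.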
A specific issue that comes up is that processing is so slow that the aggregation command can take longer than my socket timeout of 60s, a failure that is essentially irrecoverable without restarting the change stream from a new point. Since no response is returned when this happens, I do not even get a newer resumeToken to start the next request from.
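A sketch of the resume logic this failure mode interacts with: persist the last seen resume token and retry from it after a timeout. The event source here is abstracted as a hypothetical `open_stream(resume_after)` callable (with pymongo this would wrap `watch(..., resume_after=token)`); each event carries its token under `_id`, as MongoDB change events do:

```python
def consume_with_resume(open_stream, handle_event, saved_token=None):
    """Consume change events, remembering the last resume token so a
    restart can continue from it. `open_stream(resume_after)` must return
    an iterable of change events, each carrying its token under "_id"."""
    token = saved_token
    while True:
        try:
            for event in open_stream(token):
                handle_event(event)
                token = event["_id"]  # in practice, persist this durably
        except TimeoutError:
            # On a socket timeout mid-request, no new token is returned,
            # so the only option is retrying from the last token we had.
            if token is None:
                raise  # no safe resume point at all
            continue
        break
```

The problem described above is that the retry resumes from the *old* token, so the slow aggregation simply times out again, with no forward progress.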
The issue appears to be in how the underlying pipeline on the oplog is constructed. With the `ns $in` form, filtering on 50 collections results in an `$or` clause with 350+ predicates, all of which are evaluated against every single oplog entry. The `ns.coll $in` form results in a much more efficient pipeline.
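One way to inspect the rewritten oplog pipeline is to explain the change stream aggregation. A sketch of the command document (names are placeholders; with pymongo this would be run via `client["db1"].command(explain_cmd)` against a live deployment and the returned plan inspected):

```python
# Explain the change stream aggregation to see how the $match stage is
# rewritten into the internal oplog scan predicate.
match_stage = {"$match": {"ns": {"$in": [{"db": "db1", "coll": "a"}]}}}

explain_cmd = {
    "explain": {
        "aggregate": "a",  # target collection (placeholder)
        "pipeline": [{"$changeStream": {}}, match_stage],
        "cursor": {},
    },
    "verbosity": "queryPlanner",
}
```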
Although there are some differences in the oplog pipeline, the same issue is present on both MongoDB 7.0 and 8.0.
I wrote up more details here, including the use case, how to trigger the really slow cases, and the explain output: https://github.com/powersync-ja/powersync-service/pull/417