- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Query Integration
- ALL
The scenario is quite specific and hard to reproduce. We have not yet reproduced it, but concluded by code inspection that the following bug must exist.
Setup:
- Collections 'c' and 'target' exist and at least 'target' is sharded.
- The sharded cluster has a single shard.
- A user issues a query similar to the following:
c.aggregate([{$unionWith: {coll: 'target', pipeline: [{$search: {}}]}}])
- While the query is running, the following happen in parallel. For these steps to overlap with the query, it probably has to be a particularly slow query, though nothing guarantees that any query like this is safe from observing the events below:
- A second shard is created and added to the cluster.
- One or more chunks from 'target' are migrated to the second shard.
- The migrated chunks' data is deleted from the original shard.
If that query runs long enough, it will start to miss results from the migrated chunk(s) of 'target'.
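The race can be illustrated with a toy model (plain JavaScript, not MongoDB internals; all names here are hypothetical): a slow scan iterates the original shard's live data while a concurrent migration moves a chunk to a second shard and the range deleter removes it from the donor, so the scan never sees those documents.

```javascript
// Toy model of the race. Shard A initially owns documents _id 0..9.
// A "slow query" scans shard A only (it was targeted when the cluster
// had a single shard) and holds no ownership-filter snapshot.
function* slowScan(shardData) {
  for (let i = 0; i < shardData.length; i++) {
    yield shardData[i];
  }
}

const shardA = Array.from({ length: 10 }, (_, i) => ({ _id: i }));
const shardB = [];

const scan = slowScan(shardA);
const seen = [];

// The query reads the first three documents...
for (let i = 0; i < 3; i++) seen.push(scan.next().value._id);

// ...then the migration commits: the chunk covering _id >= 5 moves to
// shard B, and the range deleter removes it from shard A.
for (let i = shardA.length - 1; i >= 0; i--) {
  if (shardA[i]._id >= 5) shardB.push(...shardA.splice(i, 1));
}

// The query resumes on shard A only and never visits shard B.
for (const doc of scan) seen.push(doc._id);

console.log(seen); // documents with _id 5..9 are missing from the result
```

This is only a sketch of the timing, not of the server's cursor machinery; the point is that nothing pins the migrated range's data on the donor shard for the lifetime of the query.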
A quick note: this seems like a query correctness issue. However, this kind of problem is possible with any normal $search in the face of concurrent chunk migrations. (TODO separate ticket - couldn't find one, but documented here a bit: https://github.com/10gen/mongot/blob/master/docs/consistency/read-isolation-consistency-recency.md#sharded-cluster)
This would be preventable if the logic that constructs the sub-pipeline instantiated a ScopedCollectionFilter and kept it alive in the pipeline for the duration of execution. At the time of this writing, that would be something like adding this line to the body of this helper.
If that theory is correct, the fix is easy but testing this is going to be quite tricky.
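To illustrate why holding the filter would help (again a toy model, not the actual ScopedCollectionFilter API; ScopedFilter, registry, and rangeDelete are invented names): acquiring a scoped reference at sub-pipeline construction forces the range deleter to defer removal of the migrated chunk until the query releases it.

```javascript
// Toy model of the proposed fix: a scoped filter acquired when the
// sub-pipeline is built holds a reference that the range deleter must
// wait on before deleting migrated documents from the donor shard.
class ScopedFilter {
  constructor(registry) { this.registry = registry; registry.refs++; }
  release() { this.registry.refs--; }
}

const registry = { refs: 0, pendingDeletions: [] };

function rangeDelete(shardData, predicate) {
  if (registry.refs > 0) {
    // A query still holds the filter: defer this deletion.
    registry.pendingDeletions.push(() => rangeDelete(shardData, predicate));
    return;
  }
  for (let i = shardData.length - 1; i >= 0; i--) {
    if (predicate(shardData[i])) shardData.splice(i, 1);
  }
}

const shardA = Array.from({ length: 10 }, (_, i) => ({ _id: i }));

// Sub-pipeline construction acquires the filter for the query's lifetime.
const filter = new ScopedFilter(registry);

// Migration commits and tries to delete the moved chunk; it is deferred.
rangeDelete(shardA, (d) => d._id >= 5);
console.log(shardA.length); // still 10: data pinned for the running query

// The query finishes; the filter is released and deferred deletions run.
filter.release();
registry.pendingDeletions.splice(0).forEach((fn) => fn());
console.log(shardA.length); // now 5: the migrated range is finally removed
```

In the real server the analogous effect comes from the sharding metadata the filter keeps alive, which the range deleter waits on before cleaning up orphaned documents.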
Is related to:
- SERVER-96412: tassert tripped on 1-shard sharded $unionWith + $search (In Code Review)