Core Server / SERVER-47549

Change streams can't keep causal consistency when moveChunk happens

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Querying
    • Labels: None

      I want to use a change stream to listen to all change events from a source sharded cluster and apply them to a target MongoDB deployment, so that the target behaves like a secondary of the source. However, after going through the source code, I think a change stream cannot keep causal consistency when a moveChunk happens. For a better explanation, I drew a picture, attached to this issue.
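
      To show the kind of replication loop I mean, here is a minimal sketch using pymongo; the connection strings, database and collection names are placeholders, not part of this report.

{code:python}
# Minimal sketch of tailing the source cluster's change stream and applying
# events to the target; connection strings and namespace are placeholders.
from pymongo import MongoClient

source = MongoClient("mongodb://mongos.example.net:27017")   # sharded source, via mongos
target = MongoClient("mongodb://target.example.net:27017")   # target deployment

with source["app"]["orders"].watch(full_document="updateLookup") as stream:
    for change in stream:
        op = change["operationType"]
        coll = target["app"]["orders"]
        if op in ("insert", "replace", "update"):
            doc = change.get("fullDocument")
            if doc is not None:
                # Upsert the post-image so the target converges to the source state.
                coll.replace_one(change["documentKey"], doc, upsert=True)
        elif op == "delete":
            coll.delete_one(change["documentKey"])
        # change["_id"] is the resume token; persist it to resume after restarts.
{code}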

      In this picture, there are two shards. mongos opens two cursors to fetch change stream events, which are a transformation of the oplog entries on each mongod. mongos then runs a merge stage that sorts the events by "ts + uuid + documentKey". Everything is fine even when a moveChunk happens, because the hybrid logical clock keeps causal consistency across shards. However, there is one corner case that may cause a bug: what if shard2's cursor is so much faster than shard1's that, in my moveChunk case, shard2 has already fetched past "ts=A2" while shard1 has not yet reached "ts=A1"? If shard2's event is emitted first and shard1 only catches up afterwards, the writes for key A are applied as "set a = 2" and then "set a = 1", so the target ends with a = 1, while the real final value is 2.
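
      To make the race concrete, here is a toy model of the merge (not server code; the shard names, timestamps and values are made up to mirror the picture above). Emitting events in arrival order inverts the two writes to key A, while a merge that waits for both cursors to pass the relevant cluster time keeps them in causal order:

{code:python}
import heapq

# Toy model of the mongos merge stage; events are (sort_key, op) where the
# sort key is (clusterTime, uuid, documentKey) as described above.
shard1_events = [((5, "uuid-A", "A"), "set a = 1")]   # earlier write, slow cursor
shard2_events = [((7, "uuid-A", "A"), "set a = 2")]   # later write, fast cursor

def emit_in_arrival_order():
    # The feared behaviour: shard2's batch is delivered first, so the
    # consumer applies a = 2 and then a = 1 and ends with the wrong value.
    return shard2_events + shard1_events

def emit_after_both_cursors_caught_up():
    # Once every shard cursor has advanced past clusterTime 7, a k-way
    # merge by sort key restores the causal order.
    return list(heapq.merge(shard1_events, shard2_events, key=lambda e: e[0]))

print([op for _, op in emit_in_arrival_order()])             # ['set a = 2', 'set a = 1']
print([op for _, op in emit_after_both_cursors_caught_up()]) # ['set a = 1', 'set a = 2']
{code}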

      There may be another case, not related to the previous example, that loses data:
      Since v3.6, MongoDB uses logical time for timestamps, which keeps causal consistency, so different mongos nodes may hold different cluster times. When opening a change stream, mongos uses its local logical time if no afterClusterTime option is given. Can this cause data loss? For example, suppose mongos1's time is 10:00, mongos2's is 10:02, shard1's is 09:59 and shard2's is 10:01. If a user sends the change stream command to mongos1, mongos1 sends the aggregate command to each shard starting from 10:00, and the shard1 oplog entries from 09:59 to 10:00 will be lost. I also found SERVER-31767 (https://jira.mongodb.org/browse/SERVER-31767); does that mean this problem has been solved since v4.1.1 by a global point in time?
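
      For reference, drivers can already pin the start point explicitly with startAtOperationTime (MongoDB 4.0+), so every shard scans from the same cluster time rather than one derived from a mongos's local clock. A pymongo sketch, where the connection string, namespace and timestamp value are only illustrative:

{code:python}
# Sketch: open the change stream at an explicit cluster time so all shards
# start from the same point (startAtOperationTime, MongoDB >= 4.0).
# The timestamp below is a placeholder; in practice it would come from the
# operationTime of an earlier command reply or from a saved resume token.
from bson.timestamp import Timestamp
from pymongo import MongoClient

client = MongoClient("mongodb://mongos.example.net:27017")
start = Timestamp(1588000000, 1)  # (seconds since epoch, increment)

with client["app"]["orders"].watch(start_at_operation_time=start) as stream:
    for change in stream:
        print(change["operationType"], change["documentKey"])
{code}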

      In my opinion, it would be useful to add a "wait policy" on mongos: for example, if shard2 returns events with ts=05:10, mongos caches them until the oplog entries older than 05:10 from all other shards have been received, and only then replies to the user.
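
      A rough sketch of what I mean (the class and method names below are mine, not a proposed patch): the merge keeps a per-shard high-water mark of the newest cluster time it has seen and only releases buffered events that are no newer than the slowest shard, so no shard's events can be overtaken.

{code:python}
import heapq

class WaitingMerger:
    """Hypothetical mongos-side merge that buffers change events per shard and
    releases them only once every shard cursor has advanced at least as far."""

    def __init__(self, shard_ids):
        self.high_water = {s: 0 for s in shard_ids}  # newest clusterTime seen per shard
        self.buffer = []                              # min-heap of (sort_key, event)

    def on_event(self, shard_id, cluster_time, uuid, document_key, event):
        self.high_water[shard_id] = max(self.high_water[shard_id], cluster_time)
        heapq.heappush(self.buffer, ((cluster_time, uuid, document_key), event))
        return self._drain()

    def on_idle(self, shard_id, cluster_time):
        # Idle shards still have to advance their high-water mark (e.g. via
        # periodic no-op oplog entries), otherwise the merge stalls forever.
        self.high_water[shard_id] = max(self.high_water[shard_id], cluster_time)
        return self._drain()

    def _drain(self):
        # Only events no newer than the slowest shard are safe to emit.
        safe_until = min(self.high_water.values())
        out = []
        while self.buffer and self.buffer[0][0][0] <= safe_until:
            out.append(heapq.heappop(self.buffer)[1])
        return out

m = WaitingMerger(["shard1", "shard2"])
print(m.on_event("shard2", 7, "uuid-A", "A", "set a = 2"))  # [] - held back, shard1 unknown
print(m.on_event("shard1", 5, "uuid-A", "A", "set a = 1"))  # ['set a = 1']
print(m.on_idle("shard1", 8))                               # ['set a = 2']
{code}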

            Assignee: Backlog - Triage Team
            Reporter: vinllen chen
