Type: Task
Resolution: Unresolved
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: dbhash
Assigned Teams: Storage Execution
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

Today, dbHash iterates through each collection with an _id index scan. It does so because dbHash computes an order-dependent hash, and since recordIds may differ across nodes, scanning in _id order is what ensures every node hashes its documents in the same order.

However, this has a performance cost: fetching documents through the index scan incurs random I/O. To improve this, we could consider adding a mode to dbHash that uses a natural-order scan together with a hashing approach that is not order-dependent.
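To make the distinction concrete, here is a minimal Python sketch, not the actual dbHash implementation. It assumes each document is available as serialized bytes. `ordered_hash` and `unordered_hash` are hypothetical names, and the choice of MD5 for the sequential digest and of truncated SHA-256 summed modulo 2^128 for the order-independent combiner are illustrative assumptions, not something this ticket specifies.

```python
import hashlib

MASK = (1 << 128) - 1  # keep the combined value at 128 bits


def ordered_hash(docs):
    """Order-dependent: feed documents into one running digest in scan order."""
    h = hashlib.md5()
    for doc in docs:
        h.update(doc)
    return h.hexdigest()


def unordered_hash(docs):
    """Order-independent: hash each document separately, then combine the
    per-document digests with addition modulo 2**128 (commutative)."""
    acc = 0
    for doc in docs:
        digest = hashlib.sha256(doc).digest()[:16]
        acc = (acc + int.from_bytes(digest, "big")) & MASK
    return f"{acc:032x}"


# The same three documents, visited in two different scan orders
# (for example, natural order on two nodes with different recordIds).
node_a = [b'{"_id": 1}', b'{"_id": 2}', b'{"_id": 3}']
node_b = list(reversed(node_a))
```

With these inputs, `ordered_hash(node_a)` and `ordered_hash(node_b)` differ even though the data is identical, while `unordered_hash` returns the same value for both. Addition is used here rather than XOR so that a document appearing twice does not cancel itself out; whether that property matters for dbHash is a design question left to the ticket.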

If we do this, it must be an option that can be switched on or off, since we must preserve the order-dependent hash for older versions. For example, if a customer wanted to migrate data to a newer MongoDB version and verify data consistency against an older version that this ticket is not backported to, they would need to use the existing dbHash method with the _id scan.

Assignee: Unassigned
Reporter: Xuerui Fa
Participants: Xuerui Fa
Votes: 0
Watchers: 6

Created: Sep 02 2025 03:31:47 PM UTC
Updated: Sep 02 2025 06:31:19 PM UTC
