- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 7.0.0, 8.1.0-rc0, 8.0.0
- Component/s: None
- Labels: None
- Assigned Teams: Cluster Scalability
- Backwards Compatibility: Fully Compatible
- Operating System: ALL
- Backport Requested: v8.0, v7.0
In QueryAnalysisWriter::_flush, the logic that updates the invalid set with the index of the bad document appears to be incorrect: it can add a valid, an invalid, or a garbage index value. This stems from an incorrect update of the baseIndex value. The scenario is walked through below; a minimal sketch of the arithmetic follows the walkthrough.
Consider 6 docs in the buffer with maxBatchSize of 2:
[D0 (throws BSONObjectTooLarge when inserting), D1 (Duplicate of D4), D2, D3 (Duplicate of D4), D4, D5]
Init:
lastIndex = 6
baseIndex = 5
Iteration 1:
docsToInsert: [D5, D4] (We read from back of the buffer)
tmpBuffer: [D0, D1, D2, D3]
lastIndex = 4
baseIndex = 1
Iteration 2:
docsToInsert: [D3, D2]
tmpBuffer: [D0, D1]
lastIndex = 2
D3 fails with DuplicateKey (err.getIndex() = 0), so invalid.insert(baseIndex - err.getIndex()) = invalid.insert(1 - 0) => invalid = {1}; index 1 (D1) is recorded instead of 3 (D3)
baseIndex = 1 - 2 = 18446744073709551615 (unsigned underflow)
Iteration 3:
docsToInsert: [D1, D0]
lastIndex = 0
tmpBuffer: [D1] => This document, which has a duplicate ID, is added back to the buffer.
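The arithmetic above can be reproduced in isolation. The following standalone sketch mocks the buffer and the write error with plain standard-library types; the names maxBatchSize, lastIndex, baseIndex and invalid are taken from the walkthrough, while everything else is assumed for illustration and is not the real QueryAnalysisWriter code. It prints baseIndex = 1 after iteration 1, records 1 instead of 3 for D3's DuplicateKey error, and then underflows baseIndex in iteration 2:

```cpp
#include <cstddef>
#include <iostream>
#include <set>
#include <string>
#include <vector>

int main() {
    // D0..D5 as in the walkthrough above. In iteration 2, D3 (a duplicate of
    // D4, which was inserted in iteration 1) fails with DuplicateKey at
    // position 0 of its batch.
    std::vector<std::string> tmpBuffer{"D0", "D1", "D2", "D3", "D4", "D5"};
    const std::size_t maxBatchSize = 2;

    std::set<std::size_t> invalid;
    std::size_t baseIndex = tmpBuffer.size() - 1;  // 5

    int iteration = 0;
    while (!tmpBuffer.empty()) {
        ++iteration;

        // Carve the next batch off the back of the buffer.
        std::size_t lastIndex = tmpBuffer.size();
        std::vector<std::string> docsToInsert;
        while (lastIndex > 0 && docsToInsert.size() < maxBatchSize) {
            docsToInsert.push_back(tmpBuffer[lastIndex - 1]);
            --lastIndex;
        }

        // Mock the DuplicateKey write error for D3: err.getIndex() is the
        // document's position within docsToInsert.
        for (std::size_t errIndex = 0; errIndex < docsToInsert.size(); ++errIndex) {
            if (docsToInsert[errIndex] == "D3") {
                // Records 1 (D1) instead of 3 (D3), because baseIndex is
                // already wrong at this point.
                invalid.insert(baseIndex - errIndex);
            }
        }

        tmpBuffer.resize(lastIndex);

        // The questionable update: subtracting the count of documents still
        // left in the buffer instead of the number taken into the batch.
        baseIndex -= lastIndex;

        std::cout << "iteration " << iteration << ": lastIndex=" << lastIndex
                  << " baseIndex=" << baseIndex << "\n";
    }

    std::cout << "invalid contains:";
    for (std::size_t i : invalid) {
        std::cout << " " << i;
    }
    std::cout << "\n";
    return 0;
}
```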
Reproducer is attached.
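For contrast, here is a sketch of index bookkeeping that stays consistent with the walkthrough; it only illustrates the intended mapping and is not taken from the actual patch, and the helper name bufferIndexForError is hypothetical. Deriving the buffer index from lastIndex, the batch size, and the error's position gives 3 for D3 in iteration 2 and 1 for D1 in iteration 3, with no underflow:

```cpp
#include <cassert>
#include <cstddef>

// lastIndex is the number of documents still in the buffer after the batch
// was carved off its back, and errIndex is err.getIndex(), i.e. the failed
// document's position within docsToInsert (both as in the walkthrough).
std::size_t bufferIndexForError(std::size_t lastIndex,
                                std::size_t batchSize,
                                std::size_t errIndex) {
    return lastIndex + batchSize - 1 - errIndex;
}

int main() {
    // Iteration 2: two docs ([D0, D1]) remain, batch is [D3, D2], and D3
    // fails at position 0 -> buffer index 3.
    assert(bufferIndexForError(2, 2, 0) == 3);
    // Iteration 3: no docs remain, batch is [D1, D0], and D1 fails at
    // position 0 -> buffer index 1.
    assert(bufferIndexForError(0, 2, 0) == 1);
    return 0;
}
```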