[SERVER-66503] ObjectIsBusy thrown in unindex Created: 16/May/22  Updated: 06/Feb/24  Resolved: 20/Oct/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 6.2.0-rc0, 6.0.14

Type: Bug Priority: Major - P3
Reporter: Yuhong Zhang Assignee: Gregory Noma
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
is depended on by SERVER-70503 Do not swallow exceptions when uninde... Backlog
Duplicate
is duplicated by SERVER-66504 ObjectIsBusy thrown in unindex Closed
is duplicated by SERVER-66505 ObjectIsBusy thrown in unindex Closed
Related
is related to SERVER-83930 Investigate Assertion failure: _unind... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v6.0
Sprint: Execution Team 2022-06-27, Execution Team 2022-07-11, Execution Team 2022-09-05, Execution Team 2022-09-19, Execution Team 2022-10-03, Execution Team 2022-10-17, Execution Team 2022-10-31
Participants:
Linked BF Score: 135

 Description   

When running initial_sync_move_forward.js, we always get 

"Assertion failure: _unindex failed","attr":{"error":"ObjectIsBusy: 16: Device or resource busy","keyString":"2904403D08FA","namespace":"test.coll","indexName":"_id_"}} 

even though the test passes. 



 Comments   
Comment by Githook User [ 06/Feb/24 ]

Author:

{'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'}

Message: SERVER-66503 Skip unindexing from `_id` index when handling duplicate record for logical initial sync

(cherry picked from commit e73a86e6dd30ea480c8449f77b93973d1fc94218)

GitOrigin-RevId: d39b794e714f18ba3b96b8bca24df954cc5e4320
Branch: v6.0
https://github.com/mongodb/mongo/commit/7956ee3c72385362ec8a87d1bfed29ec090b715c

Comment by Githook User [ 20/Oct/22 ]

Author:

{'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'}

Message: SERVER-66503 Skip unindexing from `_id` index when handling duplicate record for logical initial sync
Branch: master
https://github.com/mongodb/mongo/commit/e73a86e6dd30ea480c8449f77b93973d1fc94218

Comment by Gregory Noma [ 12/Oct/22 ]

When committing the _id index for logical initial sync, we handle a duplicate record by deleting it. This involves removing the corresponding index entries, including from the same _id index. Since we have an open bulk cursor on the _id index table, attempting to open a new cursor for this removal results in an EBUSY which we convert to ObjectIsBusy. This error gets swallowed in SortedDataIndexAccessMethod::removeOneKey, resulting in the aforementioned log message and backtrace.

Comment by Yuhong Zhang [ 15/Sep/22 ]

The stacktrace is really misleading! I think we would still want to investigate why the assertion fails in the first place and confirm whether it's expected. This is expected to be prioritized in the next couple of weeks.

Comment by Daniel Gottlieb (Inactive) [ 15/Sep/22 ]

That's the test evergreen is reported as failing, yes. But that error is basically saying "couldn't connect to a node". That node crashed in an earlier test:

[js_test:initial_sync_move_forward] d20791| 2022-09-14T23:13:13.916+00:00 I  INDEX    20345   [ReplCoordExtern-0] "Index build: done building","attr":{"buildUUID":null,"collectionUUID":{"uuid":{"$uuid":"8e5d68e2-f8e7-47a4-b646-c3a9eafe9520"}},"namespace":"test.coll","index":"x_1","ident":"index-41--4254364152346937010","collectionIdent":"collection-40--4254364152346937010","commitTimestamp":null}
[js_test:initial_sync_move_forward] d20791| 2022-09-14T23:13:14.138+00:00 I  INDEX    20683   [ReplCoordExtern-0] "Assertion failure: _unindex failed","attr":{"error":"ObjectIsBusy: 16: Device or resource busy","keyString":"2904403D08FA","namespace":"test.coll","indexName":"_id_"}
[js_test:initial_sync_move_forward] d20791| 2022-09-14T23:13:14.335+00:00 I  CONTROL  31380   [ReplCoordExtern-0] "BACKTRACE","attr":{"bt":{"backtrace":...
[js_test:initial_sync_move_forward] d20791| 2022-09-14T23:13:14.336+00:00 I  CONTROL  31445   [ReplCoordExtern-0] "Frame","attr":{"frame":{"a":"559680E171D9","b":"55967BE55000","o":"4FC21D9","s":"_ZN5mongo15printStackTraceEv","C":"mongo::printStackTrace()","s+":"29"}}
[js_test:initial_sync_move_forward] d20791| 2022-09-14T23:13:14.336+00:00 I  CONTROL  31445   [ReplCoordExtern-0] "Frame","attr":{"frame":{"a":"55967DB6A148","b":"55967BE55000","o":"1D15148","s":"_ZN5mongo27SortedDataIndexAccessMethod12removeOneKeyEPNS_16OperationContextERKNS_9KeyString5ValueEb.cold.1701","C":"mongo::SortedDataIndexAccessMethod::removeOneKey(mongo::OperationContext*, mongo::KeyString::Value const&, bool) [clone .cold.1701]","s+":"100"}}

The reason why we don't have a BF is that we have seen this "assertion failure" message even in the successful runs and so far there hasn't been a BF really caused by it.

Ah, apologies. I see now. That's not an actual crash. If that's an expected assertion + internal retry and we don't intend on fixing it, can we avoid the stack trace?

Comment by Yuhong Zhang [ 15/Sep/22 ]

daniel.gottlieb@mongodb.com is the failing test jstests/replsets/currentOp_during_automatic_reconfig.js? I couldn't seem to find any logs related to the issue this ticket described. 

The reason why we don't have a BF is that we have seen this "assertion failure" message even in the successful runs and so far there hasn't been a BF really caused by it.

Generated at Thu Feb 08 06:05:37 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.