[SERVER-17018] Assertion failure false src/mongo/db/structure/btree/key.cpp 433 on remove operation Created: 23/Jan/15  Updated: 04/Jun/15  Resolved: 06/May/15

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 2.6.9
Fix Version/s: 2.6.10

Type: Bug Priority: Major - P3
Reporter: Yuki Okamoto Assignee: Geert Bosch
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

OS: Windows Server 2008 R2 Standard SP1 64bit
CPU: Intel(R) Core(TM)2 Quad CPU Q9550 @2.83GHz
RAM: 6.00GB
MongoDB Version: 2.6.5 and 2.6.7


Attachments: File OperationCycle.js     Zip Archive linux_repro.zip     Zip Archive linux_repro.zip     Zip Archive reproduction.zip     File setting.js     File test.bat    
Backwards Compatibility: Fully Compatible
Operating System: Windows
Steps To Reproduce:
  1. execute "test.bat".
  2. wait until "test.bat" finished.
    (about 1hr on my PC environment(refer to "Environment")))
  3. find error log by executing following command.

     find /N "key.cpp" mongo.log

Participants:

 Description   

I got this assertion error when execute remove operation.
You can get the error by running the attached tool.
Could you investigate this error using the tool?

After doing procedure(refer to "Steps To Reproduce"), you can find the error log like below.

testTenant2.testBucket0 Assertion failure false src\mongo\db\structure\btree\key.cpp 433
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\util\stacktrace.cpp(169)                             mongo::printStackTrace+0x43
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\util\log.cpp(127)                                    mongo::logContext+0x9c
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\util\assert_util.cpp(115)                            mongo::verifyFailed+0x14a
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\structure\btree\key.cpp(433)                      mongo::KeyV1::toBson+0x34e
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\index\btree_interface.cpp(168)                    mongo::BtreeInterfaceImpl<mongo::BtreeData_V1>::keyAt+0x16b
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\index\btree_index_cursor.cpp(175)                 mongo::BtreeIndexCursor::getKey+0x5d
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\exec\index_scan.cpp(314)                          mongo::IndexScan::checkEnd+0x1a1
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\exec\index_scan.cpp(150)                          mongo::IndexScan::work+0x6f
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\exec\fetch.cpp(82)                                mongo::FetchStage::work+0x99
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\exec\keep_mutations.cpp(60)                       mongo::KeepMutationsStage::work+0x55
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\query\plan_executor.cpp(91)                       mongo::PlanExecutor::getNext+0x16b
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\ops\delete_executor.cpp(154)                      mongo::DeleteExecutor::execute+0x866
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\commands\write_commands\batch_executor.cpp(1201)  mongo::multiRemove+0x589
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\commands\write_commands\batch_executor.cpp(887)   mongo::WriteBatchExecutor::execRemove+0xad
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\commands\write_commands\batch_executor.cpp(743)   mongo::WriteBatchExecutor::bulkExecute+0x29b
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\commands\write_commands\batch_executor.cpp(249)   mongo::WriteBatchExecutor::executeBatch+0xb6c
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\commands\write_commands\write_commands.cpp(146)   mongo::WriteCmd::run+0x1e6
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\dbcommands.cpp(1385)                              mongo::_execCommand+0x5e
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\dbcommands.cpp(1592)                              mongo::Command::execCommand+0xf09
2015-01-22T15:38:00.045+0900 [conn65] mongod.exe  ...\src\mongo\db\dbcommands.cpp(1724)                              mongo::_runCommands+0x4a7
2015-01-22T15:38:00.045+0900 [conn65] testTenant2.testBucket0 
2015-01-22T15:38:00.045+0900 [conn65] remove testTenant2.testBucket0 query: { _id: ObjectId('54c09ac7a68d879e3aaf086a'), var1: "str1" } ndeleted:0 keyUpdates:0 exception: assertion src\mongo\db\structure\btree\key.cpp:433 code:8 numYields:0 locks(micros) w:694341 694ms

Frequency of occurrence is:

  • 4 in 4 times (tool execute times)

I attached the tool written above:

  • test.bat
    ... Execute "OperationCycle.js" many times at the same time.
  • OperationCycle.js
    ... Repeat executing Mongo operations(insert, findOne, update, remove).
  • Setting.js
    ... Set parameters.
    You don't need to modify these files.

This tool means:
Several users execute Mongo Operations(insert, find, update, remove) to several colletions in several databases at the same time.

I think the important point to occur this error is to execute index scan stage in remove operation by setting "two" items in remove operation's query.
For example:

db.coll.remove({ "_id" : testId , "key": "val"}

(setting two items : "_id" and "key")



 Comments   
Comment by Yifeng Zhou [ 12/May/15 ]

Hi Ramon, I appreciate your reply. I understood 3.0.x versions are not affected by this issue.

Comment by Ramon Fernandez Marina [ 11/May/15 ]

zhouyifeng, as per my earlier comment we were unable to reproduce the issue in 3.0.2, so the 3.0.x versions should not be affected by this issue.

Update
The code that triggered this issue has been rewritten in 3.0.x, so 3.0.x versions are not affected by this issue.

Comment by Yifeng Zhou [ 11/May/15 ]

I'm a coworker of Yuki Okamoto. Thank you for update patch. We will test the effect in 2.6.x.
In 3.0.x, it will be affected by this issue?

Comment by Githook User [ 06/May/15 ]

Author:

{u'username': u'GeertBosch', u'name': u'Geert Bosch', u'email': u'geert@mongodb.com'}

Message: SERVER-17018: Only invalidate cursors pointing to the same index
Branch: v2.6
https://github.com/mongodb/mongo/commit/bb112233c8b7717a8f83501fc8d992c07731f817

Comment by Ramon Fernandez Marina [ 29/Apr/15 ]

y-okamoto, I'm able to reproduce the behavior you describe on the 2.6 series (tried 2.6.7 and 2.6.9), but so far not on 3.0.2. It is possible that this issue was fixed as part of the work that went into the 3.0 series, so would it be possible for you to test with 3.0.2?

We'll investigate the issue in 2.6, but if you're evaluating MongoDB for a new installation I'd recommend you consider 3.0.2 anyway.

Thanks,
Ramón.

Comment by Yuki Okamoto [ 27/Apr/15 ]

Thank you for trying reproducing this issue.
I attach data files that I run into this issue.
Also I attach the result of "validate()" command.

reproduction.zip

・testTenant0.0 — testTenant0.ns
...data files of DB named "testTenant0"

・validate.txt
...the result of "validate()" command.

・mongo.log
...log data for reference

In validate.txt, "nrecords" should be "1000000" but it's "1000002".
This means that 2 records couldn't be deleted.
(For example, 'remove operation' tried to remove record that has
"ObjectId('553de0b9eb436adf7aefd842')", but failed.)

Environment:
OS: Windows Server 2008 R2 Standard SP1 64bit
CPU: Intel(R) Core(TM)2 Quad CPU Q9550 @2.83GHz
RAM: 6.00GB
MongoDB Version: 2.6.7

Regards,
Yuki

Comment by Sam Kleinman (Inactive) [ 08/Apr/15 ]

Thanks for this report. I'm working on reproducing this issue, but haven't yet hit this assertion.

Assertions in this part code are often related to data integrity. To continue to debug this issue, it would be very helpful if we could get a copy of your data files that you have run into this issue with. Also consider running validate() on the affected collections and check the underlying storage devices and disks.

Thanks for your patience and I hope we can get to the bottom of this soon.

Regards,
sam

Generated at Thu Feb 08 03:43:02 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.