[SERVER-21001] MongoDb hang and then crash Created: 19/Oct/15 Updated: 04/Nov/15 Resolved: 30/Oct/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.2.0-rc0 |
| Fix Version/s: | 3.2.0-rc2 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Nick Judson | Assignee: | Martin Bligh |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Steps To Reproduce: | I will try and repro it locally. |
| Participants: |
| Description |
|
Running my typical workload with the following MongoDB: C:\Program Files\MongoDB\Server\3.2\bin>mongod --dbpath=d:\mongo --wiredTigerJournalCompressor=zlib --wiredTigerCollectionBlockCompressor=zlib --wiredTigerCache |
| Comments |
| Comment by Githook User [ 30/Oct/15 ] |
|
Author: Martin Bligh (martinbligh) <mbligh@mongodb.com> Message: |
| Comment by Martin Bligh [ 19/Oct/15 ] |
|
No worries, think we might have figured it out from your stack trace. |
| Comment by Nick Judson [ 19/Oct/15 ] |
|
Martin - I haven't been able to repro it - sadly I wasn't paying enough attention to what I was doing when it crashed. My guess is that it was creating & indexing collections while working hard filling up other existing collections. Sorry... |
| Comment by Martin Bligh [ 19/Oct/15 ] |
|
Hi Nick, I did some refactoring around this code for performance reasons between 3.1.8 and 3.2.0-rc0. Not sure how easy this is for you to reproduce, but if there's any way you could try 3.1.8 or give us a description of what your workload does that we can try to repro locally, that would be very useful. Thanks, M. |
| Comment by Mark Benvenuto [ 19/Oct/15 ] |
|
The cause of the crash was a WriteConflictException thrown by Database::createCollection during an insertOne. As part of WriteBatchExecutor::ExecInsertsState::lockAndCheck called at
If Database::createCollection throws at line 934, it will leave WriteBatchExecutor::ExecInsertsState with the following state:
In this case, _collLock will be held in MODE_X, which is required for a call to Database::createCollection. Finally, we know a WriteConflictException was thrown here because of the WCE loop counter:
|
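The failure mode described in Mark Benvenuto's comment (an exception thrown partway through a lock-and-create step, leaving state behind for the retry) can be illustrated with a minimal, self-contained sketch. This is not the MongoDB server code and not the fix that was committed: WriteConflict, LockMode, InsertState, retryOnWriteConflict and runExample are hypothetical names invented for illustration. The sketch only shows the general pattern of a write-conflict retry loop that counts conflicts and resets per-attempt state before trying again.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for the server's WriteConflictException.
struct WriteConflict : std::runtime_error {
    WriteConflict() : std::runtime_error("write conflict") {}
};

enum class LockMode { None, X };

// Hypothetical, heavily simplified model of per-batch insert state.
struct InsertState {
    LockMode collLock = LockMode::None;
    bool hasCollection = false;

    // Return to a known-clean state so a retry does not inherit stale locks.
    void unlock() {
        collLock = LockMode::None;
        hasCollection = false;
    }
};

// Runs op, retrying whenever it throws WriteConflict. cleanup runs after
// every conflict. Returns the number of conflicts seen (the loop counter).
// Rethrows once maxConflicts conflicts have occurred.
inline int retryOnWriteConflict(const std::function<void()>& op,
                                const std::function<void()>& cleanup,
                                int maxConflicts = 10) {
    int conflicts = 0;
    for (;;) {
        try {
            op();
            return conflicts;
        } catch (const WriteConflict&) {
            ++conflicts;
            cleanup();
            if (conflicts >= maxConflicts) throw;
        }
    }
}

// Example: the first attempt takes the exclusive lock, then conflicts.
inline int runExample() {
    InsertState state;
    int attempts = 0;
    auto lockAndCreate = [&] {
        state.collLock = LockMode::X;                // exclusive lock for creation
        if (++attempts == 1) throw WriteConflict();  // simulated conflict
        state.hasCollection = true;
    };
    return retryOnWriteConflict(lockAndCreate, [&] { state.unlock(); });
}
```

Without the cleanup callback, the second attempt in this sketch would start with collLock still set to X from the failed first attempt, which is the kind of leftover state the comment describes.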