[SERVER-10582] oplog_all_ops.js fails on Assertion failure, mutex problem when locking ReplicaSetMonitor
| Created: | 20/Aug/13 | Updated: | 11/Jul/16 | Resolved: | 22/Aug/13 |
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Testing Infrastructure |
| Affects Version/s: | None |
| Fix Version/s: | 2.5.2 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Matt Kangas | Assignee: | Randolph Tan |
| Resolution: | Done | Votes: | 0 |
| Labels: | buildbot |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | MCI linux-64-debug-duroff |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Participants: |
| Description |
|
I have successfully repro'd this failure on Linux 64-bit debug. MCI says rev 07c221086ff5c2 fails; c735600b22a was good. Will bisect ASAP.
http://mci.10gen.com/ui/task/mongodb_mongo_master_linux_64_debug_duroff_07c221086ff5c20d4d62ba69837d90404459c335_13_08_20_18_54_06_tool_linux_64#logs/task/true
|
| Comments |
| Comment by auto [ 21/Aug/13 ] |
|
Author: Randolph Tan (renctan, randolph@10gen.com)
Message: Make sure that BackgroundJob is running before calling wait.
|
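The guard described in this commit message can be illustrated with a minimal sketch. It is not the actual change: `SketchJob` and `stopAndWait` are hypothetical names, and the class is a simplified stand-in built on the C++ standard library, not MongoDB's `BackgroundJob`. The idea is only that waiting on a job that was never started would block forever, so the caller checks the job's state first.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a background job; not MongoDB's BackgroundJob API.
class SketchJob {
public:
    enum class State { NotStarted, Running, Done };

    ~SketchJob() {
        if (_thread.joinable()) _thread.join();
    }

    // Starts the worker thread at most once.
    void go() {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_state != State::NotStarted) return;
        _state = State::Running;
        _thread = std::thread([this] { run(); });
    }

    State state() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _state;
    }

    // Blocks until the job reports Done. Never returns if go() was not called.
    void wait() {
        std::unique_lock<std::mutex> lk(_mutex);
        _done.wait(lk, [this] { return _state == State::Done; });
    }

private:
    // The "work" here is empty; the job just marks itself finished.
    void run() {
        std::lock_guard<std::mutex> lk(_mutex);
        _state = State::Done;
        _done.notify_all();
    }

    mutable std::mutex _mutex;
    std::condition_variable _done;
    State _state = State::NotStarted;
    std::thread _thread;
};

// Guarded shutdown: skip wait() when the job never started.
// Returns true only if it actually waited for the job.
bool stopAndWait(SketchJob& job) {
    if (job.state() == SketchJob::State::NotStarted) return false;
    job.wait();
    return true;
}
```

Calling `stopAndWait` on a job that was never started returns immediately, while a started job is waited on until it finishes.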
| Comment by auto [ 21/Aug/13 ] |
|
Author: Randolph Tan (renctan, randolph@10gen.com)
Message: Avoid the mutex lock ordering inconsistency by not holding the _monitorMutex when calling ReplicaSetMonitor::checkAll.
|
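The general pattern behind this fix can be sketched as follows. This is an illustration under stated assumptions, not the committed code: `SketchMonitor`, `SketchRegistry`, and `_registryMutex` are hypothetical names, and only `_monitorMutex` and `ReplicaSetMonitor::checkAll` come from the commit message. The sketch copies the set of monitors while holding the outer mutex, releases it, and only then calls into each monitor, so the outer mutex is never held while a per-monitor lock is acquired.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical monitor with its own lock; not the real ReplicaSetMonitor.
class SketchMonitor {
public:
    void check() {
        std::lock_guard<std::mutex> lk(_lock);
        ++_checks;
    }

    int checks() const {
        std::lock_guard<std::mutex> lk(_lock);
        return _checks;
    }

private:
    mutable std::mutex _lock;
    int _checks = 0;
};

// Hypothetical registry; _registryMutex plays the role of the outer mutex.
class SketchRegistry {
public:
    void add(const std::string& name) {
        std::lock_guard<std::mutex> lk(_registryMutex);
        _monitors[name] = std::make_shared<SketchMonitor>();
    }

    std::shared_ptr<SketchMonitor> get(const std::string& name) const {
        std::lock_guard<std::mutex> lk(_registryMutex);
        auto it = _monitors.find(name);
        return it == _monitors.end() ? nullptr : it->second;
    }

    // Snapshot under the outer mutex, then release it before calling check(),
    // so no thread holds the outer mutex while taking a monitor's lock.
    std::size_t checkAll() {
        std::vector<std::shared_ptr<SketchMonitor>> snapshot;
        {
            std::lock_guard<std::mutex> lk(_registryMutex);
            for (const auto& entry : _monitors) {
                snapshot.push_back(entry.second);
            }
        }
        for (const auto& monitor : snapshot) {
            monitor->check();
        }
        return snapshot.size();
    }

private:
    mutable std::mutex _registryMutex;
    std::map<std::string, std::shared_ptr<SketchMonitor>> _monitors;
};
```

Holding shared pointers in the snapshot keeps each monitor alive even if another thread removes it from the registry after the outer mutex is released.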
| Comment by Matt Kangas [ 20/Aug/13 ] |
|
Verified that both of these tests now pass with Randolph's http://codereview.10gen.com/11462033/ applied.
|
| Comment by Matt Kangas [ 20/Aug/13 ] |
|
FYI, `jstests/sharding/shard_insert_getlasterror_w2.js` is failing. When trying to repro I saw the same assertion failure.
|
| Comment by Matt Kangas [ 20/Aug/13 ] |
|
| Comment by Matt Kangas [ 20/Aug/13 ] |
|
Ignore my previous git-bisect result. Case of GIGO; all it identified was the compile failure in e54b41cc. Later commit f1f9514 is good. That leaves one of Ren's commits.
|
| Comment by Randolph Tan [ 20/Aug/13 ] |
|
I have an idea of the cause and believe with high confidence that my recent commit broke this.
|