[SERVER-29517] Data race with ViewGraph::_idCounter can corrupt the in-memory ViewGraph Created: 08/Jun/17 Updated: 30/Oct/23 Resolved: 08/Jun/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 3.4.4 |
| Fix Version/s: | 3.4.5, 3.5.9 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | David Storch | Assignee: | David Storch |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | read-only-views |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v3.4 |
| Sprint: | Query 2017-06-19 |
| Participants: | |
| Linked BF Score: | 0 |
| Description |
|
The ViewGraph is an in-memory directed acyclic graph in which nodes represent view definitions and edges represent "view-on" relationships. It assigns a unique unsigned 64-bit id to each node in the graph using ViewGraph::_idCounter. The intention is that the ViewCatalog's mutex prevents concurrent access to this counter. However, _idCounter is a static data member. There is one ViewCatalog per database, each owning and synchronizing access to a separate ViewGraph instance. Because _idCounter is static, all ViewGraph instances share the same counter, while each ViewCatalog's mutex only serializes operations on its own ViewGraph. View catalog operations on different databases can therefore read and write the counter simultaneously without synchronization. This leads to the assignment of invalid node ids, which in turn corrupts the in-memory graph. We have seen this manifest as a process-fatal invariant failure, or as an unexpected failure of a view catalog operation (e.g. a view drop, modify, or create). |
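The sharing problem described above can be illustrated with a minimal sketch. It is not the server's code: `GraphSketch`, `allocateId`, and `idsAreDensePerGraph` are hypothetical names, and the mutex inside the class stands in for the owning ViewCatalog's mutex. It shows one way to remove the sharing, by making the counter a non-static member so that the per-instance lock really does guard it. This is not necessarily the change that was committed for this ticket.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a per-database graph; not MongoDB's ViewGraph.
class GraphSketch {
public:
    // Returns the next node id for this graph. Ids are unique within the
    // instance because both the lock and the counter belong to it.
    std::uint64_t allocateId() {
        std::lock_guard<std::mutex> lk(_mutex);  // models the catalog mutex
        return _idCounter++;
    }

private:
    std::mutex _mutex;
    // Non-static: one counter per instance. Declaring this `static` would
    // make every instance share one counter that no single per-instance
    // mutex protects, which is the race described in this ticket.
    std::uint64_t _idCounter = 0;
};

// Runs one thread per graph. Each thread allocates `perGraph` ids from its
// own graph. Returns true if every graph handed out exactly
// 0, 1, ..., perGraph - 1 in order.
bool idsAreDensePerGraph(std::size_t numGraphs, std::uint64_t perGraph) {
    assert(numGraphs > 0);
    std::vector<GraphSketch> graphs(numGraphs);
    std::vector<std::vector<std::uint64_t>> seen(numGraphs);
    std::vector<std::thread> workers;

    for (std::size_t g = 0; g < numGraphs; ++g) {
        // Each worker touches only graphs[g] and seen[g].
        workers.emplace_back([&graphs, &seen, g, perGraph] {
            for (std::uint64_t i = 0; i < perGraph; ++i) {
                seen[g].push_back(graphs[g].allocateId());
            }
        });
    }
    for (auto& t : workers) {
        t.join();
    }

    for (const auto& ids : seen) {
        if (ids.size() != perGraph) {
            return false;
        }
        for (std::uint64_t i = 0; i < perGraph; ++i) {
            if (ids[i] != i) {
                return false;
            }
        }
    }
    return true;
}
```

Calling `idsAreDensePerGraph(4, 1000)` models four databases performing view operations concurrently. With a per-instance counter, each graph's ids do not depend on activity in the other graphs. An alternative, if a process-wide counter were wanted, would be to keep it static and make it a `std::atomic<std::uint64_t>`.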
| Comments |
| Comment by Githook User [ 08/Jun/17 ] |
|
Author: David Storch (dstorch, david.storch@10gen.com) Message: (cherry picked from commit e2376ccbb43d3fb2579995a55ebf82f7c16fcb4f) |
| Comment by David Storch [ 08/Jun/17 ] |
|
The invariant failure associated with this problem that we've observed in testing looks like this:
|
| Comment by Githook User [ 08/Jun/17 ] |
|
Author: David Storch (dstorch, david.storch@10gen.com) Message: |