[SERVER-48137] Use DBLock instead of AutoGetDB in index builds Created: 12/May/20 Updated: 29/Oct/23 Resolved: 14/May/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 4.4.0-rc7, 4.7.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Louis Williams | Assignee: | Louis Williams |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Backport Requested: | v4.4 | ||||||||
| Sprint: | Execution Team 2020-05-18 | ||||||||
| Participants: | |||||||||
| Linked BF Score: | 30 | ||||||||
| Description |
|
AutoGetDB can throw a StaleDBVersion error. Index builds do not handle exceptions until they reach a final critical section, so anything thrown before that point will crash the server. Use a DBLock instead, then manually check the dbVersion inside the critical section, where the exception handler is in place. |
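The pattern described above can be illustrated with a minimal, self-contained sketch. Every type and function in it (`FakeOpCtx`, `FakeDatabase`, `PlainDbLock`, `LockAndCheck`, `checkDbVersionOrThrow`, `runIndexBuildSketch`) is a hypothetical stand-in invented for illustration; none of them are the server's real classes or signatures. The sketch only assumes what the ticket states: one acquisition helper may throw a stale-version error, while a plain lock defers the check to a point where the exception can be handled.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for illustration only; not MongoDB's real classes.
struct StaleDbVersionError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

struct FakeOpCtx {
    int expectedDbVersion;  // version the client attached to the operation
};

struct FakeDatabase {
    int currentDbVersion;   // version the node currently knows
};

// Models the version check: throws when the versions disagree.
void checkDbVersionOrThrow(const FakeOpCtx& opCtx, const FakeDatabase& db) {
    if (opCtx.expectedDbVersion != db.currentDbVersion) {
        throw StaleDbVersionError("stale database version");
    }
}

// Models a plain lock: acquiring it never performs a version check.
struct PlainDbLock {
    bool held = true;
};

// Models a helper that locks AND checks the version on construction,
// so merely acquiring it can throw (the behaviour the ticket avoids).
struct LockAndCheck {
    PlainDbLock lock;
    LockAndCheck(const FakeOpCtx& opCtx, const FakeDatabase& db) {
        checkDbVersionOrThrow(opCtx, db);
    }
};

// Shows the problem: acquisition itself throws on a version mismatch.
bool eagerAcquireThrows(const FakeOpCtx& opCtx, const FakeDatabase& db) {
    try {
        LockAndCheck guard(opCtx, db);
        return false;
    } catch (const StaleDbVersionError&) {
        return true;
    }
}

enum class BuildResult { Committed, AbortedStaleVersion };

BuildResult runIndexBuildSketch(const FakeOpCtx& opCtx, const FakeDatabase& db) {
    // No exception handler is active here, so take a lock that cannot
    // throw a version error.
    PlainDbLock lock;

    // ... index build work would happen here ...

    // Critical section: exceptions are handled from this point on, so the
    // version check is performed explicitly while the lock is still held.
    try {
        assert(lock.held);
        checkDbVersionOrThrow(opCtx, db);
        return BuildResult::Committed;
    } catch (const StaleDbVersionError&) {
        return BuildResult::AbortedStaleVersion;
    }
}
```

In this sketch a version mismatch surfaces as an `AbortedStaleVersion` result rather than as an uncaught exception, because the check runs only inside the `try` block.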
| Comments |
| Comment by Githook User [ 15/May/20 ] |
|
Author: {'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'} Message: Index builds do not handle exceptions until a final critical section. AutoGetDB can throw (cherry picked from commit 5dc21b311ba95877eae491f2f3422402bddd8ee0) |
| Comment by Githook User [ 14/May/20 ] |
|
Author: {'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'} Message: Index builds do not handle exceptions until a final critical section. AutoGetDB can throw |
| Comment by Louis Williams [ 13/May/20 ] |
|
Great, that makes sense. I'll switch to using DBLock to bypass the dbversion checks until the explicit call when we can handle exceptions. |
| Comment by Kaloian Manassiev [ 13/May/20 ] |
|
Ah, this actually makes more sense now. The reason we check for DB/Shard version changes as part of the index build is to discover changes to the index composition, abort the index build, and force it to start over (at least that's how I understand it - jack.mulrow?). So the setting of both the DB and Shard version must stay in some form. I am not familiar with the concurrency rules around the index build critical section, but if the lock cannot be dropped and re-taken after this check, then it should be fine to not use AutoGetDb if it causes an exception to be thrown at an inopportune time. A cleaner solution would be to just handle the exception. |
| Comment by Louis Williams [ 13/May/20 ] |
|
kaloian.manassiev, it looks like we copy the dbVersion and shardVersion from the client OperationContext to the index builder thread's operation context. This was added as part of |
| Comment by Kaloian Manassiev [ 13/May/20 ] |
|
I don't know why this check is there and, more importantly, why there is a DBVersion on the OperationContext at all if it is an internal thread. There must be something higher up in that code path that is setting it, or is it not always an internal thread? The database version can change as a result of movePrimary or dropDatabase, which should technically interrupt the index build, but this should happen through the act of dropping the collection, not by doing version checking internally. jack.mulrow, is this something new, perhaps due to the "Consistent Indexes" project? |
| Comment by Louis Williams [ 12/May/20 ] |
|
This is happening on an internal thread that is started on behalf of a user "createIndex" operation. There is a manual check for the dbVersion in the critical section that has been there for a very long time, so I assume it's still required? |
| Comment by Kaloian Manassiev [ 12/May/20 ] |
|
Is this error thrown from an internal index build thread, or does it happen on the user thread? Internal threads are not supposed to have a DbVersion on the OpContext, so they should never throw this exception. This might be similar to the issue in