[SERVER-39323] Don't consider in-progress indexes when deciding whether or not to build an index under an intent lock Created: 31/Jan/19 Updated: 29/Oct/23 Resolved: 13/May/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.1.8 |
| Fix Version/s: | 4.1.12 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Shane Harvey | Assignee: | Dianna Hohensee (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||||||||||||||
| Sprint: | Storage NYC 2019-04-08, Storage NYC 2019-04-22, Storage NYC 2019-05-06 | ||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||
| Description |
|
I started seeing two Pymongo test failures on 4.2-latest that I think are related to the hybrid index build project. Both tests pass on 4.1.7 but fail on v4.1.7-176-gd807556. They are listed in PYTHON-1734 but I will put them here too along with how to run each case individually:
The source for this test is here: https://github.com/mongodb/mongo-python-driver/blob/3.7.2/test/test_legacy_api.py#L1266-L1293 And the second failing test:
The source for this test is here: https://github.com/mongodb/mongo-python-driver/blob/3.7.2/test/test_legacy_api.py#L1328-L1354 These failures are reproducible on standalones, replica sets, and sharded clusters. |
| Comments |
| Comment by Dianna Hohensee (Inactive) [ 13/May/19 ] |
|
|
| Comment by Dianna Hohensee (Inactive) [ 01/May/19 ] |
|
I've created |
| Comment by Shane Harvey [ 30/Apr/19 ] |
|
Mongomirror would need to write a retry loop as you say. If we decided to do the same in gridfs (to maintain backward compatibility) all drivers would need to implement the retry loop too. I still think returning IndexBuildAlreadyInProgress would be better than returning OK because at least the error is explicit. Even better would be to implement this retry/join/wait logic in the server and have no downstream changes at all. |
| Comment by Dianna Hohensee (Inactive) [ 30/Apr/19 ] |
|
Thanks for the additional info. It does sound like we'll need to do something about it. shane.harvey, if we were to return an error IndexBuildAlreadyInProgress instead of OK, would that require a lot of downstream work to handle the error correctly? I'd be concerned about callers hammering the server retrying the createIndexes cmd in a simple loop. Callers would have to periodically re-run the cmd if the index build takes a significant amount of time. Index builds say on empty collections should be really quick and retrying the cmd in such a small window should be okay. |
| Comment by Dianna Hohensee (Inactive) [ 22/Apr/19 ] |
|
shane.harvey, what does MongoMirror use the createIndexes cmd for? I thought MongoMirror used applyOps to apply createIndexes oplog entries from one replica set to a new one: I believe that path still uses foreground index builds. |
| Comment by Shane Harvey [ 22/Apr/19 ] |
|
I still think it is a bug to reply with ok:1 before the index is actually built, should I file a new ticket? Ideally, the server would join the second createIndexes thread with the in progress build. There are real world scenario where the same index may be created by multiple threads. For example, I believe mongomirror could be impacted by this bug. Currently it retries createIndexes commands when they fail with network (or other transient) errors. With the buggy behavior in 4.1, a retried createIndexes may "succeed" immediately even though the indexes are not built. This is a problem for two reasons:
Another practical situation where multiple threads may attempt to build the same index is in GridFS. Before the first write operation gridfs will attempt to build two indexes if they do not already exist. GridFS depends on a unique index on files_id to enforce uniqueness. The buggy behavior in 4.1 can cause some threads to violate the unique index on chunks and cause the actual index build to fail with a DuplicateKey error. |
| Comment by Dianna Hohensee (Inactive) [ 22/Apr/19 ] |
|
I'm closing this without changing behavior – I should have filed a separate ticket for the refactor, on reflection, sorry about that. We concluded that there was no good story for what the user would do with an IndexBuildAlreadyInProgress error instead of an OK response. They would not be able to retry the command until successful because that would simply overload the server with requests. Also, we did not think it was a reasonable scenario for a user to be running an index build on one thread, then start up another thread to run the same index build in parallel. |
| Comment by Githook User [ 22/Apr/19 ] |
|
Author: {'name': 'Dianna', 'username': 'DiannaHohensee', 'email': 'dianna.hohensee@10gen.com'}Message:
|
| Comment by Dianna Hohensee (Inactive) [ 05/Apr/19 ] |
|
Eventually for the simul project's join functionality in v4.4: |
| Comment by Dianna Hohensee (Inactive) [ 05/Apr/19 ] |
|
The proposal for v4.2: |
| Comment by Dianna Hohensee (Inactive) [ 05/Apr/19 ] |
|
v4.0 behavior: |
| Comment by Eric Milkie [ 04/Apr/19 ] |
|
Ah, I got it now. When the description says that createIndex succeeds, it is referring to the second invocation of createIndex for the same index. Subsequent to that, the first invocation eventually fails, leaving the index unbuilt forever. |
| Comment by Shane Harvey [ 04/Apr/19 ] |
|
For the first test (test_ensure_index_threaded) the indexes are created eventually. For the second test (test_ensure_unique_index_threaded) the index is never created because the createIndex that's actually waiting for the index build fails with a DuplicateKeyError. |
| Comment by Eric Milkie [ 04/Apr/19 ] |
|
Just to be clear, shane.harvey in your description you say that the index is "never actually created", but we now believe it is simply still building and would actually be created eventually, correct? |
| Comment by Eric Milkie [ 20/Feb/19 ] |
|
Marking this as 4.1 Required and we'll reconsider next steps after the dependent ticket is closed. |
| Comment by Louis Williams [ 13/Feb/19 ] |
|
On second through, I don't know if this is as simple as updating IndexCatalogImpl::_doesSpecConflictWithExisting() to pass "false" for "includeUnfinishedIndexes". Indexes are only ever created with an exclusive collection lock. With this change, when we first check with an IX lock and skip in-progress indexes, we will queue up an X exclusive lock and wait. If the in-progress index yields and the waiting thread gets an X lock, it needs to see that there is an index build in-progress, otherwise it will start building the same index. If this second check sees the in-progress index, it will return early because an identical index exists, in-progress. This would also not solve this problem. I think for now, we should wait to see the outcome of |
| Comment by Louis Williams [ 13/Feb/19 ] |
|
That makes sense to me. Thanks for reporting this. I'll update the description to match the desired behavior: Don't consider in-progress indexes when deciding whether or not to build an index under an intent lock. |
| Comment by Shane Harvey [ 12/Feb/19 ] |
|
I think it's a bug for createIndexes to return ok:1 when the index it refers to is still in progress. So far I'm only seeing this bug on 4.1.7-latest. |
| Comment by Louis Williams [ 11/Feb/19 ] |
|
What you noticed is because of
The change in |
| Comment by Shane Harvey [ 11/Feb/19 ] |
|
Even if the index is built in the background I expect that the createIndex command should block until the index is created, right? From these test failures it looks like a concurrent index build on the same index returns ok:1 immediately even though the index build is still in progress. Will this go away once |
| Comment by Louis Williams [ 11/Feb/19 ] |
|
shane.harvey I just noticed that there is a joinall used before checking the indexes, so I'll see what might be going on there. The index build threads shouldn't return until they are completed. |
| Comment by Louis Williams [ 11/Feb/19 ] |
|
shane.harvey as of My suggestions would be either:
If there is any work we can do on our side, let me know, otherwise I will close this ticket. |