[SERVER-39705] IndexBuildInterceptor does not faithfully preserve multikey when a document generates no keys
| Created: | 21/Feb/19 | Updated: | 29/Oct/23 | Resolved: | 16/Jun/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | None |
| Fix Version/s: | 4.2.0-rc3, 4.3.1 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Daniel Gottlieb (Inactive) | Assignee: | Benety Goh |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | KS |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.2 |
| Sprint: | Storage NYC 2019-02-25, Storage NYC 2019-05-20, Execution Team 2019-06-03, Execution Team 2019-06-17 |
| Participants: | |
| Case: | (copied to CRM) |
| Linked BF Score: | 57 |
| Story Points: | 8 |
| Description |
|
IndexBuildInterceptor incorrectly assumes that a document must generate index keys in order to be considered multikey. In particular, a document indexed by a sparse compound index may generate no keys and still be considered multikey [1]. MongoDB's validation code is strict: it compares an index's multikey state against the multikey output of every document.

[1] Consider the index {a: 1, b: "2dsphere"} (2dsphere makes an index "auto-sparse") and the document {_id: 1, a: [1,2]}. Because b is omitted, sparseness results in no index keys being generated. However, because a is an array, that field of the compound index is still considered multikey. |
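The footnote's scenario can be illustrated with a toy model. This is a minimal sketch, not MongoDB's key generation: `generate_keys`, `buggy_is_multikey`, and `strict_is_multikey` are hypothetical names, and the "auto-sparse" behavior is approximated by returning no keys when the 2dsphere field is absent.

```python
from typing import Any, Dict, List, Set, Tuple

Doc = Dict[str, Any]


def generate_keys(doc: Doc, fields: List[str], sparse_on: str) -> Tuple[List[tuple], Set[str]]:
    """Toy key generator for a compound index (hypothetical, simplified).

    Returns (keys, multikey_fields). A field counts as multikey when the
    document holds an array there, whether or not any keys are produced.
    """
    multikey_fields = {f for f in fields if isinstance(doc.get(f), list)}

    # Approximation of "auto-sparse": a document missing the sparse field
    # contributes no index keys at all.
    if sparse_on not in doc:
        return [], multikey_fields

    # Otherwise expand array-valued fields into one key per element.
    keys: List[tuple] = [()]
    for f in fields:
        value = doc.get(f)
        values = value if (isinstance(value, list) and value) else [value]
        keys = [k + (v,) for k in keys for v in values]
    return keys, multikey_fields


def buggy_is_multikey(keys: List[tuple], multikey_fields: Set[str]) -> bool:
    # Models the assumption described in this ticket: only documents that
    # generate keys can be multikey.
    return bool(keys) and bool(multikey_fields)


def strict_is_multikey(keys: List[tuple], multikey_fields: Set[str]) -> bool:
    # Models the strict view: array-valued indexed fields alone decide multikey.
    return bool(multikey_fields)


doc = {"_id": 1, "a": [1, 2]}
keys, multikey_fields = generate_keys(doc, fields=["a", "b"], sparse_on="b")
print(keys, multikey_fields)
print(buggy_is_multikey(keys, multikey_fields), strict_is_multikey(keys, multikey_fields))
```

In this model the two predicates disagree for the footnote's document, which is the kind of mismatch a strict per-document comparison would flag.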
| Comments |
| Comment by Githook User [ 08/Jul/19 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: This re-applies commit 5bd904dff90a0e6332d6d4630053141e6617c5de with additional (cherry picked from commit 62e3fdae6062cf1fe5e55932eb6aa26f0f593d17) |
| Comment by Githook User [ 07/Jul/19 ] |
|
Author: {'name': 'Benety Goh', 'username': 'benety', 'email': 'benety@mongodb.com'} Message: This applies to hybrid index builds for partial indexes. (cherry picked from commit ea632e0e020c98e39b9c86f4e1f78fba7841e792) |
| Comment by Githook User [ 07/Jul/19 ] |
|
Author: {'name': 'Benety Goh', 'username': 'benety', 'email': 'benety@mongodb.com'} Message: (cherry picked from commit 17ec6e3bd06770c23090f4287adce13a5301a4d7) |
| Comment by Githook User [ 07/Jul/19 ] |
|
Author: {'name': 'Benety Goh', 'username': 'benety', 'email': 'benety@mongodb.com'} Message: (cherry picked from commit 4d4892bedbea98ff0bfba3f5d8443ca911877e21) |
| Comment by Githook User [ 06/Jul/19 ] |
|
Author: {'name': 'Benety Goh', 'username': 'benety', 'email': 'benety@mongodb.com'} Message: (cherry picked from commit 16538326c83b85c6f900085d612a3286f49b48e7) |
| Comment by Githook User [ 06/Jul/19 ] |
|
Author: {'name': 'Benety Goh', 'username': 'benety', 'email': 'benety@mongodb.com'} Message: (cherry picked from commit db93f75d7db3fbbef85c76238c388ce80e8a6d96) |
| Comment by Githook User [ 06/Jul/19 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: (cherry picked from commit 1822f7f35f4f86149c81ecbf753957beeebb825a) |
| Comment by Githook User [ 16/Jun/19 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: This re-applies commit 5bd904dff90a0e6332d6d4630053141e6617c5de with additional |
| Comment by Githook User [ 15/Jun/19 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: This applies to hybrid index builds for partial indexes. |
| Comment by Githook User [ 14/Jun/19 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Githook User [ 14/Jun/19 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Githook User [ 14/Jun/19 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Benety Goh [ 13/Jun/19 ] |
|
|
| Comment by Benety Goh [ 11/Jun/19 ] |
|
Reverted commit 11f1122708d82b1e499fed6438854d08a55168d2 due to failures in the CI system where the server hit the invariant in IndexBuildInterceptor::sideWrite(). |
| Comment by Githook User [ 11/Jun/19 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: Revert " This reverts commit 11f1122708d82b1e499fed6438854d08a55168d2. |
| Comment by Githook User [ 11/Jun/19 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: This re-applies commit 5bd904dff90a0e6332d6d4630053141e6617c5de with additional |
| Comment by Githook User [ 11/Jun/19 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Githook User [ 11/Jun/19 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Benety Goh [ 10/Jun/19 ] |
|
When adding keys to the index, we check the index filter in IndexCatalogImpl before calling IndexBuildInterceptor::sideWrite(). However, when removing index keys, the filtering is handled by the IndexAccessMethod. To avoid hitting the invariant in the key removal case, it may be sufficient to check the filter for the partial index in IndexCatalogImpl::_unindexRecord() to determine if it is necessary to call IndexBuildInterceptor::sideWrite(). See |
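The symmetry proposed in this comment can be sketched in a toy model. This is a minimal sketch under stated assumptions, not the server code: `ToyInterceptor`, `index_record`, and `unindex_record` are hypothetical stand-ins for IndexBuildInterceptor, the insert path in IndexCatalogImpl, and IndexCatalogImpl::_unindexRecord(). The assertion inside `side_write` is also an assumption, since the ticket does not state what the real invariant checks.

```python
from typing import Any, Callable, Dict, List, Tuple

Doc = Dict[str, Any]


class ToyInterceptor:
    """Hypothetical stand-in for the interceptor of a hybrid index build."""

    def __init__(self, partial_filter: Callable[[Doc], bool]) -> None:
        self.partial_filter = partial_filter
        self.side_writes: List[Tuple[str, Any]] = []

    def side_write(self, op: str, doc: Doc) -> None:
        # Assumed invariant for illustration only: side writes are expected
        # only for documents that match the partial index filter.
        assert self.partial_filter(doc), "side write for a document outside the partial filter"
        self.side_writes.append((op, doc["_id"]))


def index_record(interceptor: ToyInterceptor, doc: Doc) -> None:
    # Insert path: the filter is checked before the side write.
    if interceptor.partial_filter(doc):
        interceptor.side_write("insert", doc)


def unindex_record(interceptor: ToyInterceptor, doc: Doc) -> None:
    # Removal path with the suggested check: skip the side write when the
    # document does not match the partial filter.
    if interceptor.partial_filter(doc):
        interceptor.side_write("delete", doc)


interceptor = ToyInterceptor(partial_filter=lambda d: d.get("x", 0) > 5)
matching = {"_id": 1, "x": 10}
filtered_out = {"_id": 2, "x": 1}

index_record(interceptor, matching)
index_record(interceptor, filtered_out)
unindex_record(interceptor, filtered_out)  # no side write, so the assertion is never reached
unindex_record(interceptor, matching)
print(interceptor.side_writes)
```

Without the check in `unindex_record`, removing `filtered_out` would reach `side_write` and trip the assumed assertion, which is the shape of failure this comment aims to avoid.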
| Comment by Daniel Gottlieb (Inactive) [ 21/Mar/19 ] |
|
The patch I attempted exposed a different problem (see linked BF-12249, though I didn't write up an observation of what that invariant failure means). This ticket is a bug that should be fixed before 4.2 is released. Getting the right fix in is tricky and a bit more time-consuming because the correct multikey + multikeypath state for a set of documents wasn't meant to be accumulated in memory. My understanding of the code is that it expects changes to multikey to be persisted to disk after each document. This has been deprioritized in the short term to keep focus on the flow control project. It's currently in the 04-08 sprint. I can try to share knowledge if there's a candidate for taking on this work sooner than that. |
| Comment by April Schoffer [ 21/Mar/19 ] |
|
daniel.gottlieb, can you give us an update on the status of this ticket? |
| Comment by Githook User [ 22/Feb/19 ] |
|
Author: {'name': 'Daniel Gottlieb', 'email': 'daniel.gottlieb@mongodb.com', 'username': 'dgottlieb'} Message: Revert " This reverts commit 5bd904dff90a0e6332d6d4630053141e6617c5de. |
| Comment by Githook User [ 21/Feb/19 ] |
|
Author: {'name': 'Daniel Gottlieb', 'email': 'daniel.gottlieb@mongodb.com', 'username': 'dgottlieb'} Message: |