[SERVER-39705] IndexBuildInterceptor does not faithfully preserve multikey when a document generates no keys Created: 21/Feb/19  Updated: 29/Oct/23  Resolved: 16/Jun/19

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 4.2.0-rc3, 4.3.1

Type: Bug Priority: Major - P3
Reporter: Daniel Gottlieb (Inactive) Assignee: Benety Goh
Resolution: Fixed Votes: 0
Labels: KS
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
Related
is related to SERVER-28975 Cannot remove document with 2dsphere ... Closed
is related to SERVER-40825 In-progress hybrid builds should only... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.2
Sprint: Storage NYC 2019-02-25, Storage NYC 2019-05-20, Execution Team 2019-06-03, Execution Team 2019-06-17
Participants:
Case:
Linked BF Score: 57
Story Points: 8

 Description   

IndexBuildInterceptor makes an incorrect assumption that a document must generate keys to be considered multikey.

In particular, a sparse compound index may generate no keys for a document yet still consider that document to be multikey[1]. MongoDB's validation code is strict: it compares an index's multikey state against the multikey output of every document.

[1] Consider the index {a: 1, b: "2dsphere"} (2dsphere makes an index "auto-sparse"). Consider the document {_id: 1, a: [1,2]}. Because b is omitted, the sparseness results in no index keys being generated. However, because a is an array, that field of the compound index is still considered multikey.
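
The footnote's case can be illustrated with a toy model. This is only a sketch under stated assumptions, not the server's key generator: index_document, observe_buggy and observe_fixed are hypothetical names, real 2dsphere keys are geo cells rather than raw values, and multikey is tracked per field name here for simplicity.

```python
from itertools import product

def index_document(doc, fields, sparse_fields):
    """Toy model (hypothetical): return (keys, multikey_fields) for one document."""
    # A field counts as multikey when its value is an array,
    # whether or not any keys end up being emitted.
    multikey_fields = {f for f in fields if isinstance(doc.get(f), list)}
    # "Auto-sparse": a missing sparse field suppresses every key for the document.
    if any(f not in doc for f in sparse_fields):
        return [], multikey_fields
    # Otherwise emit one key per combination of array elements.
    per_field = [doc[f] if isinstance(doc[f], list) else [doc[f]] for f in fields]
    return list(product(*per_field)), multikey_fields

def observe_buggy(keys, multikey_fields):
    # The assumption described in this ticket: no keys implies nothing to record.
    return multikey_fields if keys else set()

def observe_fixed(keys, multikey_fields):
    # Always record multikey information, even when no keys were generated.
    return multikey_fields

keys, multikey = index_document({"_id": 1, "a": [1, 2]}, ["a", "b"], {"b"})
print(keys, multikey)
print(observe_buggy(keys, multikey), observe_fixed(keys, multikey))
```

In this model the document yields no keys while field a is still reported as multikey, which is the mismatch that strict validation would flag.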



 Comments   
Comment by Githook User [ 08/Jul/19 ]

Author:

{'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'}

Message: SERVER-39705 Always observe multikey changes in IndexBuildInterceptor::sideWrite.

This re-applies commit 5bd904dff90a0e6332d6d4630053141e6617c5de with additional
change to the js test hybrid_sparse_compound_geo_index.js.

(cherry picked from commit 62e3fdae6062cf1fe5e55932eb6aa26f0f593d17)
Branch: v4.2
https://github.com/mongodb/mongo/commit/122835074f1032301a35a322266f80d0a4bb2e1d

Comment by Githook User [ 07/Jul/19 ]

Author:

{'name': 'Benety Goh', 'username': 'benety', 'email': 'benety@mongodb.com'}

Message: SERVER-39705 update write ops check filter before writing to side table

This applies to hybrid index builds for partial indexes.

(cherry picked from commit ea632e0e020c98e39b9c86f4e1f78fba7841e792)
Branch: v4.2
https://github.com/mongodb/mongo/commit/bae9fbb43a0430fae792f9eb5402c6033a1d47b4

Comment by Githook User [ 07/Jul/19 ]

Author:

{'name': 'Benety Goh', 'username': 'benety', 'email': 'benety@mongodb.com'}

Message: SERVER-39705 IndexCatalogImpl::_unindexRecord() checks filter before calling IndexBuildInterceptor::sideWrite()

(cherry picked from commit 17ec6e3bd06770c23090f4287adce13a5301a4d7)
Branch: v4.2
https://github.com/mongodb/mongo/commit/965fad828c1992347630ed2fc79e976ed2b54e60

Comment by Githook User [ 07/Jul/19 ]

Author:

{'name': 'Benety Goh', 'username': 'benety', 'email': 'benety@mongodb.com'}

Message: SERVER-39705 IndexCatalogImpl::_indexKeys() accepts document to be indexed

(cherry picked from commit 4d4892bedbea98ff0bfba3f5d8443ca911877e21)
Branch: v4.2
https://github.com/mongodb/mongo/commit/70141083f13ea9b8b103fada778883212236d2b0

Comment by Githook User [ 06/Jul/19 ]

Author:

{'name': 'Benety Goh', 'username': 'benety', 'email': 'benety@mongodb.com'}

Message: SERVER-39705 add regression tests for hybrid index builds.

(cherry picked from commit 16538326c83b85c6f900085d612a3286f49b48e7)
Branch: v4.2
https://github.com/mongodb/mongo/commit/b7def5792993ee3128690b3bea257f65bb8b159e

Comment by Githook User [ 06/Jul/19 ]

Author:

{'name': 'Benety Goh', 'username': 'benety', 'email': 'benety@mongodb.com'}

Message: SERVER-39705 add js tests for hybrid index builds on sparse and partial geo indexes

(cherry picked from commit db93f75d7db3fbbef85c76238c388ce80e8a6d96)
Branch: v4.2
https://github.com/mongodb/mongo/commit/5df95783f41ebe74967762788325603730160194

Comment by Githook User [ 06/Jul/19 ]

Author:

{'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'}

Message: SERVER-39705 add multikey paths to MultikeyPathTracker::mergeMultikeyPaths() invariant message

(cherry picked from commit 1822f7f35f4f86149c81ecbf753957beeebb825a)
Branch: v4.2
https://github.com/mongodb/mongo/commit/8619df155584acfde99c0c1ed0188c33c03129d2

Comment by Githook User [ 16/Jun/19 ]

Author:

{'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'}

Message: SERVER-39705 Always observe multikey changes in IndexBuildInterceptor::sideWrite.

This re-applies commit 5bd904dff90a0e6332d6d4630053141e6617c5de with additional
change to the js test hybrid_sparse_compound_geo_index.js.
Branch: master
https://github.com/mongodb/mongo/commit/62e3fdae6062cf1fe5e55932eb6aa26f0f593d17

Comment by Githook User [ 15/Jun/19 ]

Author:

{'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'}

Message: SERVER-39705 update write ops check filter before writing to side table

This applies to hybrid index builds for partial indexes.
Branch: master
https://github.com/mongodb/mongo/commit/ea632e0e020c98e39b9c86f4e1f78fba7841e792

Comment by Githook User [ 14/Jun/19 ]

Author:

{'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'}

Message: SERVER-39705 IndexCatalogImpl::_indexKeys() accepts document to be indexed
Branch: master
https://github.com/mongodb/mongo/commit/4d4892bedbea98ff0bfba3f5d8443ca911877e21

Comment by Githook User [ 14/Jun/19 ]

Author:

{'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'}

Message: SERVER-39705 add multikey paths to MultikeyPathTracker::mergeMultikeyPaths() invariant message
Branch: master
https://github.com/mongodb/mongo/commit/1822f7f35f4f86149c81ecbf753957beeebb825a

Comment by Githook User [ 14/Jun/19 ]

Author:

{'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'}

Message: SERVER-39705 add regression tests for hybrid index builds.
Branch: master
https://github.com/mongodb/mongo/commit/16538326c83b85c6f900085d612a3286f49b48e7

Comment by Benety Goh [ 13/Jun/19 ]

SERVER-40825 fixed some of the CI failures that led to commit 11f1122708d82b1e499fed6438854d08a55168d2 being reverted. This condition added to IndexBuildInterceptor::sideWrite() would have skipped the multikey checks for the unindexing operations that were problematic for the side table updates.

Comment by Benety Goh [ 11/Jun/19 ]

Reverted commit 11f1122708d82b1e499fed6438854d08a55168d2 due to failures in the CI system where the server hit the invariant in IndexBuildInterceptor::sideWrite().

Comment by Githook User [ 11/Jun/19 ]

Author:

{'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'}

Message: Revert "SERVER-39705 Always observe multikey changes in IndexBuildInterceptor::sideWrite."

This reverts commit 11f1122708d82b1e499fed6438854d08a55168d2.
Branch: master
https://github.com/mongodb/mongo/commit/3bf31bf4a07c31365a4d9fd92e380bda0509e842

Comment by Githook User [ 11/Jun/19 ]

Author:

{'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'}

Message: SERVER-39705 Always observe multikey changes in IndexBuildInterceptor::sideWrite.

This re-applies commit 5bd904dff90a0e6332d6d4630053141e6617c5de with additional
change to the js test hybrid_sparse_multikey_index.js.
Branch: master
https://github.com/mongodb/mongo/commit/11f1122708d82b1e499fed6438854d08a55168d2

Comment by Githook User [ 11/Jun/19 ]

Author:

{'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'}

Message: SERVER-39705 IndexCatalogImpl::_unindexRecord() checks filter before calling IndexBuildInterceptor::sideWrite()
Branch: master
https://github.com/mongodb/mongo/commit/17ec6e3bd06770c23090f4287adce13a5301a4d7

Comment by Githook User [ 11/Jun/19 ]

Author:

{'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'}

Message: SERVER-39705 add js tests for hybrid index builds on sparse and partial geo indexes
Branch: master
https://github.com/mongodb/mongo/commit/db93f75d7db3fbbef85c76238c388ce80e8a6d96

Comment by Benety Goh [ 10/Jun/19 ]

When adding keys to the index, we check the index filter in IndexCatalogImpl before calling IndexBuildInterceptor::sideWrite().

However, when removing index keys, the filtering is handled by the IndexAccessMethod.

To avoid hitting the invariant in the key removal case, it may be sufficient to check the filter for the partial index in IndexCatalogImpl::_unindexRecord() to determine if it is necessary to call IndexBuildInterceptor::sideWrite().

See SERVER-28975.
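
The proposal above can be sketched as follows. This is a simplified illustration, not the IndexCatalogImpl code: matches_filter, SideTable, index_record and unindex_record are hypothetical names, and the filter is assumed to be a plain equality match.

```python
def matches_filter(doc, partial_filter):
    """Hypothetical equality-only filter: every filter field must match the document."""
    return all(doc.get(k) == v for k, v in partial_filter.items())

class SideTable:
    """Stand-in for the side table that buffers writes during a hybrid build."""
    def __init__(self):
        self.ops = []

    def side_write(self, op, doc_id):
        self.ops.append((op, doc_id))

def index_record(doc, partial_filter, side_table):
    # Insert path: the filter is consulted before the side write.
    if matches_filter(doc, partial_filter):
        side_table.side_write("insert", doc["_id"])

def unindex_record(doc, partial_filter, side_table):
    # Removal path mirrors the insert path: documents the partial index
    # never covered are skipped instead of reaching the side table.
    if matches_filter(doc, partial_filter):
        side_table.side_write("remove", doc["_id"])

table = SideTable()
partial_filter = {"active": True}
index_record({"_id": 1, "active": True}, partial_filter, table)
index_record({"_id": 2, "active": False}, partial_filter, table)
unindex_record({"_id": 2, "active": False}, partial_filter, table)
unindex_record({"_id": 1, "active": True}, partial_filter, table)
print(table.ops)
```

Only the document that matches the filter produces side writes, on both the insert and the removal path.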

Comment by Daniel Gottlieb (Inactive) [ 21/Mar/19 ]

The patch I attempted exposed a different problem (see linked BF-12249, though I didn't write up an observation of what that invariant failure means). This ticket is a bug that should be fixed prior to 4.2 being released. Getting the right fix in is tricky/a bit more time consuming because getting the correct multikey + multikeypath state for a set of documents wasn't meant to be accumulated in memory. My understanding of the code is that it expects changes to multikey to be persisted to disk after each document.

This has been deprioritized in the short term to keep focus on the flow control project. It's currently in the 04-08 sprint. I can try to knowledge share if there's a candidate for taking on this work sooner than that.
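
For readers unfamiliar with the multikey path state mentioned in the comment above, here is a minimal sketch of accumulating it in memory across several documents. It assumes multikey paths are represented as one set of path-component positions per indexed field; merge_multikey_paths is a hypothetical helper, not the server's MultikeyPathTracker.

```python
def merge_multikey_paths(accumulated, new_paths):
    """Union the per-field multikey components of one document into the running state."""
    # Both lists must describe the same index, so they need one entry per indexed field.
    if len(accumulated) != len(new_paths):
        raise ValueError(
            f"mismatched multikey paths: {accumulated!r} vs {new_paths!r}"
        )
    return [a | b for a, b in zip(accumulated, new_paths)]

# Index on two fields; each document reports which path components were arrays.
state = [set(), set()]
for doc_paths in ([{0}, set()], [set(), set()], [set(), {0}]):
    state = merge_multikey_paths(state, doc_paths)
print(state)
```

The union only ever grows, so a document that generates no keys but reports an array field still has to contribute to the accumulated state.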

Comment by April Schoffer [ 21/Mar/19 ]

daniel.gottlieb can you give us an update on the status of this ticket?

Comment by Githook User [ 22/Feb/19 ]

Author:

{'name': 'Daniel Gottlieb', 'email': 'daniel.gottlieb@mongodb.com', 'username': 'dgottlieb'}

Message: Revert "SERVER-39705: Always observe multikey changes in IndexBuildInterceptor::sideWrite."

This reverts commit 5bd904dff90a0e6332d6d4630053141e6617c5de.
Branch: master
https://github.com/mongodb/mongo/commit/c5b27715bff29b0ea7ed01b613bcd47c8882361e

Comment by Githook User [ 21/Feb/19 ]

Author:

{'name': 'Daniel Gottlieb', 'email': 'daniel.gottlieb@mongodb.com', 'username': 'dgottlieb'}

Message: SERVER-39705: Always observe multikey changes in IndexBuildInterceptor::sideWrite.
Branch: master
https://github.com/mongodb/mongo/commit/5bd904dff90a0e6332d6d4630053141e6617c5de
