[SERVER-44571] Documents involved in SERVER-44050 corruption scenario cannot be updated or deleted after upgrade
Created: 12/Nov/19  Updated: 29/Oct/23  Resolved: 19/Nov/19

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.6.15
Fix Version/s: 3.6.16, 3.4.24, 4.2.2, 4.0.14, 4.3.2

Type: Bug
Priority: Critical - P2
Reporter: David Storch
Assignee: Arun Banala
Resolution: Fixed
Votes: 0
Labels: qexec-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
is related to SERVER-44050 Arrays along 'hashed' index key path ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.2, v4.0, v3.6, v3.4
Sprint: Query 2019-11-18, Query 2019-12-02

 Description   

SERVER-44050 describes an issue in which the server incorrectly permits an array along a hash-indexed path. This leads to an index with incorrect index keys, which in turn can result in queries missing results. This issue has been fixed in our stable release branches. However, users may have existing clusters which still contain corrupt indexes that need to be recovered.

Ideally, users would be able to recover from SERVER-44050 either by deleting any illegal documents or by updating these documents so that they no longer contain an array along the hash-indexed path. However, a server that contains the fix for SERVER-44050 rejects any attempt to update or delete these documents. Not only does this make it more difficult to recover from SERVER-44050, but it is also a problem in its own right that a collection can contain documents which cannot be updated or deleted.

As an example, suppose the following was executed on a server version that does not contain the fix for SERVER-44050:

> db.c.drop()
true
> db.c.createIndex({"a.b": "hashed"})
{
    "createdCollectionAutomatically" : true,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
> db.c.insert({a: [{b: 1}]})
WriteResult({ "nInserted" : 1 })

The user then upgrades the server to a version that contains the fix for SERVER-44050 and wishes to recover their corrupt node. To do so, they try to correct the document with an update or a delete, but all such attempts fail and the index remains corrupted:

> db.c.find()
{ "_id" : ObjectId("5dc9f3c493197d531866f3f7"), "a" : [ { "b" : 1 } ] }
> db.c.remove({_id: ObjectId("5dc9f3c493197d531866f3f7")})
WriteResult({
	"nRemoved" : 0,
	"writeError" : {
		"code" : 16766,
		"errmsg" : "Error: hashed indexes do not currently support array values. Found array at path: a"
	}
})

Due to this limitation, users can only recover by first dropping the index, then cleaning up any documents with arrays on the hash-indexed path, and finally rebuilding the hashed index. These recovery steps could be operationally cumbersome, especially if the index is necessary to support hashed sharding.
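
As a rough illustration of the "cleaning up" step, the sketch below checks a single document for an array anywhere along a dotted path, such as the "a.b" path indexed in the example. The helper name firstArrayAlongPath is hypothetical and is not part of the server or the shell; it only mirrors the kind of path reported in the error message ("Found array at path: a") and is not the server's own key-generation logic.

```javascript
// Hypothetical helper, not a server or shell API: returns the shortest
// prefix of `path` whose value in `doc` is an array, or null if no
// component of the path resolves to an array.
function firstArrayAlongPath(doc, path) {
    var parts = path.split(".");
    var current = doc;
    for (var i = 0; i < parts.length; i++) {
        // Stop if the path cannot be followed any further.
        if (current === null || typeof current !== "object" || !(parts[i] in current)) {
            return null;
        }
        current = current[parts[i]];
        if (Array.isArray(current)) {
            return parts.slice(0, i + 1).join(".");
        }
    }
    return null;
}

// The document from the example above has an array at "a".
firstArrayAlongPath({a: [{b: 1}]}, "a.b");  // "a"
// A document with no array along the path is reported as clean.
firstArrayAlongPath({a: {b: 1}}, "a.b");    // null
```

Assuming the hashed index has already been dropped, a full scan in the shell such as db.c.find().forEach(function (doc) { if (firstArrayAlongPath(doc, "a.b") !== null) { printjson(doc._id); } }) could list the documents that need to be fixed before the index is rebuilt. That scan reads every document, so it may be slow on large collections.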



 Comments   
Comment by Githook User [ 20/Nov/19 ]

Author: Arun Banala (banarun) <arun.banala@10gen.com>

Message: SERVER-44571 Documents involved in SERVER-44050 corruption scenario cannot be updated or deleted after upgrade

(cherry picked from commit 35c6778143fc55eb9617ab4a54e616ba1e537ad5)
(cherry picked from commit 6dd33f3f725d8df801603b8f1dcbd7b13a85f1ce)
(cherry picked from commit 30701d77ca133ecd1b184587f61c832b1d028a4b)
(cherry picked from commit bb379d5efd111b8d2221c94c857f93f8aff3bc1d)
Branch: v3.4
https://github.com/mongodb/mongo/commit/2c1ef409e145a8adfb845415e17033de7ed170d7

Comment by Githook User [ 20/Nov/19 ]

Author: Arun Banala (banarun) <arun.banala@10gen.com>

Message: SERVER-44571 Documents involved in SERVER-44050 corruption scenario cannot be updated or deleted after upgrade

(cherry picked from commit 35c6778143fc55eb9617ab4a54e616ba1e537ad5)
(cherry picked from commit 6dd33f3f725d8df801603b8f1dcbd7b13a85f1ce)
(cherry picked from commit 30701d77ca133ecd1b184587f61c832b1d028a4b)
Branch: v3.6
https://github.com/mongodb/mongo/commit/7a3bdd35b859ac8462e756b909c0a4773195b99a

Comment by Githook User [ 20/Nov/19 ]

Author: Arun Banala (banarun) <arun.banala@10gen.com>

Message: SERVER-44571 Documents involved in SERVER-44050 corruption scenario cannot be updated or deleted after upgrade

(cherry picked from commit 35c6778143fc55eb9617ab4a54e616ba1e537ad5)
(cherry picked from commit 6dd33f3f725d8df801603b8f1dcbd7b13a85f1ce)
Branch: v4.0
https://github.com/mongodb/mongo/commit/12dbb497bc724f73f1fc9e032d7f191225574437

Comment by Githook User [ 20/Nov/19 ]

Author: Arun Banala (banarun) <arun.banala@10gen.com>

Message: SERVER-44571 Documents involved in SERVER-44050 corruption scenario cannot be updated or deleted after upgrade

(cherry picked from commit 35c6778143fc55eb9617ab4a54e616ba1e537ad5)
Branch: v4.2
https://github.com/mongodb/mongo/commit/5c5d42a0dbab1d3c83a432710988baa3568255a8

Comment by Githook User [ 19/Nov/19 ]

Author: Arun Banala (banarun) <arun.banala@10gen.com>

Message: SERVER-44571 Documents involved in SERVER-44050 corruption scenario cannot be updated or deleted after upgrade
Branch: master
https://github.com/mongodb/mongo/commit/88956acd276472b1b4c0192f73b07d67c4cae29c

Comment by David Storch [ 12/Nov/19 ]

One further complication is that the behavior is slightly different (and arguably worse) on the 4.2 branch. I used 4.2.0 to produce a corrupt hashed index per SERVER-44050, following the example in the description of this ticket. Then, with a local build of HEAD of the v4.2 branch (since 4.2.2 is not yet released), I observed the following:

> db.c.find()
{ "_id" : ObjectId("5dc9f4b5bf7c1d85f8b9e849"), "a" : [ { "b" : 1 } ] }
> db.c.find({"a.b": 1})
// This returns nothing due to the corrupt hashed index.
> db.c.remove({_id: ObjectId("5dc9f4b5bf7c1d85f8b9e849")})
WriteResult({ "nRemoved" : 1 })
> db.c.find().hint({"a.b": "hashed"}).returnKey()
{ "a.b" : NumberLong("2338878944348059895") } // Although the document was removed, its index key was not removed.

This shows that in 4.2, documents with an illegal array along the hashed path can be deleted. However, the index maintenance code does not clean up the incorrect index keys associated with such documents. The index therefore remains in a bad state even after the bad documents have been cleaned up. This situation can be remedied by rebuilding the index. However, it would be better to allow users to recover from SERVER-44050 without requiring index rebuilds.

Comment by David Storch [ 12/Nov/19 ]

I marked this as affecting only 3.6.15, since that appears to be the only version released as of this writing that contains the fix for SERVER-44050.
