[SERVER-44482] Value duplicates fields in the cache when used in an exclusion projection
Created: 07/Nov/19  Updated: 27/Oct/23  Resolved: 13/Nov/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Ted Tuckman
Assignee: David Storch
Resolution: Gone away
Votes: 0
Labels: afz, qexec-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to SERVER-44487 Only pull looked-up fields into Docum... Closed
Operating System: ALL
Sprint: Query 2019-11-18
Participants:
Linked BF Score: 0

 Description   

In an exclusion projection, we loop over the BSONObj in the Value twice, and the BSONObjIterator is reset between the loops. As a result, each field in the object appears twice in the cache.
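
To make the failure mode concrete, here is a minimal sketch of how two full passes over the same backing object can double the cache. It does not use the server's real types: BackingObject, ToyCache, loadTwice, and loadGuarded are all hypothetical names for illustration, and the duplicate guard is just one possible way to avoid the problem, not a description of how it was actually fixed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the backing BSON object: ordered (name, value) pairs.
using BackingObject = std::vector<std::pair<std::string, int>>;

// Hypothetical stand-in for a per-document field cache.
struct ToyCache {
    std::vector<std::pair<std::string, int>> entries;

    // Appends unconditionally; it never checks whether the name is already cached.
    void constructInCache(const std::string& name, int value) {
        entries.emplace_back(name, value);
    }

    bool contains(const std::string& name) const {
        for (const auto& entry : entries) {
            if (entry.first == name) {
                return true;
            }
        }
        return false;
    }
};

// Models the reported behaviour: two loops over the same object, with the
// iteration restarted from the first field in between.
std::size_t loadTwice(const BackingObject& obj, ToyCache& cache) {
    for (const auto& field : obj) {
        cache.constructInCache(field.first, field.second);
    }
    for (const auto& field : obj) {  // restart: every field is visited again
        cache.constructInCache(field.first, field.second);
    }
    return cache.entries.size();
}

// One illustrative guard (an assumption, not the actual fix): skip names
// that are already present in the cache.
std::size_t loadGuarded(const BackingObject& obj, ToyCache& cache) {
    for (const auto& field : obj) {
        if (!cache.contains(field.first)) {
            cache.constructInCache(field.first, field.second);
        }
    }
    assert(cache.entries.size() <= obj.size() || obj.empty());
    return cache.entries.size();
}
```

With a five-field object such as {_id, a, b, c, d}, loadTwice leaves ten entries in a fresh ToyCache, while loadGuarded leaves five no matter how many times it is called on the same cache.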



 Comments   
Comment by David Storch [ 13/Nov/19 ]

I have confirmed that this was fixed as a result of the work under SERVER-44487, so I am closing this as Gone Away. I checked by running the following:

MongoDB Enterprise > db.c.drop()
true
MongoDB Enterprise > db.c.insert({a: 1, b: 1, c: 1, d: 1})
WriteResult({ "nInserted" : 1 })
MongoDB Enterprise > db.c.aggregate([{$project: {d: 0}}, {$project: {e: 0}}])
{ "_id" : ObjectId("5dcc704907622ccc2870a8f8"), "a" : 1, "b" : 1, "c" : 1 }

I also added debug logging which includes the number of entries in the DocumentStorage cache whenever a new cache entry is created:

diff --git a/src/mongo/db/exec/document_value/document.cpp b/src/mongo/db/exec/document_value/document.cpp
index 17c18ba3c7..322168ff52 100644
--- a/src/mongo/db/exec/document_value/document.cpp
+++ b/src/mongo/db/exec/document_value/document.cpp
@@ -177,6 +177,10 @@ Position DocumentStorage::constructInCache(const BSONElement& elem) {
     appendField(fieldName, ValueElement::Kind::kCached) = Value(elem);
     _modified = savedModified;
 
+    std::cout << "[dstorch] constructInCache() for doc storage "
+              << reinterpret_cast<uintptr_t>(this) << " field: " << elem.fieldNameStringData()
+              << " numFields: " << _numFields << std::endl;
+
     return pos;
 }

Before the fix, this query resulted in 10 cache entries in a single document:

[dstorch] constructInCache() for doc storage 94492680159104 field: _id numFields: 1
[dstorch] constructInCache() for doc storage 94492680159104 field: a numFields: 2
[dstorch] constructInCache() for doc storage 94492680159104 field: b numFields: 3
[dstorch] constructInCache() for doc storage 94492680159104 field: c numFields: 4
[dstorch] constructInCache() for doc storage 94492680159104 field: d numFields: 5
[dstorch] constructInCache() for doc storage 94492680158848 field: _id numFields: 6
[dstorch] constructInCache() for doc storage 94492680158848 field: a numFields: 7
[dstorch] constructInCache() for doc storage 94492680158848 field: b numFields: 8
[dstorch] constructInCache() for doc storage 94492680158848 field: c numFields: 9
[dstorch] constructInCache() for doc storage 94492680158848 field: d numFields: 10

After the fix, we see just 5 fields:

[dstorch] constructInCache() for doc storage 139700113381760 field: _id numFields: 1
[dstorch] constructInCache() for doc storage 139700113381760 field: a numFields: 2
[dstorch] constructInCache() for doc storage 139700113381760 field: b numFields: 3
[dstorch] constructInCache() for doc storage 139700113381760 field: c numFields: 4
[dstorch] constructInCache() for doc storage 139700113381760 field: d numFields: 5

Comment by David Storch [ 11/Nov/19 ]

ted.tuckman, can you provide a more detailed description of the issue? Do you think it will be fixed by SERVER-44487?
