[SERVER-16762] Loss of type information for covered queries in WiredTiger Created: 07/Jan/15  Updated: 24/Jan/15  Resolved: 07/Jan/15

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Storage
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Tyler Brock
Assignee: Mathias Stearn
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-16632 Change WiredTiger index key format to... Closed
Related
related to SERVER-16916 RecordId might be inserted more than ... Closed
related to CXX-463 Figure out why WiredTiger fails our b... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

#include <iostream>
#include <vector>
#include "mongo/client/dbclient.h"
 
using namespace std;
using namespace mongo;
 
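int main() {
    // Initialize the legacy C++ driver before any client use.
    client::initialize();
    DBClientConnection c;
    c.connect("localhost");

    // Start from a clean database and index the field queried below,
    // so that distinct can be answered from the index alone.
    c.dropDatabase("test");
    c.createIndex("test.wow", IndexSpec().addKeys(BSON("a" << 1)));

    // Insert {a: 1}; the literal 1 is stored as a 32-bit integer (BSON type 16).
    BSONObj toInsert = BSON("a" << 1);
    cout << "type in: " << toInsert.getField("a").type() << endl;
    c.insert("test.wow", toInsert);

    // Run a distinct on the indexed field.
    BSONObj distinct;
    c.runCommand("test", BSON(
        "distinct" << "wow" <<
        "key" << "a"
    ), distinct);

    // Print the BSON type of the first distinct value.
    vector<BSONElement> result = distinct.getField("values").Array();
    cout << "type out: " << result[0].type() << endl;
}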

$ g++ test.cpp -I ./include -L ./lib -lmongoclient -lboost_system -lboost_regex -lboost_thread
$ ./a.out
type in: 16
type out: 1


 Description   

I'm not sure whether this is by design, but I have observed a behavior change between MMAPv1 and WiredTiger.

With MMAPv1, the covered distinct result has the same BSON type as the data that was inserted. With WiredTiger, however, the distinct result has type double (BSON type 1) instead of Int32 (BSON type 16).

I realize that this is due to the new index key format for WiredTiger, but I was surprised that distinct would produce this result.
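
To illustrate why a type-agnostic index key can lose the original type, here is a minimal Python sketch. It is not the real WiredTiger key format: `encode_number_key` and `decode_number_key` are hypothetical helpers that store every number as an order-preserving 8-byte double. Under that assumption the Int32 value 1 and the double 1.0 produce the same key, so a query answered from the key alone can only return a double.

```python
import struct

SIGN_BIT = 1 << 63
ALL_BITS = 0xFFFFFFFFFFFFFFFF


def encode_number_key(value):
    """Hypothetical key encoding: every number becomes an 8-byte double.

    The bits are adjusted so that plain byte-wise comparison of two keys
    matches numeric order. This is an illustration only, not the actual
    index key layout.
    """
    bits = struct.unpack(">Q", struct.pack(">d", float(value)))[0]
    if bits & SIGN_BIT:
        bits ^= ALL_BITS      # negative: invert every bit
    else:
        bits |= SIGN_BIT      # non-negative: set the top bit
    return struct.pack(">Q", bits)


def decode_number_key(key):
    """Reverse encode_number_key. The result is always a float."""
    bits = struct.unpack(">Q", key)[0]
    if bits & SIGN_BIT:
        bits &= ~SIGN_BIT & ALL_BITS   # was non-negative: clear the top bit
    else:
        bits ^= ALL_BITS               # was negative: invert every bit back
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


# The integer 1 and the double 1.0 are indistinguishable once encoded.
same_key = encode_number_key(1) == encode_number_key(1.0)
round_tripped = decode_number_key(encode_number_key(1))
```

In this sketch `round_tripped` is a Python `float` even though an `int` went in, which mirrors the "type in: 16" / "type out: 1" output from the repro.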



 Comments   
Comment by Githook User [ 12/Jan/15 ]

Author: Mathias Stearn (RedBeard0531) <mathias@10gen.com>

Message: SERVER-16632 Encode TypeBits along with KeyStrings in WT indexes

This allows covered queries to recover the original data, without collapsing
cases that compare equally (such as 1 and 1.0, or 0.0 and -0.0). However, no
attempt is made to preserve specific versions of NaN, they all become
quiet_NaN().

This resolves this issue discussed in SERVER-16762.
Branch: master
https://github.com/mongodb/mongo/commit/a9cacdf8762ed197b1de1fd5d4ff2b47692e5a92
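
The idea in the commit message can be sketched in Python as follows. This is only an illustration under stated assumptions: the tag values and the `encode` / `decode` helpers are hypothetical and do not reflect the real TypeBits layout. The key part stays type-agnostic, so 1 and 1.0 (or 0.0 and -0.0) still compare equal, while a small tag stored next to it lets a covered query recover the original value.

```python
import math

# Hypothetical tags for illustration; not the real TypeBits encoding.
TAG_DOUBLE, TAG_INT, TAG_NEG_ZERO = 0, 1, 2


def encode(value):
    """Return (key, tag): a comparison key plus the type information."""
    if isinstance(value, int):
        return float(value), TAG_INT
    if value == 0.0 and math.copysign(1.0, value) < 0:
        # -0.0 compares equal to 0.0, so remember the sign in the tag.
        return 0.0, TAG_NEG_ZERO
    return value, TAG_DOUBLE


def decode(key, tag):
    """Rebuild the original value from the key and its tag."""
    if tag == TAG_INT:
        return int(key)
    if tag == TAG_NEG_ZERO:
        return -0.0
    return key


int_key, int_tag = encode(1)
dbl_key, dbl_tag = encode(1.0)
```

Here `int_key == dbl_key` still holds, so index ordering and equality are unchanged, but `decode(int_key, int_tag)` gives back an `int` and `decode(dbl_key, dbl_tag)` gives back a `float`. NaN handling is left out of the sketch.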

Comment by Daniel Pasette (Inactive) [ 07/Jan/15 ]

Duplicate of SERVER-16632. Work for this will hang off that ticket.

Comment by J Rassi [ 07/Jan/15 ]

I can also reproduce this with a normal covered query.

With mmapv1, type information is preserved:

>>> coll = MongoClient()['test']['foo']
>>> coll.drop()
>>> coll.insert({'a':1})
ObjectId('54ad7c2cd0deab4ccc775c73')
>>> list(coll.find({'a':1},{'a':1,'_id':0}))
[{u'a': 1}]
>>> coll.ensure_index('a')
u'a_1'
>>> list(coll.find({'a':1},{'a':1,'_id':0}))
[{u'a': 1}] # EXPECTED.

With WiredTiger, type information is lost:

>>> coll = MongoClient()['test']['foo']
>>> coll.drop()
>>> coll.insert({'a':1})
ObjectId('54ad7c4fd0deab4ccc775c74')
>>> list(coll.find({'a':1},{'a':1,'_id':0}))
[{u'a': 1}]
>>> coll.ensure_index('a')
u'a_1'
>>> list(coll.find({'a':1},{'a':1,'_id':0}))
[{u'a': 1.0}] # NOT EXPECTED.

cc redbeard0531 dan@10gen.com david.storch
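
A check like the one done by eye in the sessions above could be automated with a small helper. The sketch below is a hypothetical utility, not part of PyMongo: `type_mismatches` compares the Python types of an inserted document and a returned document field by field. It works on plain dicts, so it can be tried without a running server.

```python
def type_mismatches(inserted, returned):
    """Return {field: (inserted_type, returned_type)} for fields whose
    Python type changed between the inserted and the returned document."""
    mismatches = {}
    for field, value in returned.items():
        if field in inserted and type(inserted[field]) is not type(value):
            mismatches[field] = (
                type(inserted[field]).__name__,
                type(value).__name__,
            )
    return mismatches


# Documents shaped like the ones in the sessions above.
mmapv1_result = type_mismatches({'a': 1}, {'a': 1})
wiredtiger_result = type_mismatches({'a': 1}, {'a': 1.0})
```

With the documents from the WiredTiger session, `wiredtiger_result` reports that field `a` went in as `int` and came back as `float`, while `mmapv1_result` is empty.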
