[SERVER-5974] distinct returns duplicate values Created: 31/May/12  Updated: 15/Aug/12  Resolved: 15/Jun/12

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 2.0.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Leonid Evdokimov Assignee: siddharth.singh@10gen.com
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

FreeBSD 8.2, 64-bit


Attachments: File 5974.js    
Issue Links:
Depends
depends on SERVER-6102 Shell displays both 'undefined' and '... Closed
Operating System: ALL
Participants:

 Description   

Here is a dump from the mongo shell.

  1. some documents have "jclient.targets_list" == []
    > db.mondata.find({"jclient.targets_list": {$exists: true}}).count()
    578244
  2. some documents have no "jclient" at all
    > db.mondata.find({"jclient.targets_list": {$exists: false}}).count()
    3943
  3. it's an indexed field
    > db.mondata.getIndexKeys()
    [
    [skipped] { "jclient.targets_list" : 1 }

    ]

  4. and... here is "distinct"
    > db.mondata.distinct("jclient.targets_list")
    [ null, null ]

It really looks like a bug.

Here are some more strange samples:
> db.mondata.distinct("jclient.targets_list", {"jclient.targets_list": {$exists: true}})
[ ]
> db.mondata.distinct("jclient.targets_list", {"jclient.targets_list": {$exists: false}})
[ null ]

  1. so [] + [null] should be [null], shouldn't it?
    > db.mondata.distinct("jclient.targets_list", {$or: [{"jclient.targets_list": {$exists: false}}, {"jclient.targets_list": {$exists: true}}]})
    [ ]


 Comments   
Comment by siddharth.singh@10gen.com [ 15/Jun/12 ]

Hi Leonid,

I am going ahead and closing this one. I created a separate ticket, SERVER-6102, to better represent the source of the problem and for project tracking. Many thanks for reporting this to us.

Comment by siddharth.singh@10gen.com [ 12/Jun/12 ]

Hi Leonid,

I was able to reproduce the issue and find its root cause. Please see the attached test script. When you see [null, null], the shell is displaying two different values identically, which is confusing.

The shell displays both the 'undefined' type and the 'null' type as null. Internally, the server differentiates between them and treats them as two different values. So what comes back from the server is actually [undefined, null], which are distinct, but the shell shows them both the same way.

Undefined vs Null: Note that in the attached script there is an index on users.points. When you run a distinct on users.points, the command tries to use the index. The 'undefined' values come from the records entered on lines 5 and 6 of the script, which have an empty users.points (users.points = []). The 'null' values come from the records entered on lines 9 and 10 of the script, as they do not appear in the index.

In the end, the distinct set looks like [undefined, null, 1, 2] but appears as [null, null, 1, 2] in the mongo shell.
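
The display effect described above can be illustrated in plain JavaScript. This is only a sketch: it is not the mongo shell's printing code, and the array `distinctValues` is a hypothetical stand-in for what the server returns for the test script. It relies on the standard behaviour of `JSON.stringify`, which renders `undefined` array elements as `null`, so two distinct values end up looking like duplicates.

```javascript
// Hypothetical stand-in for the distinct set returned by the server.
const distinctValues = [undefined, null, 1, 2];

// JSON.stringify renders undefined array elements as null,
// so undefined and null become indistinguishable in the output.
const displayed = JSON.stringify(distinctValues);
console.log(displayed);

// A type-aware formatter (illustrative helper, not a shell API)
// keeps the two values apart.
function describe(value) {
  if (value === undefined) return "undefined";
  if (value === null) return "null";
  return String(value);
}
const typed = "[ " + distinctValues.map(describe).join(", ") + " ]";
console.log(typed);

// The values really are distinct: a Set keeps all four of them.
console.log(new Set(distinctValues).size);
```

Under these assumptions, the first line printed contains two `null` entries, while the type-aware output shows `undefined` and `null` separately.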

Comment by Leonid Evdokimov [ 04/Jun/12 ]

Sure.

> db.mondata.distinct("jclient.targets_list")
[ null, null ]
> db.mondata.find({"jclient.targets_list":{$type:10}}).count()
0
> db.mondata.find({"jclient.targets_list":{$type:6}}).count()
0
> db.mondata.find({"jclient.targets_list":{$type:4}}).count()
0
> db.mondata.find({"jclient.targets_list": {$exists: true}}).count()
501693
> db.mondata.find({"jclient.targets_list": {$exists: false}}).count()
149204
> db.mondata.find({"jclient.targets_list":{$type:2}}).count()
0
> db.mondata.find({"jclient.targets_list":{$type:3}}).count()
0
> db.mondata.find({"jclient.targets_list":{$type:14}}).count()
0
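
For readers following the `$type` probes above, the numeric codes correspond to BSON types. The lookup below is a small sketch covering only the codes used in this thread; the names `BSON_TYPE_NAMES` and `describeProbe` are illustrative and not part of the shell.

```javascript
// BSON type codes used in the $type queries in this ticket.
const BSON_TYPE_NAMES = {
  2: "string",
  3: "object",
  4: "array",
  6: "undefined",
  10: "null",
  14: "symbol",
};

// Illustrative helper: label a $type probe with its BSON type name.
function describeProbe(code) {
  const name = BSON_TYPE_NAMES[code] || "not listed here";
  return `$type ${code} (${name})`;
}

// The probes in the order they were run above.
for (const code of [10, 6, 4, 2, 3, 14]) {
  console.log(describeProbe(code));
}
```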

Comment by Scott Hernandez (Inactive) [ 03/Jun/12 ]

Can you run these?

> db.mondata.find({"jclient.targets_list":{$type:10}}).count() // null type
> db.mondata.find({"jclient.targets_list":{$type:6}}).count() // undefined type
