[SERVER-10435] Compound index on hashmap field results in inconsistent count Created: 05/Aug/13  Updated: 29/Apr/15  Resolved: 29/Apr/15

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 2.2.1, 2.4.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: arun j Assignee: David Hows
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows 7, CentOS, Mac OS


Attachments: Text File mongodb.log    
Operating System: ALL
Steps To Reproduce:

The repro steps are as per the above description

Participants:

 Description   

I have a document with a field "ia", it's a HashMap (with key value pair) in my java entity.

The mongo document PlayerAttribute looks like this:

{
    "_id" : ObjectId("51f8c4cbcd757bd40da8a0c6"),
    "gameId" : "XYZ",
    "ia" : {
        "CrashesCount" : "0",
        "HardCurrencyOwned" : "7898",
        "LastNotificationsId" : "12",
        "Level" : "11",
        "PvpRacesPlayed" : "0",
        "PvpRaiting_FreeStyle" : "1200",
        "PvpRaiting_Motocross" : "1200",
        "PvpRaiting_SpeedAndStyle" : "1200",
        "PvpRaiting_StepUp" : "1200",
        "PvpRank" : "0"
    },
    "lastSessionExpireTime" : ISODate("2013-07-31T12:31:33.966Z"),
    "lastSessionStartTime" : ISODate("2013-07-31T12:31:27.975Z"),
    "name" : "abc123",
    type : null
}

This is the index definition for the collection,

[
    {
            "v" : 1,
            "key" : {
                    "_id" : 1
            },
            "ns" : "player.PlayerAttribute",
            "name" : "_id_"
    },
    {
            "v" : 1,
            "key" : {
                    "effectiveDate" : 1
            },
            "ns" : "player.PlayerAttribute",
            "name" : "effectiveDate",
            "dropDups" : false,
            "sparse" : false
    },
    {
            "v" : 1,
            "key" : {
                    "endDate" : 1
            },
            "ns" : "player.PlayerAttribute",
            "name" : "endDate",
            "dropDups" : false,
            "sparse" : false
    },
    {
            "v" : 1,
            "key" : {
                    "gameId" : 1,
                    "name" : 1
            },
            "unique" : true,
            "ns" : "player.PlayerAttribute",
            "name" : "pa_gid_n_unique_idx"
    },
    {
            "v" : 1,
            "key" : {
                    "gameId" : 1,
                    "ia" : 1,
                    "name" : 1
            },
            "ns" : "player.PlayerAttribute",
            "name" : "pa_gid_iattr_n_idx"
    },
    {
            "v" : 1,
            "key" : {
                    "gameId" : 1,
                    "lastSessionExpireTime" : 1
            },
            "ns" : "player.PlayerAttribute",
            "name" : "pa_gid_lset_idx"
    }
]

This is the query I used from mongo shell,

db.PlayerAttribute.find({"gameId":"XYZ", "attributes.CrashesCount" : "0"}).count();

First time it gives me the result greater than zero (but that is incorrect. For eg., if there are 7 docs match the query, it would say 6) and the next subsequent times it gives 0 and this happens for 5 to 10 minutes.

After about 5 to 10 minutes, it gives the non zero number again (one time) and next few times it is zero.

When I deleted this index ("name" : "pa_gid_iattr_n_idx"), it gives correct count all the time.

Any idea why this index on hashmap field ("name" : "pa_gid_iattr_n_idx") makes mongo to produce incorrect and inconsistent result?

Thanks in advance for the response.



 Comments   
Comment by David Hows [ 15/Aug/13 ]

Hi Arun,

That was my hypothesis and why I wanted to have a look at a few example documents (to gauge size) and see the explain output.

If you are only interested in specific sub elements that are within the "ia" subdocument why not index those directly? There are some examples in our indexing documentation

For the query you have shown me you could use something like this:

db.PlayerAttribute.ensureIndex({"gameId" : 1, "ia.Regsitered":1, "name":1})

Regards,
David

Comment by arun j [ 13/Aug/13 ]

Hi David,
I did some investigation myself and this is what I found. Since the compound index on "gameId" + "ia" (which is essentially HashMap in java entity) + name fields (index name: "pa_gid_iattr_n_idx") was the root cause of the issue, I focussed more on this index. Interestingly what I found was, when the "ia" hashmap grows in size (which happens when I add more key+value pair into the hashmap) and reaches larger than the index size limit (1024 bytes), it can no longer keep it in the index anymore. So when I do the query (which apparently uses this index) everytime it returns no documents, even though when I do explain the "n" field says 6 (but still this value is wrong because the actual # of documents that matched the query was 7). When I shrink the ia field to less than 1024 bytes, it works as expected.

Since the query uses the ia field, do you have any recommendations on how to handle this query efficiently?

Thanks
Arun

Comment by David Hows [ 13/Aug/13 ]

Hi Arun,

That looks like a big difference, were you able to run those with the .explain() instead of the count so that wen can see if there are any obvious differences in the results from those two queries?

Would it be possible for you to find those 6 matching documents and post them to this ticket?

You should be able to find them by using a "hint" on another index, or temporarily dropping the bad index as you mentioned earlier.

Regards,
David

Comment by arun j [ 12/Aug/13 ]

Hi David, I've attached the mongodb log as per your request. Please look at line # 969 (see below) that returns the count > 0, nreturned is 6, this is the first time when I run the query,

Mon Aug 12 11:15:27.879 [conn1] query player.PlayerAttribute query:

{ gameId: "com.glu.stuntracing", ia.Registered: "true" }

ntoreturn:0 ntoskip:0 nscanned:94 keyUpdates:0 locks(micros) W:13289 r:15879 nreturned:6 reslen:10125 16ms

And executing the same query again returns count 0 always, please see below from the log, nreturned is 0,

Mon Aug 12 11:15:41.802 [conn1] query player.PlayerAttribute query:

{ gameId: "com.glu.stuntracing", ia.Registered: "true" }

ntoreturn:0 ntoskip:0 nscanned:23 keyUpdates:0 locks(micros) r:4192 nreturned:0 reslen:20 4ms

Comment by arun j [ 12/Aug/13 ]

Please look at line # 969 (see below) that returns the count > 0, nreturned is 6,

Mon Aug 12 11:15:27.879 [conn1] query player.PlayerAttribute query:

{ gameId: "com.glu.stuntracing", ia.Registered: "true" }

ntoreturn:0 ntoskip:0 nscanned:94 keyUpdates:0 locks(micros) W:13289 r:15879 nreturned:6 reslen:10125 16ms

And executing the same query again returns count 0 always, please see below from the log, nreturned is 0,

Mon Aug 12 11:15:41.802 [conn1] query player.PlayerAttribute query:

{ gameId: "com.glu.stuntracing", ia.Registered: "true" }

ntoreturn:0 ntoskip:0 nscanned:23 keyUpdates:0 locks(micros) r:4192 nreturned:0 reslen:20 4ms

Comment by arun j [ 12/Aug/13 ]

Hi Stephen,
Thanks for your response. The MongoDB version is 2.4.5 and OS is windows 7 64 bit.

Thanks for pointing out the query inconsistency, below is the actual query that has the issue,

db.PlayerAttribute.find(

{"gameId":"XYZ", "ia.CrashesCount" : "0"}

).count();

And I am not doing any numeric range comparision in my query, all I am doing is a string match.

Let me know if you need any additional information.

Thanks
Arun

Comment by Stennie Steneker (Inactive) [ 12/Aug/13 ]

Hi Arun,

You've listed two versions of MongoDB (2.2.1, 2.4.5) and three O/S (Windows 7, CentOS, Mac OS) in the details for this issue.

Can you please be specific on which combination of MongoDB and O/S this issue can be reproduced with?

I would also note that:

  • your query is on "attributes.CrashesCount" but the field in the example document is "ia.CrashesCount"
  • your CrashesCount appears to be stored as a string ("0") rather than an integer (0) so the $gt may not be doing what you expect; if you want a numeric comparison you probably want to cast those values in your hashmap to an (Int) or (Number)

If you can confirm the environment versions and provide the additional information requested by David we can investigate further.

Thanks,
Stephen

Comment by David Hows [ 12/Aug/13 ]

Hi Arun,

How big do those hashes get?

Can you attach the logs from the node in question? I suspect that there may be issues with the index in question, such as problems creating or updating entries.

Can you please run the command with a .explain() instead of a .count()? That will tell us what is going on with the index selection and return results of the queries.

Regards,
David

Comment by arun j [ 09/Aug/13 ]

Any thoughts on this issue?

Generated at Thu Feb 08 03:23:10 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.