[SERVER-16704] 2dsphere index appears to allow indexing of parallel arrays
Created: 01/Jan/15  Updated: 24/Jan/15  Resolved: 23/Jan/15

Status: Closed
Project: Core Server
Component/s: Geo, Index Maintenance
Affects Version/s: 2.6.6
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Paul Bryan
Assignee: David Storch
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related: related to SERVER-826 "Allow indexing of several arrays" (Status: Backlog)
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

----- OPERATING AS EXPECTED:

db.t.drop()
 
db.t.ensureIndex({
    "keywords": 1,
    "hashtags": 1
})
 
// As expected, this insert fails with error 10088 ("cannot index parallel arrays").
db.t.insert({
    "_id": 1,
    "text": "This is a test of the emergency broadcast system. #fail #zoolander #yahoo",
    "keywords": [ "test", "emergency", "broadcast", "system", "fail", "zoolander", "yahoo" ],
    "hashtags": [ "fail", "zoolander", "yahoo" ],
    "language": "en",
    "geojson": {
        "type": "Point",
        "coordinates": [ -123.14489, 49.30452 ]
    },
})

----- OPERATING IN A SURPRISING WAY:

db.t.drop()
 
db.t.ensureIndex({
    "geojson": "2dsphere", // this is the only difference
    "keywords": 1,
    "hashtags": 1
})
 
// Surprisingly, this insert succeeds even though "keywords" and "hashtags" are both arrays.
db.t.insert({
    "_id": 1,
    "text": "This is a test of the emergency broadcast system. #fail #zoolander #yahoo",
    "keywords": [ "test", "emergency", "broadcast", "system", "fail", "zoolander", "yahoo" ],
    "hashtags": [ "fail", "zoolander", "yahoo" ],
    "language": "en",
    "geojson": {
        "type": "Point",
        "coordinates": [ -123.14489, 49.30452 ]
    },
})

Participants:

 Description   

If I create a compound index on two array fields with no 2dsphere field in it and then try to insert a document, I predictably get the errmsg "insertDocument :: caused by :: 10088 cannot index parallel arrays [field1] [field2]".

If I create the same index with a 2dsphere field in it, such insertions are allowed, and it appears that the index is even used in the query plan!



 Comments   
Comment by David Storch [ 23/Jan/15 ]

Hi pbryan,

Thanks for the report, and apologies for the delay in tracking this down.

After investigating, I've determined that this is expected behavior. The index key generation logic for 2dsphere indices is designed to compute the Cartesian product for parallel indexed arrays. If you are at all curious about the key generation implementation, you can refer to the code here.

Note that an insert will generate a warning if the number of keys to be inserted into the 2dsphere index exceeds a parameter called maxKeysPerInsert. The value of maxKeysPerInsert is currently set to 200, and is not configurable.

It is the case that regular (non-geo) btree indices do not currently allow indexing of parallel arrays. An insert will fail if it would require the btree key generation logic to take a Cartesian product. SERVER-826 is an open feature request to allow indexing of parallel arrays for regular btree indices. Please watch that ticket for updates.

Best,
Dave
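
Illustration of the comment above: a minimal sketch, in plain JavaScript, of how a Cartesian product over parallel arrays multiplies the number of index keys for the document in the reproduction steps. The helper estimateKeyCount is hypothetical and is not MongoDB's key generation code. It assumes the GeoJSON Point contributes a single key component, which may not match the server's actual geo cell covering. The 200 limit is the maxKeysPerInsert value quoted above.

```javascript
// Hypothetical helper, not MongoDB code: estimates how many keys a compound
// index would need for one document if every combination of array elements
// gets its own key (a Cartesian product).
function estimateKeyCount(doc, indexedFields) {
    return indexedFields.reduce(function (count, field) {
        var value = doc[field];
        // Assumption: a non-empty array contributes one component per element;
        // anything else (including the GeoJSON Point object) contributes one.
        var n = Array.isArray(value) && value.length > 0 ? value.length : 1;
        return count * n;
    }, 1);
}

// Document from the reproduction steps (fields that are not indexed omitted).
var doc = {
    "_id": 1,
    "keywords": [ "test", "emergency", "broadcast", "system", "fail", "zoolander", "yahoo" ],
    "hashtags": [ "fail", "zoolander", "yahoo" ],
    "geojson": { "type": "Point", "coordinates": [ -123.14489, 49.30452 ] }
};

var MAX_KEYS_PER_INSERT = 200; // value quoted in the comment above

var keys = estimateKeyCount(doc, [ "geojson", "keywords", "hashtags" ]);
console.log(keys);                       // 1 * 7 * 3 = 21
console.log(keys > MAX_KEYS_PER_INSERT); // false
```

Under these assumptions the sample document needs 21 keys, well under the warning threshold. Two parallel arrays of 20 elements each would need 400 and would exceed it.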

Comment by Daniel Pasette (Inactive) [ 02/Jan/15 ]

Thanks for the report Paul.
