[SERVER-31318] Unique 2dsphere multikey indexes behave differently than non-2dsphere counterparts Created: 29/Sep/17  Updated: 27/Dec/23

Status: Backlog
Project: Core Server
Component/s: Index Maintenance, Querying
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: Max Hirschhorn Assignee: Backlog - Query Integration
Resolution: Unresolved Votes: 0
Labels: qi-geo, query-44-grooming
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Query Integration
Operating System: ALL
Steps To Reproduce:

python buildscripts/resmoke.py unique_2dsphere_multikey_index.js

unique_2dsphere_multikey_index.js

(function() {
    "use strict";
 
    assert.writeOK(db.mycoll.insert({x: [{a: 1, b: 1}, {a: 2, b: 2}], geo: [0, 0]}));
    assert.writeOK(db.mycoll.insert({x: [{a: 1, b: 2}, {a: 2, b: 1}], geo: [0, 0]}));
 
    assert.commandWorked(db.mycoll.createIndex({"x.a": 1, "x.b": 1}, {unique: true}));
    assert.commandWorked(
        db.mycoll.createIndex({"x.a": 1, "x.b": 1, geo: "2dsphere"}, {unique: true}));
})();

[js_test:unique_2dsphere_multikey_index] 2017-09-28T20:37:37.940-0400 2017-09-28T20:37:37.937-0400 E QUERY    [thread1] Error: command failed: {
[js_test:unique_2dsphere_multikey_index] 2017-09-28T20:37:37.940-0400 	"ok" : 0,
[js_test:unique_2dsphere_multikey_index] 2017-09-28T20:37:37.940-0400 	"errmsg" : "E11000 duplicate key error collection: test.mycoll index: x.a_1_x.b_1_geo_2dsphere dup key: { : 1.0, : 1.0, : 1152921504606846977 }",
[js_test:unique_2dsphere_multikey_index] 2017-09-28T20:37:37.940-0400 	"code" : 11000,
[js_test:unique_2dsphere_multikey_index] 2017-09-28T20:37:37.940-0400 	"codeName" : "DuplicateKey"
[js_test:unique_2dsphere_multikey_index] 2017-09-28T20:37:37.941-0400 } : undefined :
[js_test:unique_2dsphere_multikey_index] 2017-09-28T20:37:37.941-0400 _getErrorWithCode@src/mongo/shell/utils.js:25:13
[js_test:unique_2dsphere_multikey_index] 2017-09-28T20:37:37.941-0400 doassert@src/mongo/shell/assert.js:16:14
[js_test:unique_2dsphere_multikey_index] 2017-09-28T20:37:37.941-0400 assert.commandWorked@src/mongo/shell/assert.js:403:5
[js_test:unique_2dsphere_multikey_index] 2017-09-28T20:37:37.941-0400 @unique_2dsphere_multikey_index.js:8:1
[js_test:unique_2dsphere_multikey_index] 2017-09-28T20:37:37.941-0400 @unique_2dsphere_multikey_index.js:1:2
[js_test:unique_2dsphere_multikey_index] 2017-09-28T20:37:37.941-0400 failed to load: unique_2dsphere_multikey_index.js


 Description   

As mentioned in SERVER-23533, the key generation code for 2dsphere indexes produces keys by taking the Cartesian product of all distinct values (after expanding arrays) for each field. As a result, a unique 2dsphere index requires every combination of per-field values to be unique, rather than only the combinations that actually appear together in the same array element, which is what a non-2dsphere unique multikey index requires.
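The difference can be illustrated with a small standalone JavaScript model applied to the two documents from the repro script. This is only a sketch of the behavior described above, not the server's key generation code: the helper names `btreeKeys`, `cartesianKeys`, and `sharedKeys` are hypothetical, and the geo cell is simply the value from the error message, kept as an opaque string.

```javascript
// Simplified model of index key generation. NOT the server's implementation;
// all helper names here are hypothetical and exist only for illustration.

// Non-2dsphere compound index on {"x.a": 1, "x.b": 1}: the shared "x" array is
// expanded once, so each array element contributes exactly one (a, b) pair.
function btreeKeys(doc) {
    return doc.x.map(elem => JSON.stringify([elem.a, elem.b]));
}

// 2dsphere compound index on {"x.a": 1, "x.b": 1, geo: "2dsphere"}: distinct
// values are collected per field, then combined as a Cartesian product.
function cartesianKeys(doc, geoCell) {
    const aValues = [...new Set(doc.x.map(elem => elem.a))];
    const bValues = [...new Set(doc.x.map(elem => elem.b))];
    const keys = [];
    for (const a of aValues) {
        for (const b of bValues) {
            keys.push(JSON.stringify([a, b, geoCell]));
        }
    }
    return keys;
}

// Keys that appear in both key sets; any hit would violate a unique index.
function sharedKeys(keys1, keys2) {
    const seen = new Set(keys1);
    return keys2.filter(key => seen.has(key));
}

const doc1 = {x: [{a: 1, b: 1}, {a: 2, b: 2}], geo: [0, 0]};
const doc2 = {x: [{a: 1, b: 2}, {a: 2, b: 1}], geo: [0, 0]};

// Cell id copied from the error message; a string avoids double precision loss.
const cell = "1152921504606846977";

const btreeShared = sharedKeys(btreeKeys(doc1), btreeKeys(doc2));
const geoShared = sharedKeys(cartesianKeys(doc1, cell), cartesianKeys(doc2, cell));

console.log(btreeShared.length);  // prints 0: no (a, b) pair is shared
console.log(geoShared.length);    // prints 4: every (a, b, cell) combination collides
```

Under this model the two documents share no keys in the non-2dsphere index, but both produce the full set {1, 2} x {1, 2} for the 2dsphere index, including the key { 1, 1, 1152921504606846977 } reported in the E11000 error.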

Note: This issue was realized while discussing index key generation with eric.daniels@10gen.com and clarifying properties around multikey indexes.



 Comments   
Comment by Tess Avitabile (Inactive) [ 13/Oct/17 ]

Yes, as siyuan.zhou said, there is no technical reason why we take the Cartesian product of index key values for 2dsphere indexes. We could bump the index version, and change the index format for 2dsphere indexes.

I also confirmed that the unique constraint is applied to the geo field in the index key:

> db.c.createIndex({a: "2dsphere"}, {unique: true})
{
	"createdCollectionAutomatically" : true,
	"numIndexesBefore" : 1,
	"numIndexesAfter" : 2,
	"ok" : 1
}
> db.c.insert({a: {type: "Point", coordinates: [0, 0]}})
WriteResult({ "nInserted" : 1 })
> db.c.insert({a: {type: "Point", coordinates: [0, 0]}})
WriteResult({
	"nInserted" : 0,
	"writeError" : {
		"code" : 11000,
		"errmsg" : "E11000 duplicate key error collection: test.c index: a_2dsphere dup key: { : 1152921504606846977 }"
	}
})

I did not find a duplicate of this ticket.

Comment by Siyuan Zhou [ 12/Oct/17 ]

I don't think there's any reason the logic should be different for compound 2dsphere indexes.

However, a separate question is what uniqueness means for 2dsphere indexes. A 2dsphere index is fundamentally a multikey, non-unique index by design, and its keys are totally opaque to users. If two polygons happen to cover the same cell and therefore share an index key, so what? The two polygons may overlap or be far away from each other. If the user just wants to enforce uniqueness on the non-geo parts of a compound geo index, then we should make the logic consistent. As tess.avitabile pointed out, we may need to bump the 2dsphere index version to preserve backward compatibility.

Comment by Ian Whalen (Inactive) [ 29/Sep/17 ]

tess.avitabile to check whether this is a duplicate and to investigate whether there are underlying design changes to suggest here.
