[SERVER-27707] findAndModify breaks when I use the query operator "$in" and the update operator "$addToSet" on the same field. Created: 17/Jan/17  Updated: 27/Oct/23  Resolved: 06/Mar/17

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.4.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Matthew Ruzicka Assignee: David Storch
Resolution: Works as Designed Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Documented
is documented by DOCS-9971 Document breaking change for single-e... Closed
Related
related to SERVER-21065 Optimizer should consider predicate w... Closed
related to SERVER-3946 Update with $addToSet/$push with a qu... Closed
Operating System: ALL
Sprint: Query 2017-02-13, Query 2017-03-27
Participants:

 Description   

The following command works in version 3.2.6, but not in 3.4.1:

db.testing.findAndModify({ query: { tags: { $in: ['y'] } }, update: { $addToSet: { tags: 'y' } }, upsert: true, new: true })

This is the error I get in 3.4.1:

QUERY    [main] Error: findAndModifyFailed failed: {
  "ok" : 0,
  "errmsg" : "Cannot apply $addToSet to a non-array field. Field named 'tags' has a non-array type string in the document INVALID-MUTABLE-ELEMENT",
  "code" : 16836,
  "codeName" : "Location16836"
}

If I change the query field from 'tags' to another name such as 'labels', it works. Also, I made sure to test this on completely new collections, so the result has nothing to do with existing documents in the database.

While this behavior seems similar to the bug documented here - https://jira.mongodb.org/browse/SERVER-3946 - I am reporting it since this behavior worked in a more recent version of MongoDB. Thanks.



 Comments   
Comment by Jason R. Coombs [ 06/Mar/17 ]

I tried creating a pure-JS version of the test, and I modified this from the jaraco.mongodb test suite, but it doesn't run on MongoDB 2.4 because it doesn't alter the tmp directory properly:

/** Test suite runner. */
function runTests() {
    var cases = [
        test_27707,
    ];
 
    cases.forEach(function(test) {
        var env = setUp();
        print("============================");
        print("       " + test.name);
        print("============================");
        test(env.rs1);
        tearDown(env);
    });
}
 
/** Initialize test environment. */
function setUp() {
    MongoRunner.dataPath = '/tmp/'
    opts = {nojournal: ''};
    var rs1 = new ReplSetTest({
        name: 'rs1',
        nodes: [opts],
        startPort: 31001,
    });
 
    rs1.startSet({oplogSize: 1})
    rs1.initiate();
    rs1.waitForMaster();
 
    return {rs1: rs1};
}
 
/** Clean up after the tests. */
function tearDown(env) {
    env.rs1.stopSet();
}
 
 
function set(db, who, value) {
    var query = {'names': {'$elemMatch': {'$eq': who}}};
    var op = {'$set': {'value': value}, '$addToSet': {"names": who}};
 
    db.karma.update(query, op, 1);
}
 
function test_27707(rs1) {
    var db = rs1.getPrimary().getDB("test");
    set(db, "sql", 2);
    set(db, "sql", -1);
 
    assert.eq(db.karma.count(), 1);
}
 
runTests();

But I wanted to share that in case it's easier for you to adapt that script than to try to run the Python script or extract from either script the underlying operations that fail on MongoDB 2.4.

Comment by Jason R. Coombs [ 06/Mar/17 ]

Here's an example pytest script that fails:

__requires__ = ['pytest', 'jaraco.mongodb']
 
 
import pytest
 
 
@pytest.fixture
def clean_db(mongodb_instance):
	conn = mongodb_instance.get_connection()
	conn.drop_database('test')
	return conn['test']
 
 
def set(db, who, value):
	query = {'names': {'$elemMatch': {'$eq': who}}}
	op = {'$set': {'value': value}, '$addToSet': {'names': who}}
	db.karma.update(query, op, upsert=True)
 
 
def test_upsert_eq(clean_db):
	db = clean_db
	assert db.karma.count() == 0
	set(db, 'sql', 2)
	set(db, 'sql', -1)
	assert db.karma.find({'names': 'sql'}).count() == 1

Install rwt, then invoke it with

MONGODB_HOME=/opt/mongodb-osx-x86_64-2.4.14 rwt -- -m pytest test-upsert-eq.py

Adjusting for the appropriate home for MongoDB 2.4. The test fails at the second assertion.

Switch to MongoDB 2.6.11 or change the

{$eq: N}

to

{$in: [N]}

and it passes.

This failure, which occurs when two records are present, is slightly different from the actual one that failed, but as I tried to distill the problem to its essential steps, I kept finding that the

$set

was necessarily implicated.

So what it boils down to is an $elemMatch using $eq with an upsert invoking a $set operation will not update an existing record but will create an additional record.

But as you say, that's MongoDB 2.4 only, and the only reason I encountered it is because that's what Travis CI has installed by default.

Comment by David Storch [ 06/Mar/17 ]

Hi jason.coombs@yougov.com, {$elemMatch: {$eq: 3}} and {$elemMatch: {$in: [3]}} should be identical in meaning. I don't think it should matter a whole lot which one you use, though I'd probably stick to $eq unless you have a reason not to. I'm surprised that their matching behavior is different on 2.4, but that version is now more than 3 years old and much of the query codebase has been almost entirely rewritten. Can you provide a small repro to demonstrate how the matching behavior for the two $elemMatch predicates is different on 2.4?

Comment by Jason R. Coombs [ 06/Mar/17 ]

Well, I've found at least one answer to my question. In the context of $elemMatch {$eq: 3} is different from {$in: [3]} in that only the latter will match on MongoDB 2.4. At least in our use case, the behavior is the same on MongoDB 2.6 and later for either match condition.

Comment by Jason R. Coombs [ 06/Mar/17 ]

Dave,

Thanks for this explanation. I think it clarifies a misunderstanding I had about the purpose of $in, perhaps derived from the behavior I observed in prior versions of MongoDB.

By using the $elemMatch technique, I was able to remove the previously-mentioned workaround.

I probably should have been using $elemMatch from the beginning, but was using $in because it had the same effect. Now that I understand that

{x: 3}

is meant to be the same as {x: {$in: [3]}}, I don't want to be using the latter form at all.

But in the case of $elemMatch, it's necessary to specify a document (it's not allowed to {$elemMatch: 3}). I've found instead I can use {x: {$elemMatch: {$eq: 3}}}. Is that also equivalent to {x: {$elemMatch: {$in: [3]}}}, and if so, does the $eq form seem better for cases like these?

Comment by David Storch [ 06/Mar/17 ]

Hi all,

After substantial internal discussion, we have decided to resolve this ticket as Works as Designed. Although this is a breaking change in 3.4, we believe the new behavior to be correct. When an {upsert: true} update results in an insert, the semantics are that all top-level equality predicates should seed the document to insert. The "spelling" of the equality predicate should make no difference. That is, {x: 3} and {x: {$in: [3]}} are different spellings of the same predicate and have the same matching behavior. Therefore, they should also have the same behavior for upserts. Making semantics depend on the spelling of a predicate is not only counterintuitive, but it ties the hands of us as implementers—for example, it makes it difficult to implement correct optimizations in the query engine.

I have entered DOCS-9971 to ensure that the 3.4 release notes are updated to describe this breaking change. My apologies that this was not documented more clearly.

In order to avoid this problem, you should wrap your $in predicate in an $elemMatch:

> db.c.drop();
true
> db.c.update({a: {$elemMatch: {$in: [2]}}}, {$addToSet: {a: 3}}, {upsert: true});
WriteResult({
	"nMatched" : 0,
	"nUpserted" : 1,
	"nModified" : 0,
	"_id" : ObjectId("58bda2706b1609466de3d5dc")
})
> db.c.find();
{ "_id" : ObjectId("58bda2706b1609466de3d5dc"), "a" : [ 3 ] }

Since queries like this always expect the field a to be an array, I imagine that the $elemMatch version of the problematic query is sufficient for nearly all use cases. You may get into trouble if your schema has a field which can be either an array or a scalar, but such mixed array/scalar fields are not advisable. In a mixed array/scalar schema, an $addToSet query such as the one reported here will already fail if the upsert operation happens to match a document containing a scalar.

We understand that this change may be burdensome for some users, so please let us know if you are running into difficulty.

Best,
Dave

Comment by Jason R. Coombs [ 07/Feb/17 ]

I've developed this workaround for the failure in the aforementioned usage. It requires two round trips, so will be slower and subject to race conditions, but it at least prevents the outright failure until the issue can be fixed in MongoDB.

Comment by Jason R. Coombs [ 07/Feb/17 ]

This issue should be added to the known issues. It's a blocker for our upgrade to 3.4 unless we engineer a workaround in the code to avoid the failure, which also occurs when $addToSet is invoked directly in an upsert operation, such as in the following.

var query = {'names': {'$in': ['mongodb']}}
var op = {'$inc': {'value': 1}, '$addToSet': {'names': 'mongodb'}}
db.karma.update(query, op, true)

Generated at Thu Feb 08 04:15:58 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.