[SERVER-56718] findOneAndUpdate sometimes failing to update a document
Created: 06/May/21  Updated: 25/May/21  Resolved: 25/May/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Leigh Jones
Assignee: Edwin Zhou
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File repro.js    
Operating System: ALL
Participants:

 Description   

We have a PSA (primary-secondary-arbiter) replica set running MongoDB 4.0.7.

We are using the Node.js driver: "mongodb": "3.1.1"

In one process we create an object in the db via a call to one of our REST services. The call to insert the object is as follows:

const result = await client.db(dbname)
  .collection(COLLECTION_NAME)
  .insertOne(clone(alert), { writeConcern: { j: true } });

 
Once our REST service returns to the caller, the caller invokes another operation on the REST service, passing in the id of the newly created object.
The first thing the service does is read the object from Mongo by id. This returns the object.
The service then starts a transaction, inserts into some other collections, and finally performs a findOneAndUpdate (in the transaction):

findOneAndUpdate(query, update, options);

 
The query is just {id: value}, the same query object that was used to retrieve the object earlier.
The update contains the following operations:

{"$inc":{"a":40,"b":1},"$set":{"c.d":1620309757604},"$addToSet":{"e":["f","g"]}}

For some reason, the update occasionally matches nothing and we see a response like:

{"lastErrorObject":{"n":0,"updatedExisting":false},"value":null,"ok":1,"operationTime":"6959177415704707439","$clusterTime":{"clusterTime":"6959177415704707439","signature":{"hash":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","keyId":0}}}

When this happens there is a lot of other activity in Mongo, although there is no contention on this item.

The transaction is then committed successfully (albeit with this write missing).

Do you know why the item can be read, but then cannot be found for the update inside a transaction?

Thanks in advance.



 Comments   
Comment by Edwin Zhou [ 25/May/21 ]

Hi leigh.jones@ripjar.com,

I'll close this since you were able to resolve the issue on your own. If you're still interested in this investigation, please take a look at my reproduction attempt and see if it reflects the implementation that caused your issue, and what changes need to be made to successfully reproduce it.

Best,
Edwin

Comment by Leigh Jones [ 07/May/21 ]

I think this can be closed now.

Comment by Leigh Jones [ 07/May/21 ]

I suspect this is caused because the transaction has:

supports: { causalConsistency: true }

which I think means it is using read concern majority and write concern majority, but (sometimes) the initial insert hasn't propagated to the secondary node yet, so it isn't visible to a majority read. Is this correct?
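
If that hypothesis holds (the ticket does not confirm it), one possible mitigation is to have the initial insert wait for majority acknowledgment, so the document is majority-committed before the second REST call starts. This is a sketch under that assumption: `insertWithMajority` and `MAJORITY_INSERT_OPTIONS` are hypothetical names, and the 5000 ms timeout is an arbitrary illustrative value. It passes `w`, `j`, and `wtimeout` as top-level options, the form listed for `insertOne` in the 3.x driver documentation.

```javascript
// Assumed write options: wait for majority acknowledgment and the journal.
// wtimeout (ms) is an arbitrary illustrative value; with it, the driver reports
// a write concern error rather than waiting indefinitely for a majority.
const MAJORITY_INSERT_OPTIONS = { w: 'majority', j: true, wtimeout: 5000 };

// Hypothetical helper: `collection` is a driver Collection (or any object
// exposing a compatible insertOne(doc, options) method).
async function insertWithMajority(collection, doc) {
  // Pass a copy so the shared options object is never mutated by the callee.
  return collection.insertOne(doc, Object.assign({}, MAJORITY_INSERT_OPTIONS));
}
```

In a PSA deployment the arbiter holds no data, so majority acknowledgment needs both the primary and the secondary; if the secondary is down or lagging, the insert waits until the timeout expires.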

Comment by Leigh Jones [ 07/May/21 ]

Performing a get by id within the context of the transaction does not find the record. Is there a different write concern (other than w: 1, j: true) I should be using here?

Why is this record not visible within my transaction? I'm not removing it as part of the transaction.
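
The difference described here can be checked with a small diagnostic that looks up the same document both through the transaction's session and without a session. This is only a sketch: `compareVisibility` is a hypothetical helper name, and it assumes `session` already has a transaction in progress.

```javascript
// Hypothetical diagnostic: look up the same document through the transaction's
// session and without any session, then report which lookups found it.
async function compareVisibility(collection, session, query) {
  const [inside, outside] = await Promise.all([
    collection.findOne(query, { session }), // read as part of the transaction
    collection.findOne(query),              // plain read, outside the transaction
  ]);
  return {
    insideTransaction: inside !== null,
    outsideTransaction: outside !== null,
  };
}
```

A result of `{ insideTransaction: false, outsideTransaction: true }` would match the behaviour reported in these comments.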

Comment by Leigh Jones [ 07/May/21 ]

As a further update: if I get the object by id immediately after the failed update (but outside of a transaction), the record is once again found.

Comment by Leigh Jones [ 06/May/21 ]

In fact, looking at all my code / JSON snippets, they all seem to be corrupted to some degree. Please let me know if you need any of them clarified.

Comment by Leigh Jones [ 06/May/21 ]

The insertOne call was rendered incorrectly by Jira (the { j: true } object was treated as a macro). It should have read:

const result = await client.db(dbname)
  .collection(COLLECTION_NAME)
  .insertOne(clone(alert), { writeConcern: { j: true } });
