[JAVA-1522] Replicaset Secondaries crashed on replica. The _id field cannot be changed from {_id: ObjectId('54419460e4b027b78ea3f929')} to {_id: null} Created: 18/Oct/14  Updated: 23/Jun/15  Resolved: 23/Jun/15

Status: Closed
Project: Java Driver
Component/s: Codecs
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Alberto Xaubet Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File primary_log.txt     Text File secondary_log.txt    
Issue Links:
Related

 Description   

My both secondaries in the replica crashed with the following message:

Looks like the replica is trying to change the ObjectId of an entity to null. How is that possible?

[repl writer worker 7] ERROR: writer worker caught exception:  :: caused by :: 16836 The _id field cannot be changed from {_id: ObjectId('54419460e4b027b78ea3f929')} to {_id: null}. on: { ts: Timestamp 1413583968000|9, h: -4495067277291860919, v: 2, op: "i", ns: "P_4b2f27df-97b7-4556-bbcf-08d0e153782a.statistics", o: { _id: ObjectId('54419460e4b027b78ea3f929'), _acl: { owner: "5441939fe4b027b78ea3f90e", mode: 0 }, mn: 0, uid: "5441939fe4b027b78ea3f90e", ln: 0, hs: 9, d: "20141017221248", rm: 1, ls: "156", hv: "1.3.3", c: {}, ms: 4, vn: 0, h: 1000, trs: 24, _id: null, om: [ "544193c5e4b0e3604e126b29" ], hsr: 24, dn: 0, hsm: 0 } }



 Comments   
Comment by Jeffrey Yemin [ 23/Jun/15 ]

Closing as unreproducible, but please re-open if you have a reproducible test case.

Comment by Jeffrey Yemin [ 28/Oct/14 ]

Hi Alberto,

I moved this issue SERVER to JAVA since the problem is more related to the Java driver.

Can you provide sample code showing the driver returning a Map instead of an ObjectId as the value of _id?

Regards,
Jeff

Comment by Alberto Xaubet [ 28/Oct/14 ]

After downgrading the java driver the problem is happening any more.

the main difference i noticed between the drivers is that

when retrieving the _id from BasicDBObject in driver version 2.9.3 i'm getting a Map and in last driver i'm getting a ObjectId.

I haven't been able to test the problem again in a test environment is not easy to setup everything like production.

Comment by Asya Kamsky [ 24/Oct/14 ]

axaubet

Just wanted to check in with you - have you seen the problem recur since you downgraded the driver?

Also, did you ever get a chance to run the two queries in mongo shell on the primary of the affected replica set or set up a test environment with versions that triggered this occurrence?

Asya

Comment by Asya Kamsky [ 20/Oct/14 ]

Thank you axaubet, I will set this ticket to waiting for user input till then, since I haven't been able to reproduce this locally.

Even though you rolled back the driver, it would still help to get the results of queries I provided from one of the replica set members, if possible.

Asya

Comment by Alberto Xaubet [ 20/Oct/14 ]

As is production environment, i have rolled back the MongoDB driver to 2.9.3 (I'm not sure if the problem comes from the new driver) . I don't expect the issue will happen again in production (we had this setup for a long time without problems).

I'm going to setup a new staging environment and try to reproduce this situation with more log lvl enabled.

Comment by Asya Kamsky [ 19/Oct/14 ]

axaubet

Thank you for the logs. Would it be possible to give mongod one primary and secondaries an option to increasing log level? You can so do with:

db.adminCommand({setParameter:1, logLevel:1})

This will put more details about what happens into the logs.

In addition, could you run the following commands on the primary:

db.getSiblingDB("P_4b2f27df-97b7-4556-bbcf-08d0e153782a").getCollection("statistics").findOne({_id:ObjectId('544331fbe4b040efd6b43231')})
 
db.getSiblingDB("local").getCollection("oplog.rs").find( {$or:[ { "o": ObjectId('544331fbe4b040efd6b43231') },  { "o2": ObjectId('544331fbe4b040efd6b43231') } ] })

I'd like to see what the document looks like and what operations were performed on it that the format of oplog to the secondary would look this way.

Asya

Comment by Alberto Xaubet [ 19/Oct/14 ]

I'm using Jongo to store entities so i cannot provide the java code that insert as it only stores a Basic entity using Jongo. Maybe the test is to use Jongo to try to update an entity setting and _id value to null.

Jongo version was 1.1

I have attached the logs of a different moment to the issue but with the same error and crash.

As you can see primary doesn't report any problem. But automatically the secondaries crashed due to this oplog operation that cannot be processed.

Comment by Asya Kamsky [ 19/Oct/14 ]

Also, could you provide the Java code snippet that inserts the document into "P_4b2f27df-97b7-4556-bbcf-08d0e153782a.statistics" collection?

Comment by Asya Kamsky [ 19/Oct/14 ]

Could you attach or include a larger section of the log leading up to the error message?

Does the primary have any error when the operation is run?

Comment by Alberto Xaubet [ 19/Oct/14 ]

I have more information about this bug. Started happening after upgrading java driver to 2.12.4 before this version i was using 2.9.3

Comment by Alberto Xaubet [ 18/Oct/14 ]

I have checked the related entity on the primary node and obviously the entity exist with an objected not null. If the code has tried this operation of setting a null on the _id i understand that this operation is going to fail in the primary and should never be passed to the replicas or at least replicas mongos process should not die.

Generated at Thu Feb 08 08:54:49 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.