[JAVA-3854] ObjectId Serialization incompatibility Created: 08/Oct/20  Updated: 28/Oct/23  Resolved: 21/Oct/20

Status: Closed
Project: Java Driver
Component/s: BSON
Affects Version/s: 3.8.0
Fix Version/s: 4.2.0

Type: Bug Priority: Major - P3
Reporter: Yair Wasserman Assignee: Jeffrey Yemin
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Backwards Compatibility: Minor Change

 Description   

In mongo-java-driver 3.8.0, ObjectId has serialVersionUID=3670079982654483072L. Its fields are: timestamp, machineIdentifier, processIdentifier, and counter.

In [mongo-java-driver 3.12.7|https://mongodb.github.io/mongo-java-driver/3.12/javadoc/org/bson/types/ObjectId.html], ObjectId has the same serialVersionUID=3670079982654483072L. However, its fields are different: timestamp, randomValue1, randomValue2, and counter.
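
One way to check what default serialization would write for a class is to ask java.io.ObjectStreamClass for its serialVersionUID and serializable field names. The following is a minimal sketch, not code from the driver: LegacyId is a hypothetical stand-in that mirrors the 3.8.0 field names listed above, and its field types are assumptions. On a real classpath, org.bson.types.ObjectId could be passed instead.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class SerialFormInspector {

    // Hypothetical stand-in for the 3.8.0 layout; the field types are assumed.
    static class LegacyId implements Serializable {
        private static final long serialVersionUID = 3670079982654483072L;
        private int timestamp;
        private int machineIdentifier;
        private short processIdentifier;
        private int counter;
    }

    // Names of the fields that default serialization would write for the class.
    static List<String> serialFieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) {
            return names; // lookup returns null when the class is not serializable
        }
        for (ObjectStreamField field : descriptor.getFields()) {
            names.add(field.getName());
        }
        return names;
    }

    // The serialVersionUID that the serialization runtime associates with the class.
    static long serialVersionUid(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(serialVersionUid(LegacyId.class));
        System.out.println(serialFieldNames(LegacyId.class));
    }
}
```

Running this against both driver versions should show identical UIDs with different field names, which is the mismatch described here.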

The following scenario causes an issue: A BasicDBObject containing an ObjectId from version 3.8.0 is serialized into a byte array. That byte array is then deserialized in a Java process running mongo-java-driver 3.12.7. Because the serialVersionUID is the same, Java treats the two ObjectId classes as compatible and looks for the timestamp, randomValue1, randomValue2, and counter fields in the byte array. The randomValue1 and randomValue2 fields are not found (the stream actually contains machineIdentifier and processIdentifier from 3.8.0), so the middle 5 bytes are ignored and set to zeros. The resulting ObjectIds are incorrect, which causes a number of problems.

As a result, we are unable to upgrade mongo-java-driver from 3.8.0 to 3.12.7 because the ObjectIds are corrupted. Please fix! 



 Comments   
Comment by Githook User [ 21/Oct/20 ]

Author: Jeff Yemin (jyemin) <jeff.yemin@mongodb.com>

Message: Update serialization of ObjectId to a stable interchange format

The existing serialization format for ObjectId is the default format provided by the
ObjectOutputStream specification, which just serializes each field of the class. But
in the 3.8.0 release we changed the fields, but neglected to even change the
serialVersionUID of the class, which makes any instances of ObjectId serialized prior
to 3.8.0 dangerously incompatible with subsequent driver releases.

This change updates the serialVersionUID for ObjectId, and also defines the serialized
form using a proxy via the writeReplace method. This means that serialized instances
of ObjectId prior to this change will be incompatible, but at least an exception
will be thrown indicating the incompatibility. And now we have the freedom to change
the private representation of ObjectId without affecting serialization format in the
future.

JAVA-3854
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/57a20d675e0cdd1ae378b1b9ce969612953546a6
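
The proxy approach mentioned in the commit message can be illustrated with the general serialization proxy pattern. This is a minimal sketch under stated assumptions, not the code from the commit: Id, its two private fields, and the 12-byte interchange format are hypothetical choices made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.ByteBuffer;
import java.util.Arrays;

public class SerializationProxyDemo {

    // Hypothetical identifier type; it is not the driver's ObjectId.
    static final class Id implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int high;  // private representation, free to change later
        private final long low;

        Id(byte[] bytes) {
            if (bytes.length != 12) {
                throw new IllegalArgumentException("expected 12 bytes");
            }
            ByteBuffer buf = ByteBuffer.wrap(bytes);
            this.high = buf.getInt();
            this.low = buf.getLong();
        }

        byte[] toByteArray() {
            return ByteBuffer.allocate(12).putInt(high).putLong(low).array();
        }

        // Serialization writes the proxy instead of this object's fields.
        private Object writeReplace() {
            return new SerializationProxy(toByteArray());
        }

        // Reject streams that try to deserialize Id directly.
        private void readObject(ObjectInputStream in) throws InvalidObjectException {
            throw new InvalidObjectException("Proxy required");
        }

        private static final class SerializationProxy implements Serializable {
            private static final long serialVersionUID = 1L;
            private final byte[] bytes; // the stable interchange format

            SerializationProxy(byte[] bytes) {
                this.bytes = bytes;
            }

            // Deserialization rebuilds the real object through its constructor.
            private Object readResolve() {
                return new Id(bytes);
            }
        }
    }

    // Serializes an Id built from idBytes, reads it back, and returns its bytes.
    static byte[] roundTrip(byte[] idBytes) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(new Id(idBytes));
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return ((Id) in.readObject()).toByteArray();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] original = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
        // true when the 12 bytes survive the round trip
        System.out.println(Arrays.equals(original, roundTrip(original)));
    }
}
```

Because only the proxy's byte array reaches the stream, the private fields of the outer class can be renamed or replaced without changing the serialized form.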

Comment by Yair Wasserman [ 21/Oct/20 ]

Hi Jeffrey,

Thank you very much for the offer. 

The PR looks fantastic!

Thank you so much for all you have done!

Comment by Jeffrey Yemin [ 21/Oct/20 ]

ywasserman@ptc.com let me know if you want to have a look at the PR before I merge it.

Comment by Jeffrey Yemin [ 20/Oct/20 ]

https://github.com/mongodb/mongo-java-driver/pull/601

Comment by Yair Wasserman [ 20/Oct/20 ]

That would be fantastic, Jeffrey.
We can wait for 4.2.0.

Thank you very much!

Comment by Jeffrey Yemin [ 20/Oct/20 ]

Patching version 4.1 poses the same issue as patching 3.12, so we will have to wait for 4.2.0.

We don't have a release date yet, but I anticipate sometime in the month of November. And we will release a beta before that (in early November) that will be stable.

Comment by Yair Wasserman [ 20/Oct/20 ]

Thank you very much for all your help, Jeffrey.
When is the expected release date for 4.2.0?
Would it be possible to patch version 4.1 instead of waiting until the future 4.2.0 release?
Thanks!

Comment by Jeffrey Yemin [ 19/Oct/20 ]

OK, I think we can manage that.

However, I don't feel comfortable doing this in a patch release to 3.12, and there are no further minor releases (i.e. 3.13.0) scheduled for the 3.x branch. Would you be able to upgrade all the way to a future 4.2.0 release?

Comment by Yair Wasserman [ 19/Oct/20 ]

Hi Jeffrey,

Yes, that's right.

Comment by Jeffrey Yemin [ 17/Oct/20 ]

Hi ywasserman@ptc.com

Is your idea that if the serialVersionUID were updated then at least deserialization would fail and you could throw away the cached object?
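
The "fail and throw away" idea could look roughly like the sketch below on the application side. It is an illustration only: readOrDiscard and serialize are hypothetical helper names, and the cache itself is reduced to a byte array. A serialVersionUID mismatch surfaces as java.io.InvalidClassException, which is an IOException, so the helper treats any deserialization failure as a cache miss.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Optional;

public class CacheReadDemo {

    // Hypothetical helper: returns the cached object, or empty if the bytes
    // cannot be deserialized by the classes currently on the classpath.
    static Optional<Object> readOrDiscard(byte[] cached) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(cached))) {
            return Optional.ofNullable(in.readObject());
        } catch (IOException | ClassNotFoundException e) {
            // Covers InvalidClassException (e.g. a serialVersionUID mismatch)
            // as well as corrupt or truncated streams: treat as a cache miss.
            return Optional.empty();
        }
    }

    // Hypothetical helper that produces the bytes a cache would store.
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(readOrDiscard(serialize("cached value")).isPresent());
        System.out.println(readOrDiscard(new byte[] {1, 2, 3}).isPresent());
    }
}
```

A caller would reload the value from MongoDB and overwrite the cache entry whenever the result is empty.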

Comment by Yair Wasserman [ 16/Oct/20 ]

Hello Jeffrey,

Thank you for your swift response. Unfortunately, I don't think that using BsonDocument as a replacement for BasicDBObject is a viable option.

Our application uses a caching service in combination with mongodb, and Java serialization is used when storing/retrieving data to/from cache. Our use case is that we have BasicDBObject containing an ObjectId from version 3.8 that has already been Java serialized and is in the cache. Then, that serialized BasicDBObject is retrieved from the cache in its serialized format by a process that is using the new Java-Mongo-Driver version of 3.12. BasicDBObject deserializes fine, but ObjectId is the issue.

It would help our use case if you released a patch where the serialVersionUID of the ObjectId class were updated. ObjectId implements Serializable and according to the [Java docs|https://docs.oracle.com/en/java/javase/11/docs/specs/serialization/version.html], deleting fields is an incompatible change which may impair the ability of the earlier version to fulfill its contract. Thus, it would only make sense for the serialVersionUID to be updated as it is the responsibility of the evolved class to maintain the contract established by the nonevolved class.

Thank you for your time and help in resolving this bug.

Comment by Jeffrey Yemin [ 14/Oct/20 ]

ywasserman@ptc.com thanks for bringing this to our attention, and our apologies for the trouble it's caused you.

I think the reason no one else has run into this yet is that applications typically just use BSON as the serialization format for BasicDBObject.  I'm curious what your use case is for using Java's Object Serialization mechanism instead.

Currently I'm not sure how this can be fixed in a way that wouldn't break applications that are relying on the new serialized form of ObjectId.

A possible workaround is to use BsonDocument instead of BasicDBObject as the target of Object Serialization. The BsonDocument class was more carefully constructed to use a serialization proxy, which uses BSON as the serialized form. It would be fairly straightforward to go from BasicDBObject to BsonDocument to serialized form to BsonDocument and back to BasicDBObject. Something like this:

import com.mongodb.BasicDBObject;
import com.mongodb.DBObject;
import com.mongodb.DBObjectCodec;
import org.bson.BsonDocument;
import org.bson.BsonDocumentReader;
import org.bson.BsonDocumentWriter;
import org.bson.codecs.DecoderContext;
import org.bson.codecs.EncoderContext;
import org.bson.types.ObjectId;
 
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
 
public class JAVA3854 {
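    public static void main(String[] args) throws IOException, ClassNotFoundException {
        var input = new BasicDBObject("_id", new ObjectId());
 
        // Serialize via BsonDocument, whose serialization proxy uses BSON as the serialized form
        var baos = new ByteArrayOutputStream();
        try (var os = new ObjectOutputStream(baos)) {
            os.writeObject(toBsonDocument(input));
        }
 
        // Deserialize to a BsonDocument, then convert back to a DBObject
        DBObject output;
        try (var is = new ObjectInputStream(new ByteArrayInputStream(baos.toByteArray()))) {
            output = fromBsonDocument((BsonDocument) is.readObject());
        }
 
        // Compare the round-tripped document with the original
        System.out.println(input.equals(output));
    }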
 
    private static DBObject fromBsonDocument(final BsonDocument bsonDocument) {
        var reader = new BsonDocumentReader(bsonDocument);
        return new DBObjectCodec().decode(reader, DecoderContext.builder().build());
    }
 
    private static BsonDocument toBsonDocument(final BasicDBObject basicDBObject) {
        var retVal = new BsonDocument();
        var writer = new BsonDocumentWriter(retVal);
        new DBObjectCodec().encode(writer, basicDBObject, EncoderContext.builder().build());
        return retVal;
    }
}
