[JAVA-4403] Trying to bulk load documents with complex _id gives error Created: 10/Nov/21  Updated: 28/Oct/23  Resolved: 11/Nov/21

Status: Closed
Project: Java Driver
Component/s: API
Affects Version/s: 4.0.0
Fix Version/s: 4.4.0

Type: Bug Priority: Major - P3
Reporter: Steve Jerman Assignee: Ross Lawley
Resolution: Fixed Votes: 0
Labels: external-user
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File client_counts_by_hour.json    
Issue Links:
Related
related to JAVA-1788 Insert method should return a result ... Closed

 Description   

Versions:

  • Spring Boot 2.5.6
  • Spring Data Mongo: 3.2.6
  • Mongo Driver: 4.2.3

I am trying to upgrade my application to latest Spring/Mongo APIs.

I am getting an error when I try to bulk load data saved using mongoexport.

Attached is one of the files that gives the error.

The error is:

 

2021-11-10 15:50:00.668 DEBUG 84853 --- [           main] org.mongodb.driver.protocol.command      : Sending command '{"insert": "client_counts_by_hour", "ordered": false, "bypassDocumentValidation": true, "$db": "jameson", "lsid": {"id": {"$binary": {"base64": "y1q10S0HRKaUCzpnqxLHhg==", "subType": "04"}}}, "documents": [{"_id": {"$oid": "618bea287d91ca1961b3e1cb"}, "attributes": {"5a04d33ea09f961635c03580": {"attribute": "5a04d33ea09f961635c03580", "count": 2, "sessionCount": 2, "totalAvgDwell": 240147.0, "totalDwell": 240147.0}, "all": {"attribute": "all", "count": 2, "sessionCount": 2, "totalAvgDwell": 240147.0, "totalDwell": 240147.0}, "frequent": {"attribute": "frequent", "count": 2, "sessionCount": 2, "totalAvgDwell": 240147.0, "totalDwell": 240147.0}, "probing": {"attribute": "probing", "count": 2, "sessionCount": 2, "totalAvgDwell": 240147.0, "totalDwell": 240147.0}, "returned": {"attribute": "returned", "count": 2, "sessionCount": 2, "totalAvgDwell": 240147.0, "totalDwell": 240147.0}}, "locations": {"5d418961c678980007efe353": {"attributes": {"5a04d33ea09f961635c03580": {"attribute": "5a04d3 ...' with request id 44 to database jameson on connection [connectionId{localValue:3, serverValue:3}] to server localhost:58134
2021-11-10 15:50:00.677 DEBUG 84853 --- [           main] org.mongodb.driver.protocol.command      : Execution of command with request id 44 completed successfully in 10.60 ms on connection [connectionId{localValue:3, serverValue:3}] to server localhost:58134
Error writing to client_counts_by_hour: Bulk write operation error on server localhost:58134. Write errors: [BulkWriteError{index=0, code=2, message='can't have multiple _id fields in one document', details={}}, BulkWriteError{index=1, code=2, message='can't have multiple _id fields in one document', details={}}, BulkWriteError{index=2, code=2, message='can't have multiple _id fields in one document', details={}}, BulkWriteError{index=3, code=2, message='can't have multiple _id fields in one document', details={}}, BulkWriteError{index=4, code=2, message='can't have multiple _id fields in one document', details={}}, BulkWriteError{index=5, code=2, message='can't have multiple _id fields in one document', details={}}, BulkWriteError{index=6, code=2, message='can't have multiple _id fields in one document', details={}}, BulkWriteError{index=7, code=2, message='can't have multiple _id fields in one document', details={}}, BulkWriteError{index=8, code=2, message='can't have multiple _id fields in one document', details={}}, BulkWriteError{index=9, code=2, message='can't have multiple _id fields in one document', details={}}]. 

 

Note that an extra _id is being generated/added by the driver and can be seen in the trace. So the server side error makes sense. I have been tracing through the code but so far haven't found where it gets added.

This is my code for reference. I am accessing the Mongo driver via the Spring template.

    Scanner scanner = new Scanner(is);
    List<InsertOneModel<Document>> documents = new ArrayList<>();
    int count = 0;

    // Read one exported JSON document per line and insert in batches.
    while (scanner.hasNext()) {
      documents.add(new InsertOneModel<>(Document.parse(scanner.nextLine())));
      count++;
      if (documents.size() > 1000) {
        try {
          template.getCollection(collectionName).bulkWrite(documents, new BulkWriteOptions().bypassDocumentValidation(true).ordered(false));
        } catch (Exception e) {
          // Log the failure and carry on with the next batch.
          System.err.println("Error writing to " + collectionName + ": " + e.getLocalizedMessage());
        } finally {
          documents.clear();
        }
      }
    }
    // Write whatever is left over; skip the call when nothing remains.
    if (!documents.isEmpty()) {
      try {
        template.getCollection(collectionName).bulkWrite(documents, new BulkWriteOptions().bypassDocumentValidation(true).ordered(false));
      } catch (Exception e) {
        System.err.println("Error writing to " + collectionName + ": " + e.getLocalizedMessage());
        throw e;
      } finally {
        documents.clear();
      }
    }
    System.out.println(count + " documents loaded for " + collectionName + " collection.");
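The batching pattern in the loader above (buffer items, flush when the buffer fills, flush the remainder once at the end) can be isolated from the driver calls. The following is only a sketch: `BatchWriter` and its `sink` callback are hypothetical names that do not appear in the ticket or in the driver, and the sink stands in for the `bulkWrite` call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: groups items into fixed-size batches and hands each batch to a sink. */
final class BatchWriter<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> buffer = new ArrayList<>();

    BatchWriter(int batchSize, Consumer<List<T>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffers one item and flushes as soon as the buffer reaches the batch size. */
    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Hands the buffered items to the sink; does nothing when the buffer is empty. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // Copy and clear first, so the buffer is reset even if the sink throws.
        List<T> batch = new ArrayList<>(buffer);
        buffer.clear();
        sink.accept(batch);
    }
}
```

In the loader, the sink would be the place to call `bulkWrite` on the collection, with a final `flush()` after the read loop. Because `flush()` ignores an empty buffer, the trailing write needs no separate emptiness check.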


 Comments   
Comment by Steve Jerman [ 11/Nov/21 ]

That should be OK.

Comment by Jeffrey Yemin [ 11/Nov/21 ]

Hi steve@kloudspot.com,

Generally we don't backport bug fixes to previous minor releases, unless they are fixing security issues.

We are releasing 4.4.0 right now, so the best we can offer is that you upgrade to that release.

Comment by Steve Jerman [ 11/Nov/21 ]

Is it possible for this to be fixed in the 4.2.X branch? Also, any idea when it would make a release? Happy to test if a SNAPSHOT is available.

Comment by Jeffrey Yemin [ 11/Nov/21 ]

It looks like this regression was introduced by JAVA-1788, in the 4.0.0 release.

Comment by Githook User [ 11/Nov/21 ]

Author:

{'name': 'Ross Lawley', 'email': 'ross.lawley@gmail.com', 'username': 'rozza'}

Message: Fix id reporting bug with _ids containing arrays (#823)

JAVA-4403
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/2461aabdac1ad8c3d044585ec4a8342b8e2c687b

Comment by Steve Jerman [ 11/Nov/21 ]

OK... some more information.

I wrote a little test program...

    MongoClientSettings settings = MongoClientSettings.builder().build();
    MongoClient client = MongoClients.create(settings);
    MongoDatabase db = client.getDatabase("test-one");

    List<String> tt = new ArrayList<>();
    tt.add("one");
    tt.add("two");
    Document testDoc = new Document("_id", new Document("start", Instant.now())
            .append("finish", Instant.now().plus(1, ChronoUnit.DAYS))
            .append("tz", tt))
        .append("name", "testdoc");

    System.err.println(testDoc);
    List<InsertOneModel<Document>> documents = new ArrayList<>();
    documents.add(new InsertOneModel<>(testDoc));

    db.getCollection("testdocs").bulkWrite(documents);

It looks like the bulk write fails when the _id document has the array appended to it, which is valid as far as I know and used to work.
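For anyone who cannot move to 4.4.0 straight away, a pre-check could at least flag the documents likely to hit this failure before they reach a bulk write. The sketch below is hypothetical and not part of the driver: `IdShapeCheck` is an illustrative name, and it works on plain `Map` and `List` values (which `org.bson.Document` and parsed arrays satisfy) rather than calling any driver API. It assumes that an array nested anywhere under `_id` is relevant; the ticket itself only confirms the case shown in the test program.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical pre-check: reports whether a document's _id is, or nests, an array. */
final class IdShapeCheck {
    private IdShapeCheck() {
    }

    /** Returns true if the value is a list, or a map that contains a list at any depth. */
    static boolean containsArray(Object value) {
        if (value instanceof List) {
            return true;
        }
        if (value instanceof Map) {
            for (Object nested : ((Map<?, ?>) value).values()) {
                if (containsArray(nested)) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Checks only the _id field; arrays elsewhere in the document are ignored. */
    static boolean idContainsArray(Map<String, ?> document) {
        return containsArray(document.get("_id"));
    }
}
```

Applied to the test document above, `idContainsArray` would return true because of the `tz` list inside `_id`. Such documents could then be logged or routed to a different write path until the driver is upgraded.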

Generated at Thu Feb 08 09:01:59 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.