[SERVER-12645] bulk insert executability issues Created: 06/Feb/14  Updated: 10/May/22

Status: Backlog
Project: Core Server
Component/s: Shell
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Eric Milkie Assignee: DO NOT USE - Backlog - Platform Team
Resolution: Unresolved Votes: 0
Labels: move-sa, platforms-re-triaged
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-12576 Human readable Bulk state and SingleW... Closed
Related
related to SERVER-13430 Bulk API should prevent additional op... Backlog
Operating System: ALL
Steps To Reproduce:

> db.adminCommand({configureFailPoint: "checkForInterruptFail", mode: "alwaysOn", data: {conn: 1, chance: .7, allowNested: true}})
{ "ok" : 1 }
// (or anything similar that causes an error while the bulk insert is executing)
 
> var bulk = db.coll.initializeOrderedBulkOp();
> for (var i=0; i<1000000; i++) { bulk.insert({i:i}); }
> bulk.execute()
> bulk.execute()
(etc)
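
To reset the connection after reproducing, the failpoint can be turned back off (standard configureFailPoint usage; this step is not part of the original report):

> db.adminCommand({configureFailPoint: "checkForInterruptFail", mode: "off"})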

Participants:

Description

The Bulk API in the shell has been written such that once a valid response has been delivered from .execute(), the bulk cannot be re-executed; you get this error:

> bulk.execute()
2014-02-06T13:50:58.157-0500 batch cannot be re-executed at src/mongo/shell/bulk_api.js:815

(It should say "bulk" instead of "batch" in this message)

However, if an error (such as a user killOp) prevents the bulk execute from delivering a final report, you ARE allowed to re-execute the bulk. This will almost never work, since _id values have (apparently) already been assigned by the shell before .execute():

> bulk.execute()
2014-02-06T13:44:31.035-0500 batch failed, cannot aggregate results: operation was interrupted at src/mongo/shell/bulk_api.js:612
> bulk.execute()
BulkWriteResult({
	"writeErrors" : [
		{
			"index" : 0,
			"code" : 11000,
			"errmsg" : "insertDocument :: caused by :: 11000 E11000 duplicate key error index: test.coll.$_id_  dup key: { : ObjectId('52f3d7e1d9f831437fd3fc84') }",
			"op" : {
				"_id" : ObjectId("52f3d7e1d9f831437fd3fc84"),
				"i" : 0
			}
		}
	],
	"writeConcernErrors" : [ ],
	"nInserted" : 999,
	"nUpserted" : 0,
	"nUpdated" : 0,
	"nModified" : 0,
	"nRemoved" : 0,
	"upserted" : [ ]
})

Note how the second attempt at running the bulk insert fails with a unique index constraint violation on the _id index, which could only happen if the shell were re-inserting the same documents with the same _ids as before.
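
As a hedged workaround sketch (not part of the original report): rather than re-executing the same bulk object after an interrupted run, rebuild it, so that documents without an explicit _id get fresh ones. Any documents already inserted by the interrupted attempt would then be inserted a second time under new _ids, so this only makes sense after cleaning up the partial inserts. The runBulkInsert helper name is illustrative only.

// Illustrative helper (hypothetical name): build and execute a fresh bulk op per attempt.
function runBulkInsert(coll, n) {
    var bulk = coll.initializeOrderedBulkOp();
    for (var i = 0; i < n; i++) {
        bulk.insert({i: i});   // no _id supplied, so a new one is assigned on each attempt
    }
    return bulk.execute();
}

// Retry by rebuilding, never by calling execute() twice on the same bulk object:
try {
    runBulkInsert(db.coll, 1000000);
} catch (e) {
    print("bulk execute failed: " + e);
    // clean up the partial inserts (e.g. db.coll.remove({})) before retrying with
    // another runBulkInsert(db.coll, 1000000) call
}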



Comments
Comment by Steven Vannelli [ 10/May/22 ]

Moving this ticket to the Backlog and removing the "Backlog" fixVersion as per our latest policy for using fixVersions.

Comment by Daniel Pasette (Inactive) [ 10/Feb/14 ]

scotthernandez, in my limited testing it is preventing re-execution of the bulk after errors as well as after successful ops. However, it incorrectly reports nBatches and nInsertOps as 2 instead of 1 afterwards:

> db.coll.insert({_id:1})
> var bulk = db.coll.initializeOrderedBulkOp();
> bulk.insert({_id:1})
> bulk
{ "nInsertOps" : 1, "nUpdateOps" : 0, "nRemoveOps" : 0, "nBatches" : 1 }
> bulk.execute()
BulkWriteResult({
	"writeErrors" : [
		{
			"index" : 0,
			"code" : 11000,
			"errmsg" : "insertDocument :: caused by :: 11000 E11000 duplicate key error index: test.coll.$_id_  dup key: { : 1.0 }",
			"op" : {
				"_id" : 1
			}
		}
	],
	"writeConcernErrors" : [ ],
	"nInserted" : 0,
	"nUpserted" : 0,
	"nUpdated" : 0,
	"nModified" : 0,
	"nRemoved" : 0,
	"upserted" : [ ]
})
> bulk
{ "nInsertOps" : 2, "nUpdateOps" : 0, "nRemoveOps" : 0, "nBatches" : 2 }

Comment by Daniel Pasette (Inactive) [ 10/Feb/14 ]

Fixed with SERVER-12576.

Comment by Scott Hernandez (Inactive) [ 06/Feb/14 ]

If this were anything other than inserts (with _ids), you would not want to allow re-execution, and even in this case it is highly questionable. This should be fixed to disallow calling execute more than once, independent of any response.
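
For illustration, a minimal sketch of the kind of guard this implies (an assumption about how it might look, not the actual bulk_api.js code): execute() would set a flag before any batch is sent, so a later call is rejected whether or not the first attempt produced a final report.

// Hypothetical guard inside the shell's Bulk object; the variable name is illustrative.
var executed = false;

this.execute = function(writeConcern) {
    if (executed)
        throw Error("bulk cannot be re-executed");
    executed = true;  // set before sending anything, so even an interrupted run
                      // blocks a second execute()
    // ... build and send the batches, then aggregate results as today ...
};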
