[SERVER-3319] insert bulk doesn't fail on duplicate, but also doesn't change the data Created: 23/Jun/11  Updated: 12/Jul/16  Resolved: 19/Sep/11

Status: Closed
Project: Core Server
Component/s: Concurrency, Sharding
Affects Version/s: 1.8.2, 2.0.0
Fix Version/s: 2.1.0

Type: Bug Priority: Critical - P2
Reporter: ofer fort Assignee: Spencer Brody (Inactive)
Resolution: Done Votes: 1
Labels: concurrency, mongos, sharding
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

CentOS 5.5, 64GB RAM, HW RAID10 w/BBU, XFS for mongo data folder (mounted with nobarrier,noatime)


Attachments: Text File MongoTester.java    
Operating System: Linux
Participants:

 Description   

Sometimes running a bulk insert right after an upsert (with the $set operator) results in the document containing only the data from the upsert and not the data from the insert, and the insert doesn't throw a duplicate key exception.
I wasn't able to reproduce this on a Windows environment, only on a Linux shard.
The collection is sharded; the cluster has 3 shards, each with a replica set. The connection goes through mongos.
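
The attached MongoTester.java is not reproduced in this export, so the following is only a minimal sketch of the sequence described above, written against the current MongoDB Java sync driver. The host, database, collection, field names, and the exact checks are assumptions; the printed messages mirror the ones mentioned later in the comments.

{code:java}
import com.mongodb.MongoBulkWriteException;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.UpdateOptions;
import com.mongodb.client.model.Updates;
import org.bson.Document;

import java.util.Arrays;

public class BulkAfterUpsertRepro {
    public static void main(String[] args) {
        // Connect through mongos; host, database and collection names are placeholders.
        try (MongoClient client = MongoClients.create("mongodb://mongos-host:27017")) {
            MongoCollection<Document> coll = client.getDatabase("test").getCollection("items");
            coll.deleteMany(new Document()); // clear out previous runs

            // Step 1: upsert with $set, creating the document with only the upserted field.
            coll.updateOne(Filters.eq("_id", 1L),
                           Updates.set("fromUpsert", true),
                           new UpdateOptions().upsert(true));

            // Step 2: bulk insert containing the same _id. The expected outcome is a
            // duplicate key error; the reported bug is that no error is raised and the
            // document keeps only the upserted field.
            boolean gotDuplicateError = false;
            try {
                coll.insertMany(Arrays.asList(
                        new Document("_id", 1L).append("fromInsert", true),
                        new Document("_id", 2L).append("fromInsert", true)));
            } catch (MongoBulkWriteException e) {
                gotDuplicateError = true;
                System.out.println("insert failed due to duplicate!!!"); // expected behaviour
            }

            // Check whether the properties from the bulk insert made it into the document.
            Document stored = coll.find(Filters.eq("_id", 1L)).first();
            if (!gotDuplicateError && (stored == null || !stored.containsKey("fromInsert"))) {
                System.out.println("OUR PROPERTIES ARE NOT THERE!!!"); // bug: no error, data silently missing
            }
        }
    }
}
{code}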



 Comments   
Comment by auto [ 19/Sep/11 ]

Author: Spencer T Brody (stbrody) <spencer@10gen.com>

Message: Fix bug where bulk inserts on sharded collection were sent as a series of single inserts. SERVER-3319.
Branch: master
https://github.com/mongodb/mongo/commit/c85b73a1b0ebdf115f0818bf22f352f92a2c6121

Comment by Spencer Brody (Inactive) [ 14/Sep/11 ]

There may be a very slight performance improvement, but it will be very small (probably unnoticeable in most cases) because the writes were asynchronous anyway.

Comment by ofer fort [ 13/Sep/11 ]

Great, I hope it will also improve performance?

Comment by Spencer Brody (Inactive) [ 13/Sep/11 ]

The problem is that when doing a multi-insert with sharding, the mongos sends it as a series of single inserts to each shard. The fix should be to group documents based on their destination shard, then send multi-inserts to each shard.
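
The actual fix lives inside mongos (C++, see the commit linked above); purely to illustrate the grouping idea, here is a small Java sketch. The Router interface is a hypothetical stand-in for the chunk-manager lookup that maps a document's shard key to a shard name; the point is that the batch is bucketed by destination shard so each shard receives one multi-insert instead of a stream of single inserts, bounding the number of messages by the number of shards touched rather than the number of documents.

{code:java}
import org.bson.Document;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupByShardSketch {

    // Hypothetical routing abstraction: stands in for the real chunk-manager lookup in mongos.
    interface Router {
        String shardFor(Document doc); // maps a document's shard key value to a shard name
    }

    // Instead of sending one single-document insert per document (the old behaviour),
    // bucket the batch by destination shard so one multi-insert can be sent per shard.
    static Map<String, List<Document>> groupByShard(List<Document> batch, Router router) {
        Map<String, List<Document>> buckets = new HashMap<>();
        for (Document doc : batch) {
            buckets.computeIfAbsent(router.shardFor(doc), shard -> new ArrayList<>()).add(doc);
        }
        return buckets;
    }
}
{code}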

Comment by ofer fort [ 13/Sep/11 ]

Great! Let me know if I can assist with anything.

Comment by Spencer Brody (Inactive) [ 12/Sep/11 ]

I have successfully reproduced the problem. Tomorrow I will dig into trying to figure out the cause.

Comment by ofer fort [ 12/Sep/11 ]

Yes, you are right.
My shard key is _id, which is a sequential number I'm setting.

Comment by Spencer Brody (Inactive) [ 12/Sep/11 ]

Yes, of the environment where you got this attached test case to fail.

Just to confirm, with this test case, if I see "insert failed due to duplicate!!!" then things are working as intended, but if I see "OUR PROPERTIES ARE NOT THERE!!!" then I've reproduced the bug?

What is your shard key? _id?

Comment by ofer fort [ 12/Sep/11 ]

Of the test environment? It might take some time until I can set it up again.

Comment by Spencer Brody (Inactive) [ 12/Sep/11 ]

Can you attach the output of db.printShardingStatus()?

Comment by Erez Zarum [ 23/Jun/11 ]

I have managed to reproduce this error in a test environment.
I tried with 3 config servers (and after that with 1 config server), 3 shards, and one mongos.
I have also tried both with and without replica sets (per shard).
The only error I can see is this:

Thu Jun 23 14:15:59 [conn4] delete failed b/c of StaleConfigException, retrying left:4 ns: storage.object patt: { _id: { $lt: 15 } }

Using a non-sharded server (stand-alone mongod) it works as expected (I get "failed due to duplicate").
