[SERVER-5040] Cloner can fail to create unique indexes on initial sync
Created: 22/Feb/12  Updated: 11/Jul/16  Resolved: 18/May/12

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 2.0.6, 2.1.2

Type: Bug
Priority: Major - P3
Reporter: Kristina Chodorow (Inactive)
Assignee: Eric Milkie
Resolution: Done
Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-3160 replication initial sync should use t... Closed
Related
related to SERVER-5840 Error on failed index build during in... Closed
related to SERVER-5963 exception cloning object in replset7.js Closed
is related to SERVER-4174 Replicaset resync skips building of _... Closed
Operating System: ALL
Participants:

 Description   

For example, suppose the primary has a unique index on x and holds the document {_id:1, x:4}. While a clone is in progress, the primary deletes that document and inserts {_id:10000000, x:4}. The secondary clones {_id:1, x:4} before the delete happens, then picks up {_id:10000000, x:4} at the end of the clone. When it then tries to build the unique index on x, the build fails because the cloned data contains two docs with x:4.
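
The race can be illustrated with a small in-memory model. This is only a sketch, not MongoDB code: the dictionary standing in for the primary, the list standing in for the cloned data, and the helper names clone_with_concurrent_writes and find_unique_violations are all hypothetical. It assumes the cloner scans in _id order and that the delete and insert land between the two reads.

```python
# Hypothetical in-memory model of the race; none of this is MongoDB code.

def clone_with_concurrent_writes():
    """Return (primary, cloned) after a clone interleaved with a delete and an insert."""
    # The primary satisfies the unique constraint on x at every instant.
    primary = {1: {"_id": 1, "x": 4}}
    cloned = []

    # The cloner copies {_id: 1, x: 4} early in its scan.
    cloned.append(dict(primary[1]))

    # Meanwhile the primary deletes that document and inserts a new one
    # with the same x value under a much larger _id.
    del primary[1]
    primary[10000000] = {"_id": 10000000, "x": 4}

    # The cloner reaches the end of the collection and copies the new document too.
    cloned.append(dict(primary[10000000]))
    return primary, cloned


def find_unique_violations(docs, field):
    """Return pairs of _id values whose documents share the same value for field."""
    first_seen = {}
    violations = []
    for doc in docs:
        value = doc[field]
        if value in first_seen:
            violations.append((first_seen[value], doc["_id"]))
        else:
            first_seen[value] = doc["_id"]
    return violations


primary, cloned = clone_with_concurrent_writes()
print(find_unique_violations(primary.values(), "x"))
print(find_unique_violations(cloned, "x"))
```

In this model the primary never holds two documents with x:4, yet the cloned copy does, which is why a unique index build over the cloned data would be rejected.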



 Comments   
Comment by David Mytton [ 10/Sep/12 ]

This bug is still present in 2.0.7.

Comment by Eric Milkie [ 07/Jun/12 ]

We've since fixed this issue in a cleaner way; see the linked ticket SERVER-5963.

Comment by Kevin Kwast [ 06/Jun/12 ]

This bug affects one of my replica sets where the cloner consistently brings over a duplicate. The set is big enough that it takes 10 hours to get through cloning and index creation until the unique index fails. Then I restart the new node in standalone and create the unique index manually.

If the additional cloning attempts hit the same problem, it sounds like a new node will take 30 hours to fail instead of 10, and might not be left in a state where I can even create the unique index manually?

Comment by auto [ 18/May/12 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-5040 retry initial sync if errors occur when creating indexes

If you clone a database and a document, due to an update, moves forward in memory, cloner might clone both the old and new document.
When this happens, creating a unique index might fail. This change restarts the clone when this happens, and will abort after 3 failed cloning attempts.
Branch: v2.0
https://github.com/mongodb/mongo/commit/6b0869166832d3c8f540e451e4b6bf21c1d876f8
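
The retry behavior described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: initial_sync, clone_once, build_indexes, and IndexBuildError are hypothetical names, and the only details taken from the message are that the clone restarts when index creation fails and that the sync aborts after 3 failed cloning attempts.

```python
# Hypothetical sketch of "restart the clone on index errors, abort after 3 attempts".
# The function and exception names are illustrative, not taken from the server source.

MAX_CLONE_ATTEMPTS = 3  # limit stated in the commit message


class IndexBuildError(Exception):
    """Raised by build_indexes when an index (e.g. a unique index) cannot be built."""


def initial_sync(clone_once, build_indexes, max_attempts=MAX_CLONE_ATTEMPTS):
    """Clone, then build indexes; restart the whole clone if index creation fails.

    Returns the number of the attempt that succeeded, or raises RuntimeError
    once max_attempts cloning attempts have failed.
    """
    for attempt in range(1, max_attempts + 1):
        docs = clone_once()          # fresh copy of the data on every attempt
        try:
            build_indexes(docs)      # may fail if the clone picked up duplicates
        except IndexBuildError:
            continue                 # discard this attempt and clone again
        return attempt
    raise RuntimeError(
        f"initial sync aborted after {max_attempts} failed cloning attempts"
    )
```

A caller would pass in its own clone and index-build routines. Note that each retry repeats the full clone, so when every attempt hits a duplicate the total time before the abort grows with the attempt limit.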

Comment by auto [ 18/May/12 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-5040 retry initial sync if errors occur when creating indexes

If you clone a database and a document, due to an update, moves forward in memory, cloner might clone both the old and new document.
When this happens, creating a unique index might fail. This change restarts the clone when this happens, and will abort after 3 failed cloning attempts.
Branch: master
https://github.com/mongodb/mongo/commit/fdc9fead639e96af96a0ed33d67576043fa7bb6c

Comment by auto [ 18/May/12 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-5040 better test, use replsets instead of master/slave
Branch: master
https://github.com/mongodb/mongo/commit/161960373175779914c093be70fa06dde3b60909

Comment by auto [ 16/May/12 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-5040 test
Branch: master
https://github.com/mongodb/mongo/commit/76632d922ccd3be2fb4374ff5fdf1d8803be0c07

Comment by Scott Hernandez (Inactive) [ 15/May/12 ]

Linked a new issue about not swallowing the exception: SERVER-5840

Comment by Andy Schwerin [ 14/May/12 ]

In 2.0, we should at least abort the replica when this happens, since the result is that no replica gets created.

Generated at Thu Feb 08 03:07:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.