[SERVER-10478] Very large documents can cause premature migration commit Created: 09/Aug/13  Updated: 11/Jul/16  Resolved: 12/Aug/13

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 2.2.6, 2.4.6, 2.5.2

Type: Bug
Priority: Blocker - P1
Reporter: Greg Studer
Assignee: Greg Studer
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to SERVER-10458 Sanity check on "from" side that all ... Closed
Operating System: ALL
Participants:

 Description   
Issue Status as of October 23rd, 2013

ISSUE SUMMARY
During a chunk migration, if any document in the chunk is between 16,776,185 and 16,777,216 bytes in size (inclusive), some documents in that chunk may be lost during the migration.

USER IMPACT
Documents which are not migrated from the chunk are lost and need to be reinserted into the collection.

MongoDB v2.2 maintains a backup of every document involved in a chunk migration in a moveChunk directory (http://docs.mongodb.org/manual/faq/sharding/). This directory can be examined programmatically to find migrated documents whose size falls in the affected range.
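
As a rough illustration of such a scan, the sketch below walks a directory and reports documents whose size falls in the affected range. It relies only on the BSON framing (each document starts with a 4-byte little-endian total length) and assumes the backups are files of concatenated BSON documents ending in ".bson". The function names and the example path are hypothetical and not part of MongoDB.

```python
import os
import struct

# Affected size range from the issue summary (bytes, inclusive).
AFFECTED_MIN = 16_776_185
AFFECTED_MAX = 16_777_216


def iter_bson_doc_sizes(path):
    """Yield (offset, size) for each document in a file of concatenated BSON documents."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            header = f.read(4)
            if len(header) < 4:
                break  # end of file, or a truncated trailing header
            (size,) = struct.unpack("<i", header)
            if size < 5:  # smallest valid BSON document is 5 bytes
                raise ValueError(f"invalid BSON length {size} at offset {offset} in {path}")
            yield offset, size
            offset += size
            f.seek(offset)  # skip the document body; only the length is needed


def find_affected_docs(move_chunk_dir):
    """Return (path, offset, size) for every backed-up document in the affected range."""
    hits = []
    for root, _dirs, files in os.walk(move_chunk_dir):
        for name in sorted(files):
            if not name.endswith(".bson"):  # assumption: backups use this extension
                continue
            path = os.path.join(root, name)
            for offset, size in iter_bson_doc_sizes(path):
                if AFFECTED_MIN <= size <= AFFECTED_MAX:
                    hits.append((path, offset, size))
    return hits


# Hypothetical usage (the path is a placeholder for your own dbpath):
#   for path, offset, size in find_affected_docs("/data/db/moveChunk"):
#       print(path, offset, size)
```

A hit only shows that a document of the affected size was part of a migration; whether any documents were actually lost still has to be checked against the collection.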

In MongoDB v2.4, this backup is off by default.

SOLUTION
mongod must ensure that it always sends at least one document in each batch until all batches for that chunk are done.
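
The invariant can be illustrated with a small sketch. This is not the server's C++ code: it is a simplified Python model, with hypothetical names, of a batcher that groups documents under a byte limit and never emits an empty batch, so that a document too large for the limit still travels in a batch of its own.

```python
def build_batches(doc_sizes, max_batch_bytes):
    """Group document sizes into batches, never emitting an empty batch."""
    batches = []
    current, current_bytes = [], 0
    for size in doc_sizes:
        # Close the current batch only if it already holds a document.
        # An oversized document therefore still gets a batch of its own
        # instead of producing an empty one.
        if current and current_bytes + size > max_batch_bytes:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

The key design choice is the `current and ...` guard: the size limit is checked only after at least one document is already in the batch.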

WORKAROUNDS
If your cluster contains very large documents, disable the balancer until you have upgraded. See: http://docs.mongodb.org/manual/tutorial/manage-sharded-cluster-balancer/
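
For clusters managed from a driver rather than the mongo shell, one way to do this on these server versions is to set the "stopped" flag on the balancer document in the config database's settings collection. The sketch below assumes a PyMongo-style collection object that exposes update_one; the function name and host name are hypothetical, and driver and server version compatibility is not checked here.

```python
def disable_balancer(settings):
    """Set the balancer's 'stopped' flag in the config.settings collection."""
    return settings.update_one(
        {"_id": "balancer"},          # the balancer settings document
        {"$set": {"stopped": True}},  # ask the balancer not to start new rounds
        upsert=True,                  # create the document if it does not exist yet
    )


# Hypothetical usage against a mongos (host name is a placeholder):
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://mongos.example.net:27017")
#   disable_balancer(client["config"]["settings"])
```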

If document loss is suspected, locate the moveChunk directory on the replica that was the master (primary) of the donor shard at the time of the migration. The lost documents can be reinserted from that backup or from your own regular backups.

PATCHES
MongoDB v2.2.6 and v2.4.6 will address this problem. Downloads for the release candidates will be available at http://www.mongodb.org/downloads within 24 hours.



 Comments   
Comment by Dmitry Kireev [ 20/Aug/13 ]

Thank you.
Please let us know as soon as the stable version comes out.

For now we'll disable balancer.

Thank you.

Comment by auto [ 12/Aug/13 ]

Author: Sam Kleinman (tychoish) <samk@10gen.com>

Message: SERVER-10478: adding moveParanoia option
Branch: master
https://github.com/mongodb/docs/commit/a7da07164cfc3c7c31b204c0466e28c92b41e86f

Comment by auto [ 12/Aug/13 ]

Author: Greg Studer (gregstuder) <greg@10gen.com>

Message: SERVER-10478 fix batch limit check for _cloneLocs in migration
Branch: v2.0
https://github.com/mongodb/mongo/commit/e5a7faaecdfa0b153db493ad15d624dc90c986b9

Comment by auto [ 12/Aug/13 ]

Author: Greg Studer (gregstuder) <greg@10gen.com>

Message: SERVER-10478 fix batch limit check for _cloneLocs in migration
Branch: v2.2
https://github.com/mongodb/mongo/commit/a8e94e832027d44a78b441f9efdc8352ce56834a

Comment by auto [ 10/Aug/13 ]

Author: Greg Studer (gregstuder) <greg@10gen.com>

Message: SERVER-10478 fix batch limit check for _cloneLocs in migration
Branch: v2.4
https://github.com/mongodb/mongo/commit/484fc234656308135234cfca7c184f8f8520c497

Comment by auto [ 09/Aug/13 ]

Author: Greg Studer (gregstuder) <greg@10gen.com>

Message: SERVER-10478 fix batch limit check for _cloneLocs in migration
Branch: master
https://github.com/mongodb/mongo/commit/f03d58d18bd379e35e0b17c2a5676aaf5dd7fb03
