[SERVER-23678] Cached minor shard version does not match that of the persistent config metadata after moveChunk Created: 13/Apr/16  Updated: 19/Jul/16  Resolved: 29/Jun/16

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.0.11, 3.2.4, 3.3.4
Fix Version/s: 3.3.10

Type: Bug Priority: Minor - P4
Reporter: Kaloian Manassiev Assignee: Dianna Hohensee (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File cached_shard_metadata_matches_config.js    
Issue Links:
Depends
depends on SERVER-22659 Implement commitChunkMigration comman... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 15 (06/03/16), Sharding 16 (06/24/16), Sharding 17 (07/15/16)
Participants:

 Description   

After the moveChunk command is run on a shard, the shard first bumps the major version of the chunk it donated to a value that's greater than that of all chunks in the collection. When the migration completes, it bumps both the major and minor versions of one other chunk in order to indicate to incoming requests that it no longer owns the donated chunk.
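The two bumps described above can be sketched in plain JavaScript. This is only an illustrative model, not the server code: the `{major, minor}` objects, the `bumpForMigration` helper, and the chunk and shard names are all hypothetical. It also assumes that the donated chunk ends up with minor version 0 and the remaining chunk with minor version 1 under the same new major version.

```javascript
// Illustrative model only; names and data layout are hypothetical.
// A chunk version is modelled as { major, minor }.
function bumpForMigration(chunks, donatedName, donorShard, recipientShard) {
  // Highest major version currently present in the collection.
  const maxMajor = Math.max(...chunks.map((c) => c.version.major));
  const newMajor = maxMajor + 1;

  // Step 1: the donated chunk moves to the recipient and gets a major
  // version greater than that of every other chunk.
  const donated = chunks.find((c) => c.name === donatedName);
  donated.shard = recipientShard;
  donated.version = { major: newMajor, minor: 0 };

  // Step 2: one other chunk still on the donor (if any) is bumped too, so
  // the donor's own shard version changes. Assumed here: minor becomes 1.
  const control = chunks.find((c) => c.shard === donorShard);
  if (control) {
    control.version = { major: newMajor, minor: 1 };
  }
  return { donated, control: control || null };
}

const chunks = [
  { name: "chunkA", shard: "shard0", version: { major: 1, minor: 0 } },
  { name: "chunkB", shard: "shard0", version: { major: 1, minor: 1 } },
];
const result = bumpForMigration(chunks, "chunkA", "shard0", "shard1");
console.log(JSON.stringify(result.control.version));
```

In this model the donor's highest version after the move is the bumped version of the remaining chunk, which is the value the shard's cache would be expected to show.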

Because of the way the metadata cache is maintained, even though we write the new chunk version to the metadata, it is not reflected in the shard's metadata cache. The problem is reproduced by the attached .js test, which runs split/move/mergeChunk commands and compares the cached metadata against what is written on the server. The test fails at the moveChunk step.
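The kind of comparison such a test performs can be sketched as below. This is not the attached test: `compareShardVersions` is a hypothetical helper, the versions are plain objects rather than values read from a cluster, and the sample numbers are made up. It assumes the major value is held in `t` and the minor value in `i`, in line with the `newVersion.i` check proposed in the comments.

```javascript
// Hypothetical helper: reports how a cached shard version differs from the
// persisted one. Versions are plain { t, i } objects (t = major, i = minor).
function compareShardVersions(cached, persisted) {
  if (cached.t !== persisted.t) return "major mismatch";
  if (cached.i !== persisted.i) return "minor mismatch";
  return "match";
}

// Made-up values illustrating the symptom: same major, stale minor in cache.
const cachedVersion = { t: 2, i: 0 };
const persistedVersion = { t: 2, i: 1 };
console.log(compareShardVersions(cachedVersion, persistedVersion));
```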

This does not constitute an actual correctness problem, because minor shard version differences do not require retargeting, but it is confusing when inspecting the cached metadata state, so we should fix it.
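As a rough illustration of why the stale minor version is harmless, the retargeting decision can be modelled as a check on the major component alone. This is a deliberately simplified sketch: `needsRetargeting` and the `{major, minor}` layout are hypothetical, and the real server check considers more than this.

```javascript
// Hypothetical, simplified rule: only a difference in the major component
// forces the router to retarget; a minor-only difference is tolerated.
function needsRetargeting(routerVersion, shardVersion) {
  return routerVersion.major !== shardVersion.major;
}

// Minor-only difference, as in this ticket: no retargeting.
console.log(needsRetargeting({ major: 2, minor: 0 }, { major: 2, minor: 1 }));
// Major difference: retargeting required.
console.log(needsRetargeting({ major: 1, minor: 1 }, { major: 2, minor: 1 }));
```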

The bug is reproducible in 3.0, 3.2 and 3.3.



 Comments   
Comment by Githook User [ 27/Jun/16 ]

Author: Dianna Hohensee (DiannaHohensee) <dianna.hohensee@10gen.com>

Message: SERVER-23678 adding check that minor version updates correctly after moveChunk with remaining chunk(s)
Branch: master
https://github.com/mongodb/mongo/commit/3f090b41e87f9c3a2978d6ce42a01e7ee06053c9

Comment by Kaloian Manassiev [ 08/Jun/16 ]

Yes, this sounds like a more appropriate solution instead of introducing a new test.

Comment by Dianna Hohensee (Inactive) [ 07/Jun/16 ]

kaloian.manassiev

I'm thinking it might be sufficient to add

assert.eq(1, newVersion.i, "The minor value in the shard version should be 1");

right here: https://github.com/mongodb/mongo/blob/master/jstests/sharding/migration_failure.js#L68 and not add a whole new JS test. I modified migration_failure.js a while ago to pull out and check the shardVersion when I was fixing something for CommitChunkMigration.

Comment by Dianna Hohensee (Inactive) [ 27/May/16 ]

Could also add an update to migration_failure.js to check the minor version after a successful migration with a control chunk.

Comment by Dianna Hohensee (Inactive) [ 26/May/16 ]

SERVER-22659 will fix this in master as a result of moving new version generation to the config. Will put in the JS test after SERVER-22659 gets the CommitChunkMigration command in.

3.2 and 3.0 branches will need separate fixes.
