Type: Task
Resolution: Incomplete
Priority: Major - P3
Affects Version/s: None
Component/s: None
Catalog and Routing
CAR Team 2024-03-04, CAR Team 2024-03-18, CAR Team 2024-04-01, CAR Team 2024-04-15
After SERVER-65666, resharding shouldn't add draining shards as recipients, but in my run on 5.0.24 resharding added the draining shard as a recipient.
Steps to reproduce:
- Two-shard cluster in Atlas: shard0 and shard1. I can share the logs if you'd like; let me know.
- Shard1 is the primary shard of testDB
- test1TBCollection has an equal distribution of chunks on both shards
- Remove shard using Atlas UI
- Confirm shard1 is not visible in Atlas cluster builder i.e. it is draining
- Confirm that chunks are being moved to shard0 using sh.status()
- Run resharding
db.adminCommand({ reshardCollection: "testDB.test1TBCollection", key: {_id: 1}})
- Monitor resharding to confirm recipients and donors
Monitoring output lists both shards as donor and recipient:
Atlas [mongos] testDB> db.getSiblingDB("admin").aggregate([ { $currentOp: { allUsers: true, localOps: false } }, { $match: { type: "op", "originatingCommand.reshardCollection": "testDB.test1TBCollection" } } ])
[
  {
    shard: 'atlas-10zagv-shard-0',
    type: 'op',
    desc: 'ReshardingRecipientService bc0e3d6c-7f52-4b5e-9ab3-447444980604',
    op: 'command',
    ns: 'testDB.test1TBCollection',
    originatingCommand: { reshardCollection: 'testDB.test1TBCollection', key: { _id: 1 }, unique: false, collation: { locale: 'simple' } },
    totalOperationTimeElapsedSecs: Long('19'),
    remainingOperationTimeEstimatedSecs: Long('2'),
    approxDocumentsToCopy: Long('56283'),
    documentsCopied: Long('100000'),
    approxBytesToCopy: Long('65119332'),
    bytesCopied: Long('115699281'),
    totalCopyTimeElapsedSecs: Long('18'),
    oplogEntriesFetched: Long('38'),
    oplogEntriesApplied: Long('0'),
    totalApplyTimeElapsedSecs: Long('0'),
    recipientState: 'cloning',
    opStatus: 'running',
    oplogApplierApplyBatchLatencyMillis: { '(-inf, 10)': { count: Long('0') }, '[10, 100)': { count: Long('0') }, '[100, 1000)': { count: Long('0') }, '[1000, 10000)': { count: Long('0') }, '[10000, inf)': { count: Long('0') }, totalCount: Long('0') },
    collClonerFillBatchForInsertLatencyMillis: { '(-inf, 10)': { count: Long('1115') }, '[10, 100)': { count: Long('3') }, '[100, 1000)': { count: Long('7') }, '[1000, 10000)': { count: Long('0') }, '[10000, inf)': { count: Long('0') }, totalCount: Long('1125') }
  },
  {
    shard: 'atlas-10zagv-shard-0',
    type: 'op',
    desc: 'ReshardingDonorService bc0e3d6c-7f52-4b5e-9ab3-447444980604',
    op: 'command',
    ns: 'testDB.test1TBCollection',
    originatingCommand: { reshardCollection: 'testDB.test1TBCollection', key: { _id: 1 }, unique: false, collation: { locale: 'simple' } },
    totalOperationTimeElapsedSecs: Long('19'),
    countWritesDuringCriticalSection: Long('0'),
    totalCriticalSectionTimeElapsedSecs: Long('0'),
    donorState: 'donating-initial-data',
    opStatus: 'running'
  },
  {
    shard: 'atlas-10zagv-shard-1',
    type: 'op',
    desc: 'ReshardingRecipientService bc0e3d6c-7f52-4b5e-9ab3-447444980604',
    op: 'command',
    ns: 'testDB.test1TBCollection',
    originatingCommand: { reshardCollection: 'testDB.test1TBCollection', key: { _id: 1 }, unique: false, collation: { locale: 'simple' } },
    totalOperationTimeElapsedSecs: Long('18'),
    remainingOperationTimeEstimatedSecs: Long('-1'),
    approxDocumentsToCopy: Long('56283'),
    documentsCopied: Long('0'),
    approxBytesToCopy: Long('65119332'),
    bytesCopied: Long('0'),
    totalCopyTimeElapsedSecs: Long('18'),
    oplogEntriesFetched: Long('38'),
    oplogEntriesApplied: Long('0'),
    totalApplyTimeElapsedSecs: Long('0'),
    recipientState: 'cloning',
    opStatus: 'running',
    oplogApplierApplyBatchLatencyMillis: { '(-inf, 10)': { count: Long('0') }, '[10, 100)': { count: Long('0') }, '[100, 1000)': { count: Long('0') }, '[1000, 10000)': { count: Long('0') }, '[10000, inf)': { count: Long('0') }, totalCount: Long('0') },
    collClonerFillBatchForInsertLatencyMillis: { '(-inf, 10)': { count: Long('0') }, '[10, 100)': { count: Long('0') }, '[100, 1000)': { count: Long('1') }, '[1000, 10000)': { count: Long('0') }, '[10000, inf)': { count: Long('0') }, totalCount: Long('1') }
  },
  {
    shard: 'atlas-10zagv-shard-1',
    type: 'op',
    desc: 'ReshardingDonorService bc0e3d6c-7f52-4b5e-9ab3-447444980604',
    op: 'command',
    ns: 'testDB.test1TBCollection',
    originatingCommand: { reshardCollection: 'testDB.test1TBCollection', key: { _id: 1 }, unique: false, collation: { locale: 'simple' } },
    totalOperationTimeElapsedSecs: Long('18'),
    countWritesDuringCriticalSection: Long('0'),
    totalCriticalSectionTimeElapsedSecs: Long('0'),
    donorState: 'donating-initial-data',
    opStatus: 'running'
  }
]
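To make the check in the last step less manual, here is a rough sketch that cross-references draining shards against the shards running a resharding recipient. `drainingRecipients` is a hypothetical helper, not something mongosh or the server provides. It assumes that draining shards carry `draining: true` in `config.shards` and that the `shard` field in `$currentOp` output matches the shard `_id`. The sample data is trimmed from the output above, and the draining flag on shard-1 is assumed from the repro steps rather than copied from a real `config.shards` document.

```javascript
// Hypothetical helper (not part of mongosh or the server).
// shardDocs:  documents shaped like config.shards entries; a draining shard
//             is assumed to carry draining: true.
// currentOps: documents shaped like the $currentOp output in this ticket.
function drainingRecipients(shardDocs, currentOps) {
  // Collect the ids of shards that are marked as draining.
  const draining = new Set(
    shardDocs.filter((s) => s.draining === true).map((s) => s._id)
  );

  // Collect the shards that run a ReshardingRecipientService op.
  const recipients = new Set(
    currentOps
      .filter(
        (op) =>
          typeof op.desc === "string" &&
          op.desc.startsWith("ReshardingRecipientService")
      )
      .map((op) => op.shard)
  );

  // A non-empty result means a draining shard was picked as a recipient.
  return [...recipients].filter((shard) => draining.has(shard)).sort();
}

// Sample input trimmed from the monitoring output; draining flag is assumed.
const shardDocs = [
  { _id: "atlas-10zagv-shard-0" },
  { _id: "atlas-10zagv-shard-1", draining: true },
];
const uuid = "bc0e3d6c-7f52-4b5e-9ab3-447444980604";
const currentOps = [
  { shard: "atlas-10zagv-shard-0", desc: `ReshardingRecipientService ${uuid}` },
  { shard: "atlas-10zagv-shard-0", desc: `ReshardingDonorService ${uuid}` },
  { shard: "atlas-10zagv-shard-1", desc: `ReshardingRecipientService ${uuid}` },
  { shard: "atlas-10zagv-shard-1", desc: `ReshardingDonorService ${uuid}` },
];

console.log(drainingRecipients(shardDocs, currentOps));
```

Against a live cluster, the two inputs could come from `db.getSiblingDB("config").shards.find().toArray()` and from the `$currentOp` aggregation above with `.toArray()` appended. An empty array would be the expected result if draining shards are excluded from the recipient set.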
is related to: SERVER-65666 Do not create chunks on draining shards when sharding a new collection (Closed)