[SERVER-55519] Investigate missing _id index during collection cloning in tenant migration Created: 25/Mar/21  Updated: 05/May/21  Resolved: 05/May/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Lingzhi Deng Assignee: Lingzhi Deng
Resolution: Duplicate Votes: 0
Labels: pm-1791_non-cloud-blocking, pm-1791_other_required, post-rc0
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-52713 [testing] Add stepdown/kill/terminate... Closed
Duplicate
duplicates SERVER-56002 listIndexes can read partial state fr... Closed
Related
related to SERVER-55169 Timeseries bucket collections is expe... Closed
related to SERVER-56002 listIndexes can read partial state fr... Closed
Operating System: ALL
Sprint: Repl 2021-04-19, Repl 2021-05-03, Repl 2021-05-17
Participants:
Linked BF Score: 130

 Comments   
Comment by Lingzhi Deng [ 28/Apr/21 ]

Looks like this is a dup of SERVER-56002

Comment by Lingzhi Deng [ 19/Apr/21 ]

Based on this log:

// Recipient decided to clone tenantMigrationTenantId_test.jstests_or3 with uuid 09549bd6-e57a-47e6-b302-07a4355011a7
[j2:rs1:n0] | 2021-04-18T02:09:13.150+00:00 D1 TENANT_M 21069   [TenantMigrationRecipientService-0] "Cloner running stage","attr":{"cloner":"TenantDatabaseCloner","stage":"listCollections"}
[j2:rs1:n0] | 2021-04-18T02:09:13.151+00:00 D2 TENANT_M 4881603 [TenantMigrationRecipientService-0] "Allowing cloning of collectionInfo","attr":{"info":{"name":"jstests_or3","type":"collection","options":{},"info":{"readOnly":false,"uuid":{"$uuid":"09549bd6-e57a-47e6-b302-07a4355011a7"}},"idIndex":{"v":2,"key":{"_id":1},"name":"_id_"}},"db":"tenantMigrationTenantId_test","tenantId":"tenantMigrationTenantId"}
[j2:rs1:n0] | 2021-04-18T02:09:13.151+00:00 D1 TENANT_M 21072   [TenantMigrationRecipientService-0] "Cloner finished running stage","attr":{"cloner":"TenantDatabaseCloner","stage":"listCollections"}
[j2:rs1:n0] | 2021-04-18T02:09:13.152+00:00 D1 TENANT_M 21069   [TenantMigrationRecipientService-0] "Cloner running stage","attr":{"cloner":"TenantCollectionCloner","stage":"count"}
[j2:rs1:n0] | 2021-04-18T02:09:13.155+00:00 D1 TENANT_M 21072   [TenantMigrationRecipientService-0] "Cloner finished running stage","attr":{"cloner":"TenantCollectionCloner","stage":"count"}
[j2:rs1:n0] | 2021-04-18T02:09:13.155+00:00 D1 TENANT_M 21069   [TenantMigrationRecipientService-0] "Cloner running stage","attr":{"cloner":"TenantCollectionCloner","stage":"checkIfDonorCollectionIsEmpty"}
 
// Donor dropped the same collection
[j2:rs0:n1] | 2021-04-18T02:09:13.155+00:00 I  COMMAND  518070  [conn51] "CMD: drop","attr":{"namespace":"tenantMigrationTenantId_test.jstests_or3"}
[j2:rs0:n1] | 2021-04-18T02:09:13.155+00:00 I  STORAGE  20314   [conn51] "dropCollection: storage engine will take ownership of drop-pending collection","attr":{"namespace":"tenantMigrationTenantId_test.jstests_or3","uuid":{"uuid":{"$uuid":"09549bd6-e57a-47e6-b302-07a4355011a7"}},"dropOpTime":{"ts":{"$timestamp":{"t":0,"i":0}},"t":-1},"commitTimestamp":{"$timestamp":{"t":0,"i":0}}}
[j2:rs0:n1] | 2021-04-18T02:09:13.155+00:00 I  STORAGE  20318   [conn51] "Finishing collection drop","attr":{"namespace":"tenantMigrationTenantId_test.jstests_or3","uuid":{"uuid":{"$uuid":"09549bd6-e57a-47e6-b302-07a4355011a7"}}}
[j2:rs0:n1] | 2021-04-18T02:09:13.156+00:00 I  STORAGE  22206   [conn51] "Deferring table drop for index","attr":{"index":"_id_","namespace":"tenantMigrationTenantId_test.jstests_or3","uuid":{"uuid":{"$uuid":"09549bd6-e57a-47e6-b302-07a4355011a7"}},"ident":"index-56-2496369000462074825","commitTimestamp":{"$timestamp":{"t":1618711753,"i":9}}}
[j2:rs0:n1] | 2021-04-18T02:09:13.156+00:00 I  STORAGE  22214   [conn51] "Deferring table drop for collection","attr":{"namespace":"tenantMigrationTenantId_test.jstests_or3","ident":"collection-55-2496369000462074825","commitTimestamp":{"$timestamp":{"t":1618711753,"i":9}}}
 
// Recipient failed with "no _id index" because the collection didn't exist on the donor after the drop above
[j2:rs1:n0] | 2021-04-18T02:09:13.165+00:00 W  TENANT_M 4884503 [TenantMigrationRecipientService-0] "No indexes found for collection while cloning","attr":{"namespace":"tenantMigrationTenantId_test.jstests_or3","uuid":{"uuid":{"$uuid":"09549bd6-e57a-47e6-b302-07a4355011a7"}},"source":"localhost:20501","tenantId":"tenantMigrationTenantId"}
[j2:rs1:n0] | 2021-04-18T02:09:13.165+00:00 I  TENANT_M 21077   [TenantMigrationRecipientService-0] "Non-retryable error occurred during cloner stage","attr":{"cloner":"TenantCollectionCloner","stage":"listIndexes","error":{"code":20,"codeName":"IllegalOperation","errmsg":"Found empty '_id' index spec but the collection is not specified with 'autoIndexId' as false, tenantId: tenantMigrationTenantId, namespace: tenantMigrationTenantId_test.jstests_or3"}}
[j2:rs1:n0] | 2021-04-18T02:09:13.165+00:00 E  TENANT_M 4881601 [TenantMigrationRecipientService-0] "Tenant collection clone failed","attr":{"namespace":"tenantMigrationTenantId_test.jstests_or3","error":"IllegalOperation: Found empty '_id' index spec but the collection is not specified with 'autoIndexId' as false, tenantId: tenantMigrationTenantId, namespace: tenantMigrationTenantId_test.jstests_or3","tenantId":"tenantMigrationTenantId"}
[j2:rs1:n0] | 2021-04-18T02:09:13.165+00:00 W  TENANT_M 4881501 [TenantMigrationRecipientService-0] "Tenant migration database clone failed","attr":{"dbName":"tenantMigrationTenantId_test","dbNumber":1,"totalDbs":1,"error":"IllegalOperation: IllegalOperation: Error cloning collection 'tenantMigrationTenantId_test.jstests_or3' :: caused by :: Found empty '_id' index spec but the collection is not specified with 'autoIndexId' as false, tenantId: tenantMigrationTenantId, namespace: tenantMigrationTenantId_test.jstests_or3","tenantId":"tenantMigrationTenantId"}
[j2:rs1:n0] | 2021-04-18T02:09:13.165+00:00 I  TENANT_M 4878501 [TenantMigrationRecipientService-0] "Tenant migration recipient instance: Data sync completed.","attr":{"tenantId":"tenantMigrationTenantId","migrationId":{"uuid":{"$uuid":"35c915f7-f9be-44fc-9f14-6ced79d97481"}},"error":{"code":20,"codeName":"IllegalOperation","errmsg":"IllegalOperation: Error cloning collection 'tenantMigrationTenantId_test.jstests_or3' :: caused by :: Found empty '_id' index spec but the collection is not specified with 'autoIndexId' as false, tenantId: tenantMigrationTenantId, namespace: tenantMigrationTenantId_test.jstests_or3"}}
...
    raise errors.ServerFailure("Tenant migration '" + str(migration_opts.migration_id) +
buildscripts.resmokelib.errors.ServerFailure: Tenant migration '35c915f7-f9be-44fc-9f14-6ced79d97481' with donor replica set 'rs0' aborted due to an error: {'state': 'aborted', 'abortReason': {'code': 20, 'codeName': 'IllegalOperation', 'errmsg': "Tenant migration recipient command failed :: caused by :: IllegalOperation: Error cloning collection 'tenantMigrationTenantId_test.jstests_or3' :: caused by :: Found empty '_id' index spec but the collection is not specified with 'autoIndexId' as false, tenantId: tenantMigrationTenantId, namespace: tenantMigrationTenantId_test.jstests_or3"}, 'ok': 1.0, '$clusterTime': {'clusterTime': Timestamp(1618711753, 18), 'signature': {'hash': b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', 'keyId': 0}}, 'operationTime': Timestamp(1618711753, 18)}

And I think this is because of this logic. I think for tenant migration, we need to differentiate empty index spec and NamespaceNotFound. The tenant collection cloner should skip cloning this collection on NamespaceNotFound.

EDIT: Ah actually, I read it wrong. Since we run getIndexSpecs using UUID, it shouldn't be a problem because "NamespaceNotFound" is an error for UUID and we only return an empty list for NamespaceString.

Generated at Thu Feb 08 05:36:42 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.