[SERVER-66749] Improve tenant migration donor and recipient currentOp output Created: 25/May/22  Updated: 29/Oct/23  Resolved: 16/Feb/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.0.0-rc0

Type: Task Priority: Major - P3
Reporter: Esha Maharishi (Inactive) Assignee: Christopher Caplinger
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-73444 Tenant Migration should report meanin... Closed
Problem/Incident
Backwards Compatibility: Fully Compatible
Sprint: Server Serverless 2023-02-20
Participants:
Linked BF Score: 55

 Description   

Here is example currentOp output on the donor:

    {
      desc: 'tenant donor migration',
      migrationCompleted: false,
      instanceID: UUID("c71ed9d7-f8b8-4f99-a51e-f387c2c76d51"),
      tenantId: 'mtm',
      recipientConnectionString: 'mongodb://u:p@127.0.0.1:28000/?replicaSet=proxytest2&ssl=true',
      readPreference: { mode: 'primaryPreferred' },
      receivedCancellation: false,
      lastDurableState: 4,
      migrationStart: ISODate("2022-05-25T13:12:28.560Z"),
      startMigrationDonorTimestamp: Timestamp({ t: 1653484348, i: 25 }),
      blockTimestamp: Timestamp({ t: 1653485495, i: 1 }),
      commitOrAbortOpTime: { ts: Timestamp({ t: 1653486224, i: 1 }), t: Long("1") }
    }

and recipient:

    {
      desc: 'tenant recipient migration',
      instanceID: UUID("c71ed9d7-f8b8-4f99-a51e-f387c2c76d51"),
      tenantId: 'mtm',
      donorConnectionString: 'proxytest/host1.local.10gen.cc:27000,host2.local.10gen.cc:27010,host3.local.10gen.cc:27020',
      readPreference: { mode: 'primaryPreferred' },
      state: 2,
      dataSyncCompleted: false,
      migrationCompleted: false,
      numRestartsDueToDonorConnectionFailure: Long("0"),
      numRestartsDueToRecipientFailure: Long("0"),
      approxTotalDataSize: Long("36846854"),
      approxTotalBytesCopied: Long("36851562"),
      totalReceiveElapsedMillis: Long("1707019"),
      remainingReceiveEstimatedMillis: Long("-218"),
      databases: {
        databasesClonedBeforeFailover: 0,
        databasesToClone: 1,
        databasesCloned: 1,
        approxTotalDataSize: 36846854,
        approxTotalBytesCopied: 36851562,
        mtm_test: {
          clonedCollectionsBeforeFailover: 0,
          collections: 1,
          clonedCollections: 1,
          start: ISODate("2022-05-25T13:14:39.657Z"),
          end: ISODate("2022-05-25T13:15:02.294Z"),
          elapsedMillis: 22637,
          'mtm_test.foo': {
            documentsToCopy: 1674857,
            documentsCopied: 1675071,
            indexes: 1,
            insertedBatches: 3,
            start: ISODate("2022-05-25T13:14:39.659Z"),
            end: ISODate("2022-05-25T13:15:02.294Z"),
            elapsedMillis: 22635,
            receivedBatches: 4
          }
        }
      },
      startFetchingDonorOpTime: { ts: Timestamp({ t: 1653484348, i: 32 }), t: Long("1") },
      startApplyingDonorOpTime: { ts: Timestamp({ t: 1653484349, i: 2 }), t: Long("1") },
      dataConsistentStopDonorOpTime: { ts: Timestamp({ t: 1653484488, i: 13 }), t: Long("1") },
      cloneFinishedRecipientOpTime: { ts: Timestamp({ t: 1653484502, i: 22028 }), t: Long("1") },
      donorSyncSource: 'host1.local.10gen.cc:27000',
      receiveStart: ISODate("2022-05-25T13:12:28.780Z"),
      numOpsApplied: 75000
    }

We should at least update the state fields (lastDurableState on the donor, state on the recipient) to contain a descriptive string rather than numeric values like "4" and "2". In particular, when the donor is waiting for donorForgetMigration, it'd be helpful for its state to say something like "Waiting for donorForgetMigration command".
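
As a rough illustration of the request above, here is a minimal client-side sketch that swaps a numeric state code for a descriptive string. The helper name describeState and the lookup table are hypothetical; in particular, the pairing of code 4 with the "Waiting for donorForgetMigration command" label is an assumption made for the example, not the server's actual enum mapping.

```javascript
// Hypothetical lookup table. The numeric key and its pairing with this label
// are assumptions for illustration, not the server's real enum values.
const HYPOTHETICAL_DONOR_LABELS = {
  4: "Waiting for donorForgetMigration command",
};

// Returns a copy of a currentOp entry with the numeric code in `field`
// replaced by a descriptive string. Unknown codes stay visible in the output
// rather than being dropped.
function describeState(op, field, labels) {
  const code = op[field];
  const label = Object.prototype.hasOwnProperty.call(labels, code)
    ? labels[code]
    : `unknown (${code})`;
  return { ...op, [field]: label };
}

// Usage with a trimmed-down version of the donor output shown earlier.
const donorOp = { desc: "tenant donor migration", lastDurableState: 4 };
const labelledDonorOp = describeState(
  donorOp,
  "lastDurableState",
  HYPOTHETICAL_DONOR_LABELS
);
console.log(labelledDonorOp.lastDurableState);
```

The sketch leaves the input document untouched and returns a new object, so the raw numeric value is still available to any caller that needs it.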

It would also be nice to rename documentsToCopy and documentsCopied so that it's not surprising if documentsCopied is larger. Perhaps documentsToCopyAtStartOfClone and documentsCopiedIncludingOplogCatchup, if those are the right meanings?
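
To make the overshoot concrete, the sketch below takes the two counters from the 'mtm_test.foo' entry in the recipient output and reports how far documentsCopied exceeds documentsToCopy, along with a progress fraction capped at 1. The helper name cloneProgress is hypothetical, and the reading that the target is a point-in-time estimate is the ticket's own unconfirmed guess.

```javascript
// Summarises clone progress from the per-collection counters.
// Assumption: documentsToCopy is an estimate taken when the clone starts,
// so documentsCopied may legitimately end up larger.
function cloneProgress(stats) {
  const { documentsToCopy, documentsCopied } = stats;
  // Documents copied beyond the initial estimate (never negative).
  const surplus = Math.max(0, documentsCopied - documentsToCopy);
  // Cap at 1 so a consumer never displays more than 100% progress.
  const fraction =
    documentsToCopy > 0 ? Math.min(1, documentsCopied / documentsToCopy) : 1;
  return { surplus, fraction };
}

// Counters copied from the recipient output in the description.
const fooStats = { documentsToCopy: 1674857, documentsCopied: 1675071 };
const fooProgress = cloneProgress(fooStats);
console.log(fooProgress); // { surplus: 214, fraction: 1 }
```

The same pattern shows up at the top level of the sample, where approxTotalBytesCopied exceeds approxTotalDataSize and remainingReceiveEstimatedMillis is negative.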

We could also change the recipient's "dataSyncCompleted" field to "migrationCompleted". It's confusing that today dataSyncCompleted remains false even after the donor has made the commit decision, and only gets set to true when the recipient receives recipientForgetMigration.



 Comments   
Comment by Githook User [ 16/Feb/23 ]

Author:

{'name': 'Christopher Caplinger', 'email': 'christopher.caplinger@mongodb.com', 'username': 'UnicodeSnowman'}

Message: SERVER-66749: Improve tenant migration currentOp output
Branch: master
https://github.com/mongodb/mongo/commit/39d5379993cebb26882d82e42efcf96d04cdcbeb

Comment by Githook User [ 16/Nov/22 ]

Author:

{'name': 'Sviatlana Zuiko', 'email': 'sviatlana.zuiko@mongodb.com', 'username': 'szuiko'}

Message: Revert "SERVER-66749: Improve tenant migration currentOp output"

This reverts commit d608948720384e57ece7eb27513fcb338d24a34e.
Branch: master
https://github.com/mongodb/mongo/commit/4d7a3e3d48b24a81e11741415100e4cfabce8692

Comment by Githook User [ 16/Nov/22 ]

Author:

{'name': 'Christopher Caplinger', 'email': 'christopher.caplinger@mongodb.com', 'username': 'UnicodeSnowman'}

Message: SERVER-66749: Improve tenant migration currentOp output
Branch: master
https://github.com/mongodb/mongo/commit/d608948720384e57ece7eb27513fcb338d24a34e

Generated at Thu Feb 08 06:06:19 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.