[SERVER-61105] The sessions migration logic during moveChunk pollutes the logs Created: 29/Oct/21  Updated: 29/Oct/23  Resolved: 06/Jan/22

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 5.3.0, 5.0.6

Type: Bug Priority: Major - P3
Reporter: Kaloian Manassiev Assignee: Matt Boros
Resolution: Fixed Votes: 0
Labels: sharding-nyc-subteam1
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Problem/Incident
is caused by SERVER-50871 Add logging to SessionCatalogMigratio... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.2, v5.0
Sprint: Sharding 2021-12-27, Sharding 2022-01-10
Participants:
Story Points: 2

 Description   

The sessions migration logic during moveChunk logs every time it finishes draining the active sessions with retryable writes on them.

This means that on a system, which doesn't have retryable writes for example or where the catch-up of the actual chunk data finishes slightly earlier than the sessions being drained (such as for example with multi-writes), it will log every few milliseconds a log line like this:

{"t":{"$date":"2021-10-14T20:03:06.094+00:00"},"s":"I",  "c":"SHARDING", "id":5087101, "ctx":"sessionCatalogMigrationProducer-delete_during_migrate-rs1_delete_during_migrate-rs0_61688cdce5032e7b7eef3557","msg":"Recipient finished draining oplog entries for retryable writes and transactions from donor for the first time, before receiving _recvChunkCommit","attr":{"namespace":"test.foo","migrationSessionId":"delete_during_migrate-rs1_delete_during_migrate-rs0_61688cdce5032e7b7eef3557","fromShard":"delete_during_migrate-rs1"}}

This is wasteful both from logging perspective and from the fact that it actively polls the donor shard with useless calls.

It would be good if some form of long-polling is implemented so that the donor is not queried actively if there are no sessions, but even if that is not implemented, at least the logging should happen only on transitions to "finished draining" rather than every time.



 Comments   
Comment by Githook User [ 06/Jan/22 ]

Author:

{'name': 'Matt Boros', 'email': 'matt.boros@mongodb.com'}

Message: SERVER-61105 Fixed log spam in moveChunk session migration

(cherry picked from commit 8e681e80447e5859f79296e10911ffec39f31efa)
Branch: v5.2
https://github.com/mongodb/mongo/commit/6223bd89b676b0ea8241547b19e31efbcffa2c8d

Comment by Githook User [ 06/Jan/22 ]

Author:

{'name': 'Matt Boros', 'email': 'matt.boros@mongodb.com'}

Message: SERVER-61105 Fixed log spam in moveChunk session migration

(cherry picked from commit 8e681e80447e5859f79296e10911ffec39f31efa)
Branch: v5.0
https://github.com/mongodb/mongo/commit/660480ddf90b1d4b61bab72dd307ff869ce59e2d

Comment by Githook User [ 06/Jan/22 ]

Author:

{'name': 'Matt Boros', 'email': 'matt.boros@mongodb.com'}

Message: SERVER-61105 Fixed log spam in moveChunk session migration
Branch: master
https://github.com/mongodb/mongo/commit/8e681e80447e5859f79296e10911ffec39f31efa

Comment by Max Hirschhorn [ 22/Nov/21 ]

Acceptance criteria:

  • In the absence of retryable writes and transactions, we shouldn't see any log spam from the sessionCatalogMigrationProducer thread.
  • Not going to attempt to make _getNextSessionMods a "tailable awaitData" command and will instead make this ticket only about not spamming the log message when there aren't new retryable writes to migrate config.transactions info for.
Generated at Thu Feb 08 05:51:32 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.