- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 5.0.0, 6.0.0, 7.0.0, 7.3.0, 8.0.0-rc2
- Component/s: Sharding
- Cluster Scalability
- Fully Compatible
- ALL
- v8.0, v7.3, v7.0, v6.0, v5.0
- (copied to CRM)
ISSUE DESCRIPTION AND IMPACT
Retryable writes occurring during resharding may be applied more than once if a specific chunk migration follows resharding’s commit. This may manifest as inconsistencies from the perspective of a client application.
This issue can only manifest if ALL of the following are true:
- A retryable write is performed while a reshardCollection operation is in progress.
- A chunk containing documents affected by the retryable write is migrated to a shard which was not the owner of those documents under the original shard key.
  - Note that this means a cluster must contain 3 or more shards to be affected.
- The retryable write must actually be retried (for example, due to a network error) after resharding and the subsequent chunk migration commit.
If the issue occurs and a retryable write is performed again, the application-facing impact could take one of the following forms:
Sequence of Operations | Expected Outcome | Actual Outcome
---|---|---
Retryable insert of {_id: 5}; retryable delete of {_id: 5}; a duplicate packet causes a retry of the insert | No document with _id: 5 is present | Document {_id: 5} is present
Document {_id: 5, counter: 0} is present; retryable update with query {_id: 5} and update {$inc: {counter: 1}}; a duplicate packet causes a retry of the update | {_id: 5, counter: 1} | {_id: 5, counter: 2}
Retryable delete of {_id: 5}; retryable insert of {_id: 5}; a duplicate packet causes a retry of the delete | Document {_id: 5} is present | No document with _id: 5 is present
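As a concrete illustration of the second row above, the following mongosh sketch uses a hypothetical app.orders collection; retryable writes are on by default in mongosh and modern drivers, so no special configuration is assumed.

```javascript
// Hypothetical names: database "app", collection "orders" was resharded.
const coll = db.getSiblingDB("app").orders;

coll.insertOne({ _id: 5, counter: 0 });

// This single logical update is sent once, but a transient network error causes
// the driver to retry it. On an affected cluster, if resharding committed and the
// chunk owning _id: 5 then migrated to a shard that never owned it under the old
// shard key, the retry is re-executed instead of being recognized as a duplicate.
coll.updateOne({ _id: 5 }, { $inc: { counter: 1 } });

coll.findOne({ _id: 5 });  // expected {_id: 5, counter: 1}; an affected cluster may show counter: 2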
DIAGNOSIS AND REMEDIATION
A script is available at https://github.com/mongodb/support-tools/tree/master/ca-118 that can be used to help rule out impact from this issue. The script inspects config.changelog to confirm that no chunk migrations occurred on a resharded namespace during the 30-minute period following the resharding operation. Note the following limitations of the script:
- The config.changelog is a capped collection and therefore only has a limited amount of history. The script is unable to provide any insight beyond the limit of the changelog’s history.
- The script can definitively determine that a cluster was not affected (within the history present in config.changelog; see the previous point).
- The script cannot definitively determine that a cluster was affected.
If the script reports that a cluster may have been affected or if the script is inconclusive, then you will need to inspect your data to ensure that it is consistent from the perspective of your application. The issue, if it occurs, will only affect retryable writes that operated on the resharded collection. With this in mind, pay specific attention to operations on the collection in question, particularly during the period immediately following resharding’s completion.
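For a rough manual spot-check along the same lines as the script (not a substitute for it), the changelog can be queried for migration commits on the resharded namespace; app.orders is an illustrative name, and the returned timestamps still have to be compared by hand against when the reshardCollection operation completed.

```javascript
// List chunk migration commits recorded for the resharded namespace so they can
// be compared against the time resharding finished. config.changelog is capped,
// so this only covers whatever history it still retains.
const changelog = db.getSiblingDB("config").changelog;

changelog
  .find({ what: "moveChunk.commit", ns: "app.orders" })
  .sort({ time: 1 })
  .forEach(e => printjson({ time: e.time, ns: e.ns, details: e.details }));
```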
If your application meets the criteria above, we recommend that you upgrade to one of the following versions:
Affected Versions | Recommended Upgrade Versions |
---|---|
5.0.0 - 5.0.30 | 5.0.31+ |
6.0.0 - 6.0.16 | 6.0.20+ |
7.0.0 - 7.0.12 | 7.0.16+ |
8.0.0 - 8.0.3 | 8.0.5+ |
WORKAROUNDS
On affected versions, disabling the balancer for a period of time following resharding will also prevent the issue from occurring. The balancer should remain disabled for a length of time sufficient to ensure retryable writes from during the resharding operation will no longer be retried. The recommended minimum duration to disable the balancer following a resharding operation is 30 minutes, matching the default timeout for logical sessions.
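A minimal mongosh sketch of this workaround, assuming an illustrative app.orders namespace and new shard key, might look like the following; the balancer is disabled before resharding so that no migration can start immediately after the commit.

```javascript
// Disable the balancer (waits for any in-progress balancing round to finish).
sh.stopBalancer();

// Illustrative resharding operation; namespace and key are placeholders.
db.adminCommand({ reshardCollection: "app.orders", key: { region: 1, _id: 1 } });

// Keep the balancer off for at least 30 minutes (the default logical session
// timeout) so in-flight retryable writes can no longer be retried.
sleep(30 * 60 * 1000);

sh.startBalancer();
sh.getBalancerState();   // should report true again
```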
Original description
Resharding preserves the full retryability history for any retryable writes which occur during the resharding operation. If a chunk migration follows the resharding, session migration should transfer the relevant write history over to the recipient of the chunk. Chunk migration decides whether an oplog entry is relevant by filtering on the namespace being migrated.
The problem is that when a resharding recipient updates its config.transactions table (based on the retryable writes/transactions performed on the donor shard), it creates a noop oplog entry with the namespace set to empty. If the resharding recipient then becomes the donor in a following chunk migration, then because of the empty namespace it will incorrectly conclude that this oplog entry isn't relevant to the chunk actively being migrated. As a result, the noop oplog entry for the already executed retryable write never gets migrated, and the retryable write could be executed again after the chunk migration commits.
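The faulty relevance check lives in the server's C++ chunk-migration code; the JavaScript fragment below is only an illustration of why a per-namespace filter drops the recipient-written noop entry, using a made-up shape for the session history.

```javascript
// Illustrative only: why a filter keyed on the migrating namespace misses the
// empty-namespace noop entry written by the resharding recipient.
const migratingNs = "app.orders";

const sessionHistory = [
  { ns: "app.orders", op: "u", note: "original retryable update" },
  { ns: "",           op: "n", note: "noop written by resharding recipient" }  // the problematic entry
];

// The empty-ns entry fails the check, so the record of the already executed
// retryable write never reaches the new chunk owner.
const transferred = sessionHistory.filter(e => e.ns === migratingNs);
printjson(transferred);   // the noop entry is missing
```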
Adding Max's repro for this issue (a sketch of steps 2 and 5 follows below):
1. Start a resharding operation.
2. Run a retryable $inc update during the resharding operation.
3. Let the resharding operation complete.
4. Run a chunk migration.
5. Retry the retryable write from step 2 and verify no new oplog entry was generated.
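Pausing resharding long enough for step 2 normally relies on test-only failpoints, so the sketch below covers only steps 2 and 5, in the style of the server's jstests: issue the update through an explicit session with a fixed txnNumber, then re-send the identical command to stand in for a driver retry. The session options and the app.orders namespace are illustrative.

```javascript
// Explicit session with driver-level retries off, so the retry is issued manually.
const session = db.getMongo().startSession({ retryWrites: false });
const appDb = session.getDatabase("app");

const updateCmd = {
  update: "orders",
  updates: [{ q: { _id: 5 }, u: { $inc: { counter: 1 } } }],
  txnNumber: NumberLong(1)
};

appDb.runCommand(updateCmd);   // step 2: runs while resharding is in progress

// ...resharding commits, a chunk migration moves the chunk owning _id: 5...

appDb.runCommand(updateCmd);   // step 5: should be recognized as a duplicate and
                               // generate no new oplog entry; on an affected
                               // cluster the $inc is applied a second time
```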
Issue links:
- is caused by: SERVER-49904 Update config.transactions entry for retryable writes during resharding's oplog application (Closed)
- is related to: SERVER-89452 Avoid adding empty namespaces to txnParticipant's affectedNamespaces (Closed)
- is related to: SERVER-55384 Move session application for resharding's oplog application into its own class (Closed)