-
Type: Improvement
-
Resolution: Unresolved
-
Priority: Major - P3
-
Affects Version/s: None
-
Component/s: None
-
Cluster Scalability
Resharding has two UUIDs that can be used to identify an operation:
- The resharding UUID, randomly generated for each logical execution of a resharding coordinator.
- The user-supplied resharding UUID, which a user can provide to support retryability of forceRedistribution: true resharding operations.
However, the code refers to these inconsistently. For example, _configsvrReshardCollection refers to the user-supplied UUID as "reshardingUUID", which leads to confusing-looking code wherever we need to consider both the internal and user-supplied UUIDs simultaneously.
We should ensure that the user-supplied UUID is consistently referred to as "user supplied", or otherwise make this distinction obvious everywhere.
The motivation for this ticket is that we are seeing some data quality issues in the resharding complete log ingestion, and we are investigating whether we accidentally report the user-supplied UUID (which is not guaranteed to be unique across different logical resharding operations) instead of the internal UUID (which ought to be unique across different logical resharding operations) in some places. As the code is written today, it is not obvious whether this is handled correctly.
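
One way to make the distinction hard to get wrong is to give each UUID its own type, so that code reporting a completed operation cannot silently accept the user-supplied value in place of the internal one. The following is a minimal sketch of that idea in Python, not the server's actual implementation: the class names, the build_completion_log helper, and the record field names are all hypothetical and chosen only for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InternalReshardingUUID:
    """Hypothetical wrapper: randomly generated per logical resharding operation."""
    value: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass(frozen=True)
class UserSuppliedReshardingUUID:
    """Hypothetical wrapper: provided by the user; may repeat across operations."""
    value: uuid.UUID


def build_completion_log(internal_id, user_id=None):
    """Build a completion record keyed by the internal UUID only (hypothetical helper)."""
    if not isinstance(internal_id, InternalReshardingUUID):
        raise TypeError("completion logs must be keyed by the internal resharding UUID")
    if user_id is not None and not isinstance(user_id, UserSuppliedReshardingUUID):
        raise TypeError("user_id must be a UserSuppliedReshardingUUID")

    # Field names are illustrative assumptions, not the real log schema.
    record = {"reshardingUUID": str(internal_id.value)}
    if user_id is not None:
        record["userSuppliedReshardingUUID"] = str(user_id.value)
    return record
```

With this shape, two logical operations that reuse the same user-supplied UUID still produce records with distinct internal UUIDs, and passing a user-supplied UUID where the internal one is expected raises a TypeError instead of yielding a mislabeled record.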