[SERVER-25763] addShard upserts discovered databases without locking, this could change a database's primary out from under it Created: 23/Aug/16 Updated: 06/Dec/22 Resolved: 19/Dec/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Spencer Brody (Inactive) | Assignee: | [DO NOT USE] Backlog - Sharding Team |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | PM-1017 | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Assigned Teams: | Sharding |
| Operating System: | ALL |
| Participants: |
| Description |
|
During addShard, we find any databases that already exist on the shard being added and attempt to add them to the cluster. Before doing so, we check that none of the databases on the shard already exists in the cluster. No lock is held across that check and the subsequent write, however, so a database with the same name could be created in the cluster after the check and assigned to a different shard. If that happens, the upsert of the database document for the database discovered on the new shard can overwrite the existing entry and change the 'primary' shard for that database out from under it, leading to data loss. |
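
The check-then-upsert window described above can be illustrated with a small, single-process Python sketch. This is not the server's code: the dict standing in for the cluster's database catalog, the `add_shard` function, the `interleave` hook, and the `insert_only` flag are all hypothetical names chosen for illustration. The `insert_only` variant shows only one possible way such a conflict could be detected at write time; the ticket itself was closed without a fix.

```python
class ConflictError(Exception):
    """Raised when a database on the new shard already exists in the catalog."""


def add_shard(catalog, shard, shard_dbs, interleave=None, insert_only=False):
    """Toy model of addShard's database discovery (hypothetical, not server code).

    catalog     -- dict mapping database name -> {"primary": shard name}
    shard_dbs   -- database names found on the shard being added
    interleave  -- optional callback run between the check and the write,
                   simulating a concurrent database creation
    insert_only -- if True, re-check at write time instead of blindly upserting
    """
    # Step 1: reject the shard if any of its databases is already known.
    conflicts = [db for db in shard_dbs if db in catalog]
    if conflicts:
        raise ConflictError(f"databases already exist: {conflicts}")

    # Unprotected window: nothing stops another operation from creating
    # one of these databases here.
    if interleave is not None:
        interleave(catalog)

    # Step 2: write the database documents.
    if insert_only:
        # Assumed mitigation: refuse to overwrite an entry that appeared
        # after the check.
        late = [db for db in shard_dbs if db in catalog]
        if late:
            raise ConflictError(f"databases created concurrently: {late}")
    for db in shard_dbs:
        # Plain assignment models an upsert: it replaces any existing entry.
        catalog[db] = {"primary": shard}


def create_app_on_shard_a(catalog):
    """Simulates a concurrent createDatabase that picks a different primary."""
    catalog["app"] = {"primary": "shardA"}


if __name__ == "__main__":
    racy = {}
    add_shard(racy, "shardB", ["app"], interleave=create_app_on_shard_a)
    print(racy["app"]["primary"])
```

In the racy call, the database is first created with `shardA` as its primary inside the window, and the upsert then silently reassigns it to `shardB`, which is the situation the description warns about.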
| Comments |
| Comment by Sheeri Cabral (Inactive) [ 19/Dec/19 ] |
|
This is not pervasive enough for a fix. |