-
Type: Bug
-
Resolution: Fixed
-
Priority: Major - P3
-
Affects Version/s: 3.6.23, 4.0.28, 4.2.24, 7.1.0-rc0, 6.0.6, 4.4.22, 5.0.18, 7.0.0-rc3
-
Component/s: None
-
None
-
Fully Compatible
-
ALL
-
v7.0, v6.0, v5.0
-
Sharding EMEA 2023-06-26, Sharding EMEA 2023-07-10, Sharding EMEA 2023-07-24, Sharding EMEA 2023-08-07, Sharding EMEA 2023-08-21, Sharding EMEA 2023-09-04
-
152
In SERVER-30797, a majority write was added to the refresh path on primaries after fetching new routing information from the config server. This write ensured that the node that fetched the routing information was actually the majority primary, preventing incorrect filtering information from being applied in split-brain scenarios.
This write was removed in SERVER-35092 because it was believed to be unnecessary and was causing stalls when a refresh happened without a majority of nodes available.
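As an illustration only (a client-side sketch, not the server-internal no-op write; the connection URI, database, and collection names are assumptions), a w:"majority" write with a wtimeout shows both sides of the trade-off: only a primary that can replicate the write to a majority of the set gets it acknowledged, and when a majority is unreachable the write times out instead, which is the stall behaviour that motivated removing the barrier.

```cpp
#include <bsoncxx/builder/basic/array.hpp>
#include <bsoncxx/builder/basic/document.hpp>
#include <bsoncxx/builder/basic/kvp.hpp>
#include <bsoncxx/json.hpp>
#include <mongocxx/client.hpp>
#include <mongocxx/instance.hpp>
#include <mongocxx/uri.hpp>

#include <iostream>

using bsoncxx::builder::basic::kvp;
using bsoncxx::builder::basic::make_array;
using bsoncxx::builder::basic::make_document;

int main() {
    mongocxx::instance inst{};
    mongocxx::client client{mongocxx::uri{"mongodb://localhost:27017/?replicaSet=rs0"}};

    // A plain insert used as a stand-in for the server's internal no-op write:
    // w:"majority" means acknowledgement requires majority replication, and
    // wtimeout bounds how long we wait for that majority.
    auto cmd = make_document(
        kvp("insert", "majorityWriteDemo"),
        kvp("documents", make_array(make_document(kvp("note", "refresh barrier demo")))),
        kvp("writeConcern", make_document(kvp("w", "majority"), kvp("wtimeout", 5000))));

    auto reply = client["test"].run_command(cmd.view());
    if (reply.view()["writeConcernError"]) {
        // Majority unreachable: the write is not majority-committed, so a node
        // in this state must not start using freshly fetched routing metadata.
        std::cout << "majority not reached: " << bsoncxx::to_json(reply.view()) << "\n";
    } else {
        std::cout << "majority write committed: " << bsoncxx::to_json(reply.view()) << "\n";
    }
    return 0;
}
```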
However, the split-brain scenario for which the majority write was added is still a problem, and since the removal of that write it is possible to hit it again. The scenario is as follows:
- Suppose we have a 2-shard cluster with 3 nodes per shard, where chunk (min, 0) is on shard 0 and chunk (0, max) is on shard 1, with one document in each chunk
- A network partition separates the primary of shard 0 from its secondaries, and one of those secondaries steps up (creating a split-brain scenario)
- Chunk (0, max) is then moved to shard 0
- A mongos that hasn't learned about the new primary on shard 0 routes a majority read to the old primary
- The old primary (which still believes itself to be primary) fetches the new routing information from the config server
In this case, the old primary will respond to the majority read using the newest filtering information but without ever having seen the chunk migration.
This can also affect secondaries that refresh via the node that believes itself to be primary, causing their filtering information to get ahead of the data they hold.
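The effect can be seen in a toy model (plain C++, not MongoDB code; the RoutingTable and ShardNode types and the numeric chunk bounds are invented for illustration): the partitioned old primary installs the config server's newest routing table, so it claims ownership of chunk (0, max) even though the migrated document never reached it.

```cpp
#include <iostream>
#include <map>
#include <set>
#include <string>

// Routing table: chunk lower bound -> owning shard, plus a version counter.
struct RoutingTable {
    int version = 1;
    std::map<int, std::string> chunkOwner;  // key: chunk lower bound
};

// A shard node: the documents it physically holds plus the routing table it
// filters with.
struct ShardNode {
    std::string name;
    std::set<int> documents;     // document keys physically present
    RoutingTable filteringInfo;  // what the node believes it owns
};

int main() {
    // Initial placement: chunk [min, 0) on shard0, chunk [0, max) on shard1.
    // -1000 stands in for "min"; shard1 initially holds document 7 in [0, max).
    RoutingTable config;
    config.chunkOwner = {{-1000, "shard0"}, {0, "shard1"}};

    ShardNode oldPrimary{"shard0-old-primary", {-5}, config};  // doc -5 in [min, 0)
    ShardNode newPrimary{"shard0-new-primary", {-5}, config};

    // Network partition: oldPrimary is isolated, newPrimary steps up.
    // Chunk [0, max) migrates from shard1 to shard0 (the new primary).
    newPrimary.documents.insert(7);   // the migration copies shard1's document
    config.version = 2;
    config.chunkOwner[0] = "shard0";  // the config server commits the new owner

    // The isolated old primary still thinks it is primary. It refreshes its
    // filtering metadata from the config server. Without a majority no-op
    // write barrier, nothing stops it from installing and using this table.
    oldPrimary.filteringInfo = config;

    // A stale mongos routes a read for key 7 to the old primary. The filtering
    // metadata says "shard0 owns [0, max)", so the read is not rejected with
    // StaleConfig, but the document was never migrated to this node.
    bool ownsKey = oldPrimary.filteringInfo.chunkOwner.at(0) == "shard0";
    bool hasDoc = oldPrimary.documents.count(7) > 0;
    std::cout << std::boolalpha
              << "old primary claims ownership of key 7: " << ownsKey
              << ", actually holds the document: " << hasDoc << "\n";
    // Prints "true, false": the majority read silently misses a document.
    return 0;
}
```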
The solution is to add the majority no-op write back into the ShardServerCatalogCacheLoader (SSCCL). This ensures that when new filtering information is fetched, it can only be used, and sent to secondaries, by the actual primary of the replica set.
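A minimal sketch of that guard, assuming hypothetical ConfigServerClient and ReplicationCoordinator seams rather than the real SSCCL and replication interfaces: the freshly fetched filtering information is only returned for installation once a majority no-op write succeeds, so a partitioned old primary simply fails the refresh instead of serving metadata that is ahead of its data.

```cpp
#include <iostream>
#include <optional>
#include <string>

struct CollectionRoutingInfo { int version = 0; };

// Hypothetical seams standing in for the config server fetch and the
// replication layer; the real MongoDB code is far more involved.
struct ConfigServerClient {
    CollectionRoutingInfo fetchRoutingInfo(const std::string&) { return {2}; }
};

struct ReplicationCoordinator {
    bool canReachMajority = true;
    // Models a no-op oplog write waited on with w:"majority". A partitioned
    // old primary cannot satisfy it.
    bool performMajorityNoopWrite(const std::string&) { return canReachMajority; }
};

// The guard: new filtering information is only installed (and only forwarded
// to secondaries) after a majority no-op write proves this node is still the
// majority primary.
std::optional<CollectionRoutingInfo> refreshFilteringInfo(ConfigServerClient& config,
                                                          ReplicationCoordinator& repl,
                                                          const std::string& nss) {
    CollectionRoutingInfo fresh = config.fetchRoutingInfo(nss);
    if (!repl.performMajorityNoopWrite("routing refresh of " + nss)) {
        return std::nullopt;  // not the real primary: do not use the new metadata
    }
    return fresh;
}

int main() {
    ConfigServerClient config;
    ReplicationCoordinator realPrimary{true};
    ReplicationCoordinator partitionedOldPrimary{false};

    std::cout << std::boolalpha
              << "real primary installs refresh: "
              << refreshFilteringInfo(config, realPrimary, "test.coll").has_value() << "\n"
              << "partitioned old primary installs refresh: "
              << refreshFilteringInfo(config, partitionedOldPrimary, "test.coll").has_value()
              << "\n";
    return 0;
}
```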
- causes
-
SERVER-80712 Avoid leaving the replica set shard partitioned at the end of `linearizable_read_concern.js`
- Closed
-
SERVER-84623 Shard-local re-execution of a command might bubble up a misleading StaleConfig exception to the router
- Closed
- depends on
-
SERVER-78505 Database cache does not use the 'allowLocks' option correctly
- Closed
-
SERVER-80183 Remove operationTime check from store_retryable_find_and_modify_images_in_side_collection.js
- Closed
- is caused by
-
SERVER-35092 ShardServerCatalogCacheLoader should have a timeout waiting for read concern
- Closed
- is depended on by
-
SERVER-79609 Fix `findAndModify_upsert.js` test to accept StaleConfig error
- Closed
- related to
-
SERVER-30797 Shard primaries must commit a majority write before using updated chunk routing tables
- Closed
-
SERVER-79483 Investigate if tests should check operationTime being identical for retryable write responses
- Closed