[SERVER-50610] secondary_reads.js should not make assertions based on natural collection ordering Created: 28/Aug/20  Updated: 29/Oct/23  Resolved: 11/Sep/20

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.8.0

Type: Bug Priority: Major - P3
Reporter: Louis Williams Assignee: Louis Williams
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Problem/Incident
Related
is related to SERVER-53481 FSM threads race in secondary_reads_w... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Execution Team 2020-09-21
Participants:
Linked BF Score: 12

 Description   

The secondary_reads.js FSM test is an insert-only workload that reads from secondaries. It asserts that the documents read from a secondary contain no 'holes', or discontinuities. The documents are inserted in increasing order, with each value incremented by 1. The expectation is that a collection scan plus a sort of the documents reveals no gaps in the data. The problem is that non-snapshot reads do not guarantee this.

Specifically, when the test uses a readConcern: majority cursor, the read may use a timestamp T that falls in the middle of an already-completed oplog batch. This is problematic because documents are not inserted in order on the secondary, and majority reads effectively have read-committed isolation (as do all non-snapshot reads). The cursor may periodically yield and resume reading at a newer timestamp, T + N. This introduces the possibility of the cursor missing documents that were committed after the initial read timestamp T and before T + N. As a result, the cursor returns some documents visible at T and some documents visible at T + N, which is not a consistent view at either timestamp.

Here is an example:

  • 10 documents are inserted on a primary with _id 1 through 10 and corresponding timestamps 1-10.
  • When applied on the secondary, these inserts are split up into 2 groups, odds and evens, across 2 threads.
  • After the batch completes, due to the way the inserts interleaved, the collection looks like this:
    •  RID 1: _id: 2
    •  RID 2: _id: 4
    •  RID 3: _id: 6
    •  RID 4: _id: 8
    •  RID 5: _id: 10
    •  RID 6: _id: 1
    •  RID 7: _id: 3
    •  RID 8: _id: 5
    •  RID 9: _id: 7
    •  RID 10: _id: 9
  • A reader starts a collection scan at Timestamp 5. They see documents with _id 2, 4, 1, 3, then yield routinely. When they recover from the yield, they start reading at Timestamp 10, but pick up reading at RID 8. They then see documents with _id 5, 7, 9, and hit the end of the collection and return.
  • In this example, the query fails to return documents with _id 6, 8, and 10.
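
The example above can be replayed as a small simulation. This is only a sketch of the scenario, not server code: it assumes, as the example does, that each document's commit timestamp equals its _id, and the function name collection_scan and its parameters are hypothetical.

```python
# Record-store layout on the secondary after the batch, as (RID, _id) pairs.
# Assumption taken from the example: a document's commit timestamp equals its _id.
RECORDS = [(1, 2), (2, 4), (3, 6), (4, 8), (5, 10),
           (6, 1), (7, 3), (8, 5), (9, 7), (10, 9)]


def collection_scan(records, first_ts, yield_after, second_ts):
    """Scan in RID order, switching read timestamp after `yield_after` documents.

    Returns the _ids the cursor hands back, in the order it saw them.
    """
    seen = []
    read_ts = first_ts
    for _rid, doc_id in records:
        commit_ts = doc_id  # example assumption: timestamp == _id
        if commit_ts > read_ts:
            continue  # not yet visible at the current read timestamp
        seen.append(doc_id)
        if len(seen) == yield_after:
            # The cursor yields and resumes at the next RID with a newer timestamp.
            read_ts = second_ts
    return seen


seen = collection_scan(RECORDS, first_ts=5, yield_after=4, second_ts=10)
missing = sorted(set(range(1, 11)) - set(seen))
print("returned:", seen)
print("missing: ", missing)
```

The documents at RIDs 3 through 5 were skipped while the cursor was still reading at Timestamp 5, and the cursor never revisits them after the yield, so sorting the result leaves gaps.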

In general, this is also a problem for the readConcern 'local' and 'available' parts of the test, but because these reads always occur on batch boundaries (lastApplied), which advances more slowly than the majority commit point, they seem much less likely to observe the same problem.

The test should not be making assertions about the entire collection's data, since that requires snapshot read isolation. Instead, we should modify the test to have weaker assertions.



 Comments   
Comment by Githook User [ 11/Sep/20 ]

Author: Louis Williams (louiswilliams) <louis.williams@mongodb.com>

Message: SERVER-50610 secondary_reads.js should not make assertions based on natural collection ordering
Branch: master
https://github.com/mongodb/mongo/commit/7491889863ca960b6fa40a4aea99a049e96bfc85

Comment by Louis Williams [ 28/Aug/20 ]

After some discussion with daniel.gottlieb, I think we can fix the test by forcing the query to use the {x: 1} index by building it on the correct collection (it needs to be built on 'this.collName').

This test can make the same assertion if it scans using an index that is ordered (either _id or x). The main problem is that this workload's assertions depend on "natural" ordering that is not guaranteed on secondaries.
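
To illustrate why an ordered index scan keeps the assertion valid, here is a rough simulation under the same simplifying assumption as the example in the description (commit timestamp equals the indexed value). It is a sketch, not the actual fix: index_scan and has_holes are hypothetical names. Because the scan visits keys in increasing order, anything that becomes visible after a yield sorts after the cursor's current position.

```python
def index_scan(keys, first_ts, yield_after, second_ts):
    """Scan keys in index (ascending) order, switching timestamp after a yield.

    Assumption from the example: a document's commit timestamp equals its key.
    """
    seen = []
    read_ts = first_ts
    for key in sorted(keys):
        if key > read_ts:
            continue  # not yet visible at the current read timestamp
        seen.append(key)
        if len(seen) == yield_after:
            read_ts = second_ts  # resume after the yield at a newer timestamp
    return seen


def has_holes(values):
    """True if the sorted values skip any integer between their min and max."""
    ordered = sorted(values)
    return any(b - a != 1 for a, b in zip(ordered, ordered[1:]))


keys = range(1, 11)
result = index_scan(keys, first_ts=5, yield_after=4, second_ts=10)
print("returned:", result, "holes:", has_holes(result))

# Try every combination of starting timestamp and yield point in this model.
any_holes = any(
    has_holes(index_scan(keys, ts, n, 10))
    for ts in range(1, 11)
    for n in range(1, 11)
)
print("any holes across all yield points:", any_holes)
```

In this model the result is always a contiguous run starting at 1, regardless of where the yield happens.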
