[SERVER-30909] readConcern afterClusterTime not working in a non-sharded replica set Created: 31/Aug/17  Updated: 06/Dec/22  Resolved: 08/Sep/17

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Jeffrey Yemin
Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-30977 Need to sign cluster times for unshar... Closed
Assigned Teams:
Sharding
Operating System: ALL
Steps To Reproduce:

1. Create a standard replica set
2. Insert a document and record the operationTime from the response
3. Attempt to find that document on a secondary with a readConcern containing afterClusterTime whose value is the operationTime from the insert response

Expected results: the find command succeeds and returns the requested document

Actual results: the secondary returns this error: { "ok" : 0.0, "errmsg" : "readConcern afterClusterTime must not be greater than clusterTime value", "code" : 72, "codeName" : "InvalidOptions" }
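
The reproduction steps can be sketched as command construction, without a live server. This is a minimal illustration, assuming commands and responses are handled as plain extended-JSON dictionaries; the helper name build_causal_find is hypothetical, and the lsid field from the logged command is omitted for brevity. The operationTime value is the one from the failure log in this ticket.

```python
import json

# Insert response as logged in this ticket, reduced to the relevant fields.
INSERT_RESPONSE = json.loads(
    '{ "n" : 1, "ok" : 1.0, '
    '"operationTime" : { "$timestamp" : { "t" : 1504181919, "i" : 9 } } }'
)


def build_causal_find(collection, doc_id, insert_response, db="test"):
    """Hypothetical helper: build a find that must observe the earlier insert."""
    return {
        "find": collection,
        "filter": {"_id": doc_id},
        "limit": 1,
        "singleBatch": True,
        "readConcern": {
            "level": "local",
            # Step 3: reuse the operationTime recorded in step 2.
            "afterClusterTime": insert_response["operationTime"],
        },
        "$db": db,
        # Route the read to a secondary, as in the failing case.
        "$readPreference": {"mode": "secondary"},
    }


find_cmd = build_causal_find("causal", 580, INSERT_RESPONSE)
print(json.dumps(find_cmd["readConcern"]))
```

Sending a command of this shape to a secondary of a standalone replica set is what produced the InvalidOptions error reported above.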

Participants:

 Description   

According to the design doc for causal consistency, a non-sharded replica set should still honor afterClusterTime, even in the absence of $clusterTime gossiping.

However, read_concern.cpp checks that the logical clock has already advanced to afterClusterTime, and it performs this check before waiting for that cluster time to replicate through the oplog.

The result is that readConcern.afterClusterTime fails consistently on secondary members of replica sets that are not part of a sharded cluster, but succeeds when the replica set is part of a sharded cluster.
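
The effect of the check ordering can be illustrated with a toy model. This is a sketch only, not the server's code: the Secondary class, both read functions, and the starting clock value (1504181919, 8) are assumptions made for illustration. Timestamps are modeled as (t, i) tuples, and the secondary's logical clock advances only when it applies oplog entries, since nothing is gossiped to it.

```python
class InvalidOptions(Exception):
    code = 72


class Secondary:
    """Toy secondary whose logical clock only advances via applied oplog entries."""

    def __init__(self, applied, incoming):
        self.last_applied = applied
        self.cluster_time = applied
        self.incoming = list(incoming)  # fetched but not yet applied, ascending

    def wait_until_applied(self, target):
        # Apply pending oplog entries until the target timestamp is reached.
        while self.last_applied < target:
            if not self.incoming:
                raise TimeoutError("target timestamp never replicated")
            ts = self.incoming.pop(0)
            self.last_applied = ts
            self.cluster_time = max(self.cluster_time, ts)


def read_check_then_wait(node, after):
    # Ordering described in this ticket: validate against the clock first.
    if after > node.cluster_time:
        raise InvalidOptions(
            "readConcern afterClusterTime must not be greater than clusterTime value"
        )
    node.wait_until_applied(after)
    return node.last_applied


def read_wait_then_check(node, after):
    # Alternative ordering: let replication catch up, then validate.
    node.wait_until_applied(after)
    if after > node.cluster_time:
        raise InvalidOptions(
            "readConcern afterClusterTime must not be greater than clusterTime value"
        )
    return node.last_applied


write_ts = (1504181919, 9)  # operationTime from the logged insert


def fresh_secondary():
    # Assumed state: one entry behind, with the write still pending.
    return Secondary((1504181919, 8), [write_ts])


try:
    read_check_then_wait(fresh_secondary(), write_ts)
    outcome = "ok"
except InvalidOptions as exc:
    outcome = f"error {exc.code}"

print(outcome)
print(read_wait_then_check(fresh_secondary(), write_ts))
```

In this model the first ordering rejects the read even though the write is about to be applied, while the second succeeds once the oplog entry lands. In a sharded cluster the clock would also be advanced by gossiped $clusterTime, which the model leaves out.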



 Comments   
Comment by Jeffrey Yemin [ 31/Aug/17 ]

Here's logging of one failure case:

  Server: localhost:27017
  Start: 08:18:39.500
  Command: { "insert" : "causal", "ordered" : true, "documents" : [{ "_id" : 580 }], "$db" : "test", "lsid" : { "id" : { "$binary" : "UJp6HUyiSEWnCEz2XAtyxA==", "$type" : "04" } } }
  End: 08:18:39.500
  Response: { "n" : 1, "opTime" : { "ts" : { "$timestamp" : { "t" : 1504181919, "i" : 9 } }, "t" : { "$numberLong" : "66" } }, "electionId" : { "$oid" : "7fffffff0000000000000042" }, "ok" : 1.0, "operationTime" : { "$timestamp" : { "t" : 1504181919, "i" : 9 } } }
 
  Server: localhost:27018
  Start: 08:18:39.553
  Command: { "find" : "causal", "filter" : { "_id" : 580 }, "limit" : 1, "singleBatch" : true, "readConcern" : { "level" : "local", "afterClusterTime" : { "$timestamp" : { "t" : 1504181919, "i" : 9 } } }, "$db" : "test", "lsid" : { "id" : { "$binary" : "UJp6HUyiSEWnCEz2XAtyxA==", "$type" : "04" } }, "$readPreference" : { "mode" : "secondary" } }
  Failure: Command failed with error 72: 'readConcern afterClusterTime must not be greater than clusterTime value' on server jeff.fios-router.home:27018. The full response is { "ok" : 0.0, "errmsg" : "readConcern afterClusterTime must not be greater than clusterTime value", "code" : 72, "codeName" : "InvalidOptions" }
