[SERVER-66974] Server responding with unexpected error 246 (SnapshotUnavailable) Created: 02/Jun/22  Updated: 27/Oct/23  Resolved: 03/Jun/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Jeffrey Yemin Assignee: Daniel Gottlieb (Inactive)
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:
  • rhel70-small
  • 6.0.0-rc8 server
  • three node replica set

Issue Links:
Related
is related to SERVER-63887 SnapshotUnavailable error on sharded ... Closed
is related to JAVA-4637 failed: MongoClientSessionSpecificati... Closed
Operating System: ALL
Participants:

 Description   

The Java driver has three similar tests (for legacy, sync, and async APIs) that all occasionally fail with the server error:

{
   "ok": 0.0, 
   "code": 246, 
   "codeName": "SnapshotUnavailable"
   "errorLabels": ["TransientTransactionError"], 
   "errmsg": "Unable to read from a snapshot due to pending collection catalog changes; please retry the operation. Snapshot timestamp is Timestamp(1654118095, 84). Collection minimum is Timestamp(1654118095, 86)", 
}

The tests all do the following:

  1. drop the collection
  2. insert a document into the collection
  3. insert a document into the collection within a transaction

Expected results: the insert within a transaction succeeds
Actual results: the above error is returned by the server
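
For reference, a rough sketch of those three steps with the sync Java driver API (the URI and namespace below are placeholders, not the actual test code):

import com.mongodb.client.ClientSession;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class SnapshotUnavailableRepro {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017/?replicaSet=rs0")) {
            MongoCollection<Document> collection =
                    client.getDatabase("test").getCollection("snapshot-test");

            collection.drop();                       // 1. drop the collection
            collection.insertOne(new Document());    // 2. insert (implicitly recreates the collection)

            try (ClientSession session = client.startSession()) {
                session.startTransaction();
                // 3. insert within a transaction; this statement occasionally fails
                //    with error 246 (SnapshotUnavailable)
                collection.insertOne(session, new Document());
                session.commitTransaction();
            }
        }
    }
}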

Questions:

  1. What is the root cause of that error?
  2. Is there any way to change the test in order to stop the error from occurring?

Linked BF ticket for Java driver: JAVA-4637



 Comments   
Comment by Daniel Gottlieb (Inactive) [ 03/Jun/22 ]

> Feel free to keep this ticket open if there is anything that you think could be improved on the server to improve the user experience (while unacknowledged writes are (hopefully) a rare thing in practice, there may be other situations where normal concurrent writes could encounter this error).

Unfortunately the server behavior is unrelated to the request's write concern – the ingredients to hit this state are more pervasive for real users than is being suggested. Anything that writes to the oplog creates "holes" which can result in SnapshotUnavailable for other clients, even if the collection namespaces are unrelated. Enabling causal consistency is the recommended solution for anyone using transactions.
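
For illustration, a minimal sketch of that recommendation from the Java driver side, assuming the sync API and a placeholder URI/namespace (causal consistency is already the default for explicit sessions; it is spelled out here for emphasis). Running the setup insert and the transaction in the same causally consistent session lets the first transaction statement carry afterClusterTime, so the server waits past the collection-creation hole instead of returning SnapshotUnavailable:

import com.mongodb.ClientSessionOptions;
import com.mongodb.client.ClientSession;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class CausallyConsistentTransaction {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017/?replicaSet=rs0")) {
            MongoCollection<Document> collection =
                    client.getDatabase("test").getCollection("snapshot-test");
            collection.drop();

            ClientSessionOptions options = ClientSessionOptions.builder()
                    .causallyConsistent(true)   // the default for explicit sessions
                    .build();
            try (ClientSession session = client.startSession(options)) {
                // Setup insert in the same session, so the session records its operationTime.
                collection.insertOne(session, new Document());

                session.startTransaction();
                // The first statement in the transaction carries afterClusterTime >= the
                // setup insert's operationTime, so the server waits rather than erroring.
                collection.insertOne(session, new Document());
                session.commitTransaction();
            }
        }
    }
}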

Comment by Jeffrey Yemin [ 03/Jun/22 ]

That's a great analysis, and it points me towards the root cause. While I claimed that there are no concurrent tests, in reality there is a sneaky driver feature that can introduce concurrency: unacknowledged (w: 0) writes. In fact there is another test in the same class, using the same collection, which does execute an unacknowledged write and doesn't have a mechanism to ensure that the server actually applied that write before the end of the test. I think I can work around that now that I understand it better.
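
For illustration only, a hypothetical snippet of that kind of unacknowledged write (not the actual sibling test; the URI is a placeholder):

import com.mongodb.WriteConcern;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class UnacknowledgedWriteExample {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017/?replicaSet=rs0")) {
            MongoCollection<Document> collection = client
                    .getDatabase("JavaDriverTest")
                    .getCollection("com.mongodb.client.MongoClientSessionSpecification")
                    .withWriteConcern(WriteConcern.UNACKNOWLEDGED);

            // Returns as soon as the message is sent; the server may apply the write
            // (and implicitly create the collection) after the test has already moved on,
            // which is exactly the hidden concurrency described above.
            collection.insertOne(new Document("probe", 1));
        }
    }
}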

Feel free to keep this ticket open if there is anything that you think could be improved on the server to improve the user experience (while unacknowledged writes are (hopefully) a rare thing in practice, there may be other situations where normal concurrent writes could encounter this error).

Comment by Daniel Gottlieb (Inactive) [ 03/Jun/22 ]

> There are no concurrent tests running.

I can see in the logs that something concurrent is happening – maybe not another test. At the time of the error there are two connections logging messages: conn4629, which gives the error, and conn6231, which is creating this collection:

2022-06-01T21:14:55.986+00:00 I  STORAGE  [conn6231] createCollection {"namespace":"JavaDriverTest.com.mongodb.client.MongoClientSessionSpecification","uuidDisposition":"generated","uuid":{"uuid":{"$uuid":"775ae530-7ff1-4d42-adb1-fc2ca98b6a7a"}},"options":{}}

I can see creating that collection lost the race in a write-write conflict:

2022-06-01T21:14:56.019+00:00 I  STORAGE  [conn6231] Conflicted registering namespace, already have a collection with the same namespace {"nss":"JavaDriverTest.com.mongodb.client.MongoClientSessionSpecification"}

That namespace is coincidentally the same one conn4629 is performing an insert against.

The SnapshotUnavailable log line you have sits between those two log lines. I hypothesize that if we had the data files for this test case we would see that Timestamp(1654118095, 85) does not exist in the oplog. I believe the race that resulted in SnapshotUnavailable was:

| conn4629                                                          | conn6231                           |
|-------------------------------------------------------------------+------------------------------------|
|                                                                   | BeginTxn                           |
|                                                                   | Allocate Timestamp(1654118095, 85) |
|                                                                   | CreateCollection X                 |
| BeginTxn                                                          |                                    |
| Allocate Timestamp(1654118095, 86)                                |                                    |
| CreateCollection X                                                |                                    |
| Commit (OK)                                                       |                                    |
|                                                                   |                                    |
| BeginTxn :readTimestamp = all_durable = Timestamp(1654118095, 84) |                                    |
| Insert(X, [documents]) -> SnapshotUnavailable                     |                                    |
|                                                                   | Commit (WriteConflict)             |

To summarize: a concurrent writer from somewhere opened up a hole. Removing the concurrent writer would have fixed this case, but is not a robust solution (for example, the noop writer in the server can cause this problem). Using causal consistency is a robust solution. This would resolve to the following interleaving in the server:

| conn4629                                               | conn6231                           |
|--------------------------------------------------------+------------------------------------|
|                                                        | BeginTxn                           |
|                                                        | Allocate Timestamp(1654118095, 85) |
|                                                        | CreateCollection X                 |
| BeginTxn                                               |                                    |
| Allocate Timestamp(1654118095, 86)                     |                                    |
| CreateCollection X                                     |                                    |
| Commit (OK)                                            |                                    |
|                                                        |                                    |
| afterClusterTime Timestamp(1654118095, 86) -- Blocking |                                    |
|                                                        | Commit (WriteConflict)             |
| BeginTxn :readTimestamp = Timestamp(1654118095, 86)    |                                    |
| Insert(X, [documents])                                 |                                    |
| Commit (OK)                                            |                                    |

Comment by Jeffrey Yemin [ 02/Jun/22 ]

I think this is the log message that is relevant:

{"t":{"$date":"2022-06-01T21:14:56.015+00:00"},"s":"I",  "c":"COMMAND",  "id":51803,   "ctx":"conn4629","msg":"Slow query","attr":{"type":"command","ns":"JavaDriverTest.com.mongodb.client.MongoClientSessionSpecification","command":{"insert":"com.mongodb.client.MongoClientSessionSpecification","ordered":true,"$db":"JavaDriverTest","$clusterTime":{"clusterTime":{"$timestamp":{"t":1654118096,"i":1}},"signature":{"hash":{"$binary":{"base64":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","subType":"0"}},"keyId":0}},"lsid":{"id":{"$uuid":"e31e6e8e-c978-4651-9434-b2c69b2f3c43"}},"txnNumber":123,"startTransaction":true,"autocommit":false},"ninserted":0,"numYields":0,"ok":0,"errMsg":"Unable to read from a snapshot due to pending collection catalog changes; please retry the operation. Snapshot timestamp is Timestamp(1654118095, 84). Collection minimum is Timestamp(1654118095, 86)","errName":"SnapshotUnavailable","errCode":246,"reslen":469,"locks":{"FeatureCompatibilityVersion":{"acquireCount":{"w":1}},"ReplicationStateTransition":{"acquireCount":{"w":3}},"Global":{"acquireCount":{"w":1}},"Database":{"acquireCount":{"w":1}},"Collection":{"acquireCount":{"w":1}},"Mutex":{"acquireCount":{"r":1}}},"flowControl":{"acquireCount":1,"timeAcquiringMicros":1},"readConcern":{"level":"local","provenance":"implicitDefault"},"storage":{},"remote":"127.0.0.1:49394","protocol":"op_msg","durationMillis":0}}

It's in .evergreen/orchestration/db/mongod-jonfftty/mongod.log.

The others are indeed red herrings due to our extensive use of failpoints for testing. 

Comment by Daniel Gottlieb (Inactive) [ 02/Jun/22 ]

Re: logs, there are multiple mongod logs – not sure which one is relevant. Searching for SnapshotUnavailable gives me something I'm not sure how to interpret:

{"t":{"$date":"2022-06-01T21:13:43.846+00:00"},"s":"D1", "c":"ASSERT",   "id":23074,   "ctx":"conn4608","msg":"User assertion","attr":{"error":"SnapshotUnavailable: Failing command via 'failCommand' failpoint","file":"src/mongo/db/commands.cpp","line":762}}

I assume I'm just seeing a red herring looking at the wrong logs/time period?

Comment by Daniel Gottlieb (Inactive) [ 02/Jun/22 ]

> What is the root cause of that error?

Fundamentally the problem arises because we don't keep perfectly detailed historical information about collection state. Reading at a timestamp (read: "in the past") may discover, for example, that we don't know which indexes existed at that time and were fully built/consistent. This barrier also applies to collection creation.

> Is there any way to change the test in order to stop the error from occurring?

A quick thing to try is turning on causal consistency, assuming that still meets the goal of what's being tested.

I'm assuming the operation time in the response from the "create collection" request matches the timestamp value in "Collection minimum is Timestamp(...)". Using causal consistency/afterClusterTime will have the server internally wait, avoiding the SnapshotUnavailable error. I'm surprised this would have started recently, which might be something worth the server team understanding to make sure we're behaving as expected.
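
One way to check that assumption from the driver side would be a small diagnostic along these lines (a hypothetical sketch, assuming the sync API, an explicit session, and a placeholder URI/namespace): capture the session's operationTime right after the insert that implicitly recreates the collection and compare it with the "Collection minimum is Timestamp(...)" value from the error.

import com.mongodb.client.ClientSession;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.BsonTimestamp;
import org.bson.Document;

public class OperationTimeProbe {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017/?replicaSet=rs0")) {
            MongoCollection<Document> collection =
                    client.getDatabase("test").getCollection("snapshot-test");
            collection.drop();
            try (ClientSession session = client.startSession()) {
                collection.insertOne(session, new Document()); // implicitly recreates the collection
                // Available because explicit sessions are causally consistent by default.
                BsonTimestamp opTime = session.getOperationTime();
                // Expectation: this should match the "Collection minimum is Timestamp(...)"
                // value reported when SnapshotUnavailable occurs.
                System.out.println("operationTime of collection-creating insert: " + opTime);
            }
        }
    }
}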

Comment by Jeffrey Yemin [ 02/Jun/22 ]

Also, if you follow the links from the JAVA BF ticket, you'll find the full server logs attached to the task.

Comment by Jeffrey Yemin [ 02/Jun/22 ]

Here are (roughly) the raw commands (obtained from driver logs):

{"drop": "snapshot-test", "$db": "test", "lsid": {"id": {"$binary": {"base64": "OwX48zMHTwGA26ve43Jfag==", "subType": "04"}}}}
 
{"insert": "snapshot-test", "ordered": true, "$db": "test", "lsid": {"id": {"$binary": {"base64": "OwX48zMHTwGA26ve43Jfag==", "subType": "04"}}}, "documents": [{"_id": {"$oid": "6299048f064a9411875f4f43"}}]}
 
{"insert": "snapshot-test", "ordered": true, "$db": "test", "lsid": {"id": {"$binary": {"base64": "OwX48zMHTwGA26ve43Jfag==", "subType": "04"}}}, "txnNumber": 1, "startTransaction": true, "autocommit": false, "documents": [{"_id": {"$oid": "6299048f064a9411875f4f44"}}]}

There are no concurrent tests running.

Comment by Daniel Gottlieb (Inactive) [ 02/Jun/22 ]

Hey Jeff, mind adding details around using causal consistency and/or any write concerns/read concerns sent along with each of the requests? Though I think the read concern is not technically relevant.

Also, can you confirm whether there are other concurrent test threads performing writes against the mongod?
