[SERVER-31767] Provide a window of snapshot history that is accessible for PIT reads Created: 30/Oct/17  Updated: 30/Oct/23  Resolved: 25/May/18

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 4.1.1

Type: Task Priority: Major - P3
Reporter: Michael Cahill (Inactive) Assignee: Dianna Hohensee (Inactive)
Resolution: Fixed Votes: 0
Labels: todo_in_code
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on WT-3974 Add a measurement of cache pressure c... Closed
is depended on by SERVER-34136 Add multi-shard integration tests for... Closed
is depended on by SERVER-34436 Workloads to apply transaction snapsh... Closed
is depended on by SERVER-35023 Add server parameter maxTransactionLo... Closed
Related
related to SERVER-43511 Validate if TODO listed in SERVER-317... Closed
is related to WT-4101 Don't abort the eviction server durin... Closed
is related to SERVER-35114 Make it possible to adjust the period... Closed
is related to SERVER-35251 Set the snapshot window default size ... Closed
Backwards Compatibility: Fully Compatible
Sprint: Storage NYC 2018-04-09, Storage NYC 2018-04-23, Storage NYC 2018-05-07, Storage NYC 2018-05-21, Storage NYC 2018-06-04
Participants:

 Description   

In 3.6, we permit reads back to approximately the majority commit point (a little additional slack is currently built in). Attempts to read at an older point will fail.

For Global Point in Time Reads, a mongos will need to establish a common timestamp across all mongod nodes involved in a query. In order to do that, it would be helpful for each node to store some additional history.

WiredTiger could monitor when it is under cache pressure from storing history, and update the oldest_timestamp in response, with the goal of allowing reads older than the majority commit point where possible without impacting performance.

We should also consider whether to make the "comfort level" configurable, so that even memory-constrained nodes maintain some history, increasing the chance of finding a common point in time across a cluster.
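To make the idea concrete, here is a minimal sketch of how a node might pick where to move oldest_timestamp. It is not the server or WiredTiger implementation: `WindowSettings`, `chooseOldestSecs`, and the two window sizes are hypothetical names, and time is simplified to whole seconds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical settings; these are illustrative names, not server parameters.
struct WindowSettings {
    std::int64_t maxWindowSecs;  // history kept behind the majority commit point when the cache is healthy
    std::int64_t minWindowSecs;  // "comfort level": history kept even under cache pressure
};

// Returns the time (in seconds) that oldest_timestamp could be advanced to.
// Assumption: the caller already knows whether history is causing cache pressure.
std::int64_t chooseOldestSecs(std::int64_t majorityCommitSecs,
                              bool cacheUnderPressure,
                              const WindowSettings& s) {
    assert(s.minWindowSecs >= 0 && s.minWindowSecs <= s.maxWindowSecs);
    const std::int64_t window = cacheUnderPressure ? s.minWindowSecs : s.maxWindowSecs;
    // Clamp at zero in case the node has less history than the window.
    return std::max<std::int64_t>(0, majorityCommitSecs - window);
}
```

With a majority commit point at second 1000, a 60 second maximum and a 5 second comfort level, this would keep history back to second 940 normally and back to second 995 under pressure.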



 Comments   
Comment by Githook User [ 25/May/18 ]

Author: Dianna Hohensee (DiannaHohensee) <dianna.hohensee@10gen.com>

Message: SERVER-31767 Provide a window of snapshot history that is accessible for PIT reads
Branch: master
https://github.com/mongodb/mongo/commit/136178a12e6b4fe41b6be047ee04db60a69af33e

Comment by Dianna Hohensee (Inactive) [ 26/Apr/18 ]

bruce.lucas This is the layout I'm planning to add to the serverStatus.wiredtiger output, per our conversation. If you have any comments on clarity or any requests, let me know.

"snapshot-window-settings" : {
     "cache pressure percentage threshold" : <num>,
     "current cache pressure percentage" : <num>,
     "max available snapshots window size in seconds" : <num>,
     "target available snapshots window size in seconds" : <num>,
     "current available snapshots window size in seconds" : <num>,
     "latest majority snapshot timestamp available" : <num>,
     "oldest majority snapshot timestamp available" : <num>
}
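
As a rough illustration of how the "current available snapshots window size in seconds" could relate to the two timestamp fields, here is a sketch that derives it from the oldest and latest timestamps. It assumes the timestamps use the BSON timestamp layout (seconds in the high 32 bits, an increment in the low 32 bits). `timestampSecs` and `currentWindowSecs` are hypothetical helpers, and the ticket does not confirm that the server computes the value this way.

```cpp
#include <cstdint>

// Assumption: a timestamp is a 64-bit value with seconds in the high 32 bits
// and an increment counter in the low 32 bits.
constexpr std::uint32_t timestampSecs(std::uint64_t ts) {
    return static_cast<std::uint32_t>(ts >> 32);
}

// Hypothetical helper: window size in whole seconds between the oldest and
// latest available majority snapshots. Returns 0 if the inputs are out of order.
constexpr std::uint32_t currentWindowSecs(std::uint64_t oldest, std::uint64_t latest) {
    return latest >= oldest ? timestampSecs(latest) - timestampSecs(oldest) : 0;
}
```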

Comment by Alexander Gorrod [ 29/Mar/18 ]

I believe you can retrieve the relevant information from WiredTiger by querying an existing statistic. I've written a possible implementation:

--- a/src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp
+++ b/src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp
@@ -774,6 +774,19 @@ int64_t WiredTigerRecordStore::storageSize(OperationContext* opCtx,
     return size;
 }
 
+bool WiredTigerRecordStore::isCacheUnderPressure(OperationContext* opCtx) const {
+    WiredTigerSession* session = WiredTigerRecoveryUnit::get(opCtx)->getSessionNoTxn();
+    StatusWith<int64_t> result =
+        WiredTigerUtil::getStatisticsValueAs<int64_t>(session->getSession(),
+                                                      "statistics:", NULL,
+                                                      WT_STAT_CONN_CACHE_LOOKASIDE_SCORE);
+    uassertStatusOK(result.getStatus());
+
+    int64_t score = result.getValue();
+
+    return (score > 50);
+}
+
 // Retrieve the value from a positioned cursor.
 RecordData WiredTigerRecordStore::_getData(const WiredTigerCursor& cursor) const {
     WT_ITEM value;
diff --git a/src/mongo/db/storage/wiredtiger/wiredtiger_record_store.h b/src/mongo/db/storage/wiredtiger/wiredtiger_record_store.h
index ddec68527c..785c96d95a 100644
--- a/src/mongo/db/storage/wiredtiger/wiredtiger_record_store.h
+++ b/src/mongo/db/storage/wiredtiger/wiredtiger_record_store.h
@@ -127,6 +127,8 @@ public:
 
     virtual bool isCapped() const;
 
+    virtual bool isCacheUnderPressure(OperationContext* opCtx) const;
+
     virtual int64_t storageSize(OperationContext* opCtx,
                                 BSONObjBuilder* extraInfo = NULL,
                                 int infoLevel = 0) const;

The lookaside score is a value between 0 and 100. If there is no cache pressure due to history, the statistic returns 0; as cache pressure due to history increases, the score rises towards the cap of 100. The above code suggests using a value of 50 as a reasonable threshold to base the decision on. I believe that's right, but I don't have concrete data to back it up.
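
For illustration only, here is one way a caller might act on the score. This is a sketch under assumptions, not part of the patch above: `SnapshotWindow` and `adjustTargetWindow` are hypothetical names, and the policy (halve the target window when the score exceeds the threshold, otherwise grow it by one second up to the maximum) is an arbitrary choice made to show the shape of the feedback loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical state; the field names are illustrative only.
struct SnapshotWindow {
    std::int64_t maxSecs;     // upper bound on the window
    std::int64_t targetSecs;  // window the node currently tries to maintain
};

// Shrinks the target window when the lookaside score (0..100) is above the
// threshold, and grows it back towards maxSecs otherwise.
// Assumption: halving and +1 second steps are placeholders, not tuned values.
void adjustTargetWindow(SnapshotWindow& w,
                        std::int64_t lookasideScore,
                        std::int64_t threshold = 50) {
    assert(lookasideScore >= 0 && lookasideScore <= 100);
    if (lookasideScore > threshold) {
        w.targetSecs /= 2;
    } else {
        w.targetSecs = std::min(w.maxSecs, w.targetSecs + 1);
    }
}
```

A score exactly at the threshold counts as no pressure here, matching the strict `score > 50` comparison in the diff.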

Comment by Alexander Gorrod [ 13/Mar/18 ]

The majority of the work here should be done in WT-3974, but it is likely a configuration setting will need to be changed in order to access the new functionality - and that change should be made under this ticket.

Comment by Ian Whalen (Inactive) [ 12/Mar/18 ]

alexander.gorrod sounds like there is a dependency on some WT work to produce a callback when there is cache pressure - can you either link that ticket here or file and link if it doesn't exist yet?
