Type: Bug
Resolution: Works as Designed
Priority: Major - P3
Affects Version/s: None
Component/s: None
I'm currently trying to write a workload with the Python driver to analyze the performance of long-running transactions before and after the Storage Engines team's durable history project.
I've been struggling to keep a transaction running at the storage engine level, and I'm seeing differences between how the Python driver operates and what I consider to be an equivalent JavaScript function executed by the mongo shell. With the Python driver, I see a rollback after each read.
Python code: https://github.com/tetsuo-cpp/py_lswa
import random
import pymongo


def scan():
    client = pymongo.MongoClient(appname='Scanner')
    print('scan: Performing long-running scan')
    with client.start_session(causal_consistency=True) as session:
        with session.start_transaction(read_concern=pymongo.read_concern.ReadConcern('snapshot')):
            lswa_db = session.client.lswa_db
            lswa = session.client.lswa_db.lswa
            # Make it a multi-update transaction just to be sure...
            doc1 = {
                '_id': 1,
                'contents': 'a',
            }
            doc2 = {
                '_id': 2,
                'contents': 'a',
            }
            result = lswa.replace_one({'_id': 1}, doc1)
            assert result.matched_count == 1
            result = lswa.replace_one({'_id': 2}, doc2)
            assert result.matched_count == 1
            counter = 0
            while True:
                id_val = random.randint(1, 10000)
                doc = lswa.find_one({'_id': id_val})
                counter += 1
                if counter % 1000 == 0:
                    print(counter)
            # for val in lswa.find():
            #     # print('scan: Iterated {}'.format(val['contents'][0]))
            #     pass
            # session.commit_transaction()
    print('scan: Finished scanning')
JavaScript:
const session = db.getMongo().startSession({causalConsistency: true});
const sessionDb = session.getDatabase('lswa_db');
const sessionColl = sessionDb.getCollection('lswa');

session.startTransaction({readConcern: {level: "snapshot"}});
assert.commandWorked(sessionColl.update({_id: 1}, {_id: 1, contents: 'a'}));
assert.commandWorked(sessionColl.update({_id: 2}, {_id: 2, contents: 'a'}));

let cnt = 0;
while (true) {
    id_val = Math.floor(2 + Math.random() * 9998);
    let doc = sessionColl.findOne({'_id': id_val}, {contents: 0});
    if (++cnt % 1000 == 0) {
        print(cnt);
    }
}
The effect is easy to see if I compile v4.2 with this diff:
Unstaged changes (1)
modified   src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.cpp
@@ -357,6 +357,10 @@ WiredTigerSession* WiredTigerRecoveryUnit::getSessionNoTxn() {
 void WiredTigerRecoveryUnit::abandonSnapshot() {
     invariant(!_inUnitOfWork(), toString(_state));
     if (_isActive()) {
+        static int counter = 0;
+        if (++counter > 10000) {
+            invariant(false);
+        }
         // Can't be in a WriteUnitOfWork, so safe to rollback
         _txnClose(false);
     }
In the Python case, it'll crash shortly after the 10,000th key is read. The JS will keep going for much longer since all of the reads are running under a single transaction.
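One likely explanation, consistent with the "Works as Designed" resolution: PyMongo only attaches an operation to a transaction when the session is passed explicitly as `session=...`. The Python code above obtains the collection through `session.client` but never passes the session to `replace_one` or `find_one`, so those calls would run outside the transaction, whereas the shell's `session.getDatabase()` binds the session to the collection. The following is a minimal sketch of the same workload with the session passed on every call. The helper name `scan_in_transaction`, its parameters, and the bounded loop are assumptions made for illustration, not part of the original repository.

```python
import random


def scan_in_transaction(client, read_concern, num_reads=10000, max_id=10000):
    """Run num_reads point reads inside a single multi-document transaction.

    Hypothetical helper: every operation receives session=session, because
    PyMongo runs an operation outside the transaction if the argument is omitted.
    """
    with client.start_session(causal_consistency=True) as session:
        with session.start_transaction(read_concern=read_concern):
            lswa = client.lswa_db.lswa
            # Two writes, mirroring the multi-update setup in the report.
            lswa.replace_one({'_id': 1}, {'_id': 1, 'contents': 'a'},
                             session=session)
            lswa.replace_one({'_id': 2}, {'_id': 2, 'contents': 'a'},
                             session=session)
            for counter in range(1, num_reads + 1):
                id_val = random.randint(1, max_id)
                lswa.find_one({'_id': id_val}, session=session)
                if counter % 1000 == 0:
                    print(counter)
        # Leaving the start_transaction block without an exception commits.
    return num_reads


def main():
    # Imported here so the helper above can be exercised without a server.
    import pymongo
    from pymongo.read_concern import ReadConcern

    client = pymongo.MongoClient(appname='Scanner')
    scan_in_transaction(client, ReadConcern('snapshot'))
```

Calling `main()` assumes a reachable replica set, since transactions are not available on a standalone server. The loop is bounded here only so the sketch terminates; the original workload reads indefinitely.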