- Type: Bug
- Resolution: Works as Designed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
I'm currently trying to write a workload with the Python driver to analyze the performance of long-running transactions before and after the Storage Engines team's durable history project.
I've been struggling to keep a transaction open at the storage engine level, and I'm seeing differences between the way the Python driver operates and what I consider to be an equivalent JavaScript function executed by the mongo shell. With the Python driver, I see a storage engine rollback after each read.
Python code: https://github.com/tetsuo-cpp/py_lswa
import random

import pymongo


def scan():
    client = pymongo.MongoClient(appname='Scanner')
    print('scan: Performing long-running scan')
    with client.start_session(causal_consistency=True) as session:
        with session.start_transaction(read_concern=pymongo.read_concern.ReadConcern('snapshot')):
            lswa_db = session.client.lswa_db
            lswa = session.client.lswa_db.lswa
            # Make it a multi-update transaction just to be sure...
            doc1 = {
                '_id': 1,
                'contents': 'a',
            }
            doc2 = {
                '_id': 2,
                'contents': 'a',
            }
            result = lswa.replace_one({'_id': 1}, doc1)
            assert result.matched_count == 1
            result = lswa.replace_one({'_id': 2}, doc2)
            assert result.matched_count == 1
            counter = 0
            while True:
                id_val = random.randint(1, 10000)
                doc = lswa.find_one({'_id': id_val})
                counter += 1
                if counter % 1000 == 0:
                    print(counter)
            # for val in lswa.find():
            #     # print('scan: Iterated {}'.format(val['contents'][0]))
            #     pass
            # session.commit_transaction()
    print('scan: Finished scanning')
JavaScript:
const session = db.getMongo().startSession({causalConsistency: true});
const sessionDb = session.getDatabase('lswa_db');
const sessionColl = sessionDb.getCollection('lswa');
session.startTransaction({readConcern: {level: "snapshot"}});
assert.commandWorked(sessionColl.update({_id: 1}, {_id: 1, contents: 'a'}));
assert.commandWorked(sessionColl.update({_id: 2}, {_id: 2, contents: 'a'}));
let cnt = 0;
while (true) {
    id_val = Math.floor(2 + Math.random() * 9998);
    let doc = sessionColl.findOne({'_id': id_val}, {contents: 0});
    if (++cnt % 1000 == 0) {
        print(cnt);
    }
}
The effect is easy to see if I compile v4.2 with this diff:
Unstaged changes (1)
modified   src/mongo/db/storage/wiredtiger/wiredtiger_recovery_unit.cpp
@@ -357,6 +357,10 @@ WiredTigerSession* WiredTigerRecoveryUnit::getSessionNoTxn() {
 void WiredTigerRecoveryUnit::abandonSnapshot() {
     invariant(!_inUnitOfWork(), toString(_state));
     if (_isActive()) {
+        static int counter = 0;
+        if (++counter > 10000) {
+            invariant(false);
+        }
         // Can't be in a WriteUnitOfWork, so safe to rollback
         _txnClose(false);
     }
In the Python case, the server crashes shortly after the 10,000th key is read. The JavaScript version keeps going for much longer because all of its reads run under a single transaction.
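For comparison, here is a minimal sketch of how the Python workload could keep its reads inside the transaction. PyMongo does not bind a collection to a session the way the shell's `session.getDatabase()` does; each operation takes an explicit `session=` argument, and a call without it runs outside the transaction as a standalone operation. The Python snippet above omits that argument, which would be consistent with the snapshot being abandoned after each read. The `scan` signature, its `iterations` and `max_id` parameters, and the bounded loop are assumptions made for illustration, not the reporter's code.

```python
import random


def scan(client, read_concern=None, iterations=10000, max_id=10000):
    """Run `iterations` point reads inside one multi-document transaction.

    `client` is expected to be a pymongo.MongoClient. To mirror the report,
    `read_concern` would be pymongo.read_concern.ReadConcern('snapshot').
    Returns the number of reads that found a document.
    """
    lswa = client.lswa_db.lswa
    found = 0
    with client.start_session(causal_consistency=True) as session:
        with session.start_transaction(read_concern=read_concern):
            # Every operation names the session explicitly. Without
            # session=..., PyMongo runs the call outside the transaction.
            for doc_id in (1, 2):
                result = lswa.replace_one(
                    {'_id': doc_id},
                    {'_id': doc_id, 'contents': 'a'},
                    session=session,
                )
                assert result.matched_count == 1
            for _ in range(iterations):
                id_val = random.randint(1, max_id)
                doc = lswa.find_one({'_id': id_val}, session=session)
                if doc is not None:
                    found += 1
        # Leaving the inner `with` block without an exception commits.
    return found
```

Against a replica set that already holds documents with `_id` 1 and 2, this might be called as `scan(pymongo.MongoClient(appname='Scanner'), pymongo.read_concern.ReadConcern('snapshot'))`. The server aborts transactions that outlive `transactionLifetimeLimitSeconds` (60 seconds by default), so a genuinely long-running scan may need that parameter raised.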