|
Author:
{u'username': u'ramonfm', u'name': u'Ramon Fernandez', u'email': u'ramon.fernandez@mongodb.com'}
Message: SERVER-18250 WiredTiger is not the default engine in v3.0
Branch: v3.0
https://github.com/mongodb/mongo/commit/3167b33df67688a6f92e4ead752ff7ec4426df42
|
|
Thanks sue.loverso, I've backported the change to 3.0.7 and will keep an eye for potential issues on Evergreen. There were some additional changes to wt_nojournal_toggle.js in SERVER-20176 that will also be backported soon.
|
|
Author:
{u'username': u'visemet', u'name': u'Max Hirschhorn', u'email': u'max.hirschhorn@mongodb.com'}
Message: SERVER-18250 Add tests for toggling between --journal and --nojournal.
(cherry picked from commit 87201c5db22901f103dc5e1de1dc3a67aab36257)
Branch: v3.0
https://github.com/mongodb/mongo/commit/8612f5af228ea853087a82453b14b81c629afb30
|
|
Author:
{u'username': u'sueloverso', u'name': u'Susan LoVerso', u'email': u'sue@wiredtiger.com'}
Message: SERVER-18250 On transition from journal to no-journal, run WT recovery.
(cherry picked from commit 8195bb49284a112b9fe620586f2986546c2b0390)
Branch: v3.0
https://github.com/mongodb/mongo/commit/6bb726540cde2cd0942bf3c9b08bea52355afbef
|
|
Yes, this should backport cleanly into the 3.0 code. The only change would be the global error message numbers.
|
|
Can you confirm that this fix has not been backported to 3.0?
|
|
Author:
{u'username': u'visemet', u'name': u'Max Hirschhorn', u'email': u'max.hirschhorn@mongodb.com'}
Message: SERVER-18250 Add tests for toggling between --journal and --nojournal.
Branch: master
https://github.com/mongodb/mongo/commit/87201c5db22901f103dc5e1de1dc3a67aab36257
|
|
Author:
{u'username': u'sueloverso', u'name': u'Susan LoVerso', u'email': u'sue@wiredtiger.com'}
Message: SERVER-18250 On transition from journal to no-journal, run WT recovery.
Branch: master
https://github.com/mongodb/mongo/commit/8195bb49284a112b9fe620586f2986546c2b0390
|
|
Perhaps a better test case to demonstrate my previous point about WiredTiger not replaying operations since the last checkpoint is the following. All that has been added is an fsync command when iter === 50 to ensure that at least one checkpoint is made.
(function() {
|
'use strict';
|
|
var dbpath = MongoRunner.dataPath + 'nojournal';
|
resetDbpath(dbpath);
|
|
var conn = MongoRunner.runMongod({
|
dbpath: dbpath,
|
noCleanData: true,
|
journal: '',
|
});
|
assert.neq(null, conn, 'mongod was unable to start up');
|
|
var coll = conn.getDB('test').nojournal;
|
|
var awaitShell = startParallelShell(function() {
|
for (var iter = 0; iter < 1000; ++iter) {
|
var bulk = db.nojournal.initializeUnorderedBulkOp();
|
for (var i = 0; i < 100; ++i) {
|
bulk.insert({a: i});
|
}
|
var bulkRes = bulk.execute();
|
var syncRes = db.nojournal.insert({b: iter}, {j: true});
|
if (iter < 100) {
|
assert.writeOK(bulkRes);
|
assert.writeOK(syncRes);
|
}
|
if (iter === 50) {
|
assert.commandWorked(db.adminCommand({fsync: 1}));
|
}
|
}
|
}, conn.port);
|
|
assert.soon(function() {
|
var count = coll.count({b: {$exists: true}});
|
if (count >= 100) {
|
print('[BEFORE] total number of documents is ' + coll.find().itcount());
|
print('[BEFORE] number of documents with b field is ' + coll.find({b: {$exists: true}}).itcount());
|
// Don't create a checkpoint before killing the mongod.
|
MongoRunner.stopMongod(conn, 9);
|
return true;
|
}
|
return false;
|
});
|
|
conn = MongoRunner.runMongod({
|
dbpath: dbpath,
|
noCleanData: true,
|
nojournal: '',
|
});
|
assert.neq(null, conn, 'mongod was unable to start up');
|
|
coll = conn.getDB('test').nojournal;
|
coll.validate(true);
|
|
print('[AFTER] total number of documents is ' + coll.find().itcount());
|
print('[AFTER] number of documents with b field is ' + coll.find({b: {$exists: true}}).itcount());
|
|
assert.lte(100, coll.count({b: {$exists: true}}));
|
assert.lte(100, coll.find({b: {$exists: true}}).itcount());
|
|
MongoRunner.stopMongod(conn);
|
})();
|
The insert of documents with the "b" field are journaled, but doing an unclean shutdown and restarting the mongod with --nojournal only has the documents from when the checkpoint was written. It does not seem to function as bruce.lucas@10gen.com had expected where recovery from the existing journal would be performed, and new operations on the system would not be journaled.
m27000| 2015-06-22T16:16:36.067-0400 I COMMAND [conn2] CMD fsync: sync:1 lock:0
|
[BEFORE] total number of documents is 10605
|
[BEFORE] number of documents with b field is 124
|
...
|
m27000| 2015-06-22T16:16:42.420-0400 I COMMAND [conn1] CMD: validate test.nojournal
|
m27000| 2015-06-22T16:16:42.423-0400 I STORAGE [conn1] WiredTiger progress WT_SESSION.verify 7
|
m27000| 2015-06-22T16:16:42.432-0400 I INDEX [conn1] validating index test.nojournal.$_id_
|
m27000| 2015-06-22T16:16:42.433-0400 I STORAGE [conn1] WiredTiger progress WT_SESSION.verify 5
|
[AFTER] total number of documents is 5151
|
[AFTER] number of documents with b field is 51
|
|
|
Hi sue,
I applied the following change to make it so that we explicitly do not enable write-ahead logging when --nojournal is specified. The other settings are passed to wiredtiger_open() so that the journal can still be read to do recovery like bruce.lucas@10gen.com proposed.
diff --git a/src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp b/src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp
|
index 291797c..e00c18a 100644
|
--- a/src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp
|
+++ b/src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp
|
@@ -104,6 +104,9 @@ WiredTigerKVEngine::WiredTigerKVEngine(const std::string& path,
|
if (_durable) {
|
ss << "log=(enabled=true,archive=true,path=journal,compressor=";
|
ss << wiredTigerGlobalOptions.journalCompressor << "),";
|
+ } else {
|
+ ss << "log=(enabled=false,archive=true,path=journal,compressor=";
|
+ ss << wiredTigerGlobalOptions.journalCompressor << "),";
|
}
|
ss << "file_manager=(close_idle_time=100000),"; //~28 hours, will put better fix in 3.1.x
|
ss << "checkpoint=(wait=" << wiredTigerGlobalOptions.checkpointDelaySecs;
|
I then ran the following test that does some unjournaled and journaled writes, creates a checkpoint, and then performs an unclean shutdown by sending SIGKILL to the mongod.
(function() {
|
'use strict';
|
|
var dbpath = MongoRunner.dataPath + 'nojournal';
|
resetDbpath(dbpath);
|
|
var conn = MongoRunner.runMongod({
|
dbpath: dbpath,
|
noCleanData: true,
|
journal: '',
|
});
|
assert.neq(null, conn, 'mongod was unable to start up');
|
|
var coll = conn.getDB('test').nojournal;
|
|
var awaitShell = startParallelShell(function() {
|
for (var iter = 0; iter < 1000; ++iter) {
|
var bulk = db.nojournal.initializeUnorderedBulkOp();
|
for (var i = 0; i < 100; ++i) {
|
bulk.insert({a: i});
|
}
|
var bulkRes = bulk.execute();
|
var syncRes = db.nojournal.insert({b: iter}, {j: true});
|
if (iter < 100) {
|
assert.writeOK(bulkRes);
|
assert.writeOK(syncRes);
|
}
|
}
|
}, conn.port);
|
|
assert.soon(function() {
|
var count = coll.count({b: {$exists: true}});
|
if (count >= 100) {
|
// Don't create a checkpoint before killing the mongod.
|
MongoRunner.stopMongod(conn, 9);
|
return true;
|
}
|
return false;
|
});
|
|
conn = MongoRunner.runMongod({
|
dbpath: dbpath,
|
noCleanData: true,
|
nojournal: '',
|
});
|
assert.neq(null, conn, 'mongod was unable to start up');
|
|
coll = conn.getDB('test').nojournal;
|
coll.validate(true);
|
|
print('list of collection names: ' + tojson(conn.getDB('test').getCollectionNames()));
|
print('total number of documents is ' + coll.find().itcount());
|
print('number of documents with b field is ' + coll.find({b: {$exists: true}}).itcount());
|
|
assert.lte(100, coll.count({b: {$exists: true}}));
|
assert.lte(100, coll.find({b: {$exists: true}}).itcount());
|
|
MongoRunner.stopMongod(conn);
|
})();
|
From the output I see that we rename _mdb_catalog.wt to _mdb_catalog.wt.1 and create a new KVCatalog on startup. This means that I can no longer access the data I had created when journaling was enabled if I start the mongod up with --nojournal after an unclean shutdown.
m27000| 2015-06-22T10:50:05.649-0400 W - [initandlisten] Detected unclean shutdown - /data/db/job0/mongorunner/nojournal/mongod.lock is not empty.
|
m27000| 2015-06-22T10:50:05.649-0400 W STORAGE [initandlisten] Recovering data from the last clean checkpoint.
|
m27000| 2015-06-22T10:50:05.649-0400 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=15G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=false,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
|
m27000| 2015-06-22T10:50:05.659-0400 E STORAGE [initandlisten] WiredTiger (17) [1434984605:659472][20475:0x7f935e489b80], WT_SESSION.create: /data/db/job0/mongorunner/nojournal/sizeStorer.wt: File exists
|
m27000| 2015-06-22T10:50:05.659-0400 I STORAGE [initandlisten] WiredTiger unexpected file sizeStorer.wt found, renamed to sizeStorer.wt.1
|
m27000| 2015-06-22T10:50:05.664-0400 E STORAGE [initandlisten] WiredTiger (17) [1434984605:664016][20475:0x7f935e489b80], WT_SESSION.create: /data/db/job0/mongorunner/nojournal/_mdb_catalog.wt: File exists
|
m27000| 2015-06-22T10:50:05.664-0400 I STORAGE [initandlisten] WiredTiger unexpected file _mdb_catalog.wt found, renamed to _mdb_catalog.wt.1
|
m27000| 2015-06-22T10:50:05.671-0400 I CONTROL [initandlisten] MongoDB starting : pid=20475 port=27000 dbpath=/data/db/job0/mongorunner/nojournal 64-bit host=hanamizu
|
...
|
list of collection names: [ ]
|
total number of documents is 0
|
number of documents with b field is 0
|
If I start the mongod with journaling enabled, or do a clean shutdown, then my data is still accessible. Are there changes on the WiredTiger side in order to get disabling journaling after an unclean shutdown to work?
CC dan@10gen.com
|
|
Thanks for verifying. Will schedule for the integration team
|
|
bruce.lucas@mongodb.com Your analysis is correct. If the config string does not specify that logging is off, it will assume the default (on creation), or the saved previous value (on existing database). So in your case it is retrieving that logging was enabled from the base configuration file.
We do have code that handles and tests toggling the logging setting (motivated by SERVER-16692). I will verify the unclean shutdown behavior but I think it does work as you expect. (The test does clean shutdowns with toggles.)
|
|
This may be partly an integration layer issue. With journaling enabled the integration layer is passing the following config string:
create,cache_size=8G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
|
with --nojournal it is passing the following string:
create,cache_size=8G,session_max=20000,eviction=(threads_max=4),statistics=(fast),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
|
I would have expected it to include something like "log=(enabled=false,...)". However one issue to consider is what to do if there was an unclean shutdown so recovery is needed but --nojournal has been passed. Ideally I would expect in that case for it to run recovery from the existing journal, but not to do any further journaling, I think.
|
Generated at Thu Feb 08 03:47:04 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.