[SERVER-23795] master/slave looks at on-disk size on a resync
Created: 19/Apr/16  Updated: 22/Nov/16  Resolved: 03/Jun/16

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 3.2.9, 3.3.8

Type: Task
Priority: Minor - P4
Reporter: Keith Bostic (Inactive)
Assignee: Michael Cahill (Inactive)
Resolution: Done
Votes: 0
Labels: code-only
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-23526 Replication relies on storage engines... Closed
Backwards Compatibility: Fully Compatible
Backport Completed:
Participants:

 Description   

The WiredTiger inMemory storage engine originally returned zero as the on-disk size for collections.

The replication sub-system assumes on-disk sizes will be non-zero whenever there is content.

Further analysis from sue.loverso:

I think the primary is looking at the on-disk size to determine what to send to a secondary on a resync. I believe forceResync() in mongo/db/repl/master_slave.cpp:423 is the code in question. The listDatabases() command in mongo/db/commands/list_databases.cpp sets the boolean field "empty" to true if the size on disk returned is 0. The forceResync code then calls resyncDrop for any database marked empty. There are no comments nearby explaining that decision.
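The interaction described above can be illustrated with a small standalone model. This is only a sketch of the reported behavior, not the server code: DbInfo, describeDatabase, and databasesTreatedAsEmpty are hypothetical names invented for illustration, and the model only captures the rule that "empty" is derived from a zero on-disk size.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one database entry in a listDatabases-style reply.
struct DbInfo {
    std::string name;
    std::int64_t sizeOnDisk;  // bytes reported by the storage engine
    bool empty;               // derived flag consulted by the resync path
};

// Models the rule from the analysis: "empty" is derived from the size alone.
DbInfo describeDatabase(const std::string& name, std::int64_t sizeOnDisk) {
    return DbInfo{name, sizeOnDisk, sizeOnDisk == 0};
}

// Returns the names a resync would treat as empty under that rule.
std::vector<std::string> databasesTreatedAsEmpty(const std::vector<DbInfo>& infos) {
    std::vector<std::string> result;
    for (const DbInfo& info : infos) {
        if (info.empty) {
            result.push_back(info.name);
        }
    }
    return result;
}

// Usage: a populated database whose engine reports 0 bytes is misclassified.
void demoSizeBasedEmptiness() {
    std::vector<DbInfo> infos = {
        describeDatabase("onDiskDb", 4096),
        describeDatabase("inMemoryDb", 0),  // holds data, but reports size 0
    };
    std::vector<std::string> flagged = databasesTreatedAsEmpty(infos);
    assert(flagged.size() == 1 && flagged[0] == "inMemoryDb");
}
```

In this model, any engine that legitimately reports a size of 0 for a populated database has that database classified as empty.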

There's currently a hack in WiredTiger (see __im_handle_size) to never return a storage size of 0.

It would be good to understand the MongoDB code, and if possible, remove the WiredTiger hack.



 Comments   
Comment by Githook User [ 27/Jul/16 ]

Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>

Message: SERVER-23795 Use storage engine isEmpty instead of storageSize.

(cherry picked from commit b730f98c9e6c26fe564c6a25612634058ae2dd22)
Branch: v3.2
https://github.com/mongodb/mongo/commit/f052ebda5648ba0f5dd792d86ea1b95f065c06c0

Comment by Githook User [ 03/Jun/16 ]

Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>

Message: SERVER-23795 Use storage engine isEmpty instead of storageSize.
Branch: master
https://github.com/mongodb/mongo/commit/b730f98c9e6c26fe564c6a25612634058ae2dd22

Comment by Eric Milkie [ 27/Apr/16 ]

That seems totally reasonable. If all the tests pass, that is sufficient proof for me that this change is good.

Comment by Michael Cahill (Inactive) [ 27/Apr/16 ]

milkie, maybe something like this is the answer?

--- a/src/mongo/db/commands/list_databases.cpp
+++ b/src/mongo/db/commands/list_databases.cpp
@@ -108,7 +108,7 @@ public:
                 b.append("sizeOnDisk", static_cast<double>(size));
                 totalSize += size;
 
-                b.appendBool("empty", size == 0);
+                b.appendBool("empty", entry->isEmpty());
             }
 
             dbInfos.push_back(b.obj());

In the case of WiredTiger / inMemory, databases will only be empty if they have no collections.
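To make the difference between the two rules concrete, here is a minimal sketch contrasting them. FakeDatabaseEntry is a hypothetical, simplified type and not MongoDB's real database catalog entry; its isEmpty() assumes the semantics stated above, namely that a database is empty only when it has no collections.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified catalog entry (not the real server class).
struct FakeDatabaseEntry {
    std::vector<std::string> collections;
    std::int64_t reportedSize;  // on-disk size as reported by the engine

    // Assumed semantics: empty means the database has no collections.
    bool isEmpty() const { return collections.empty(); }
};

// Old rule from the diff: infer emptiness from the reported size.
bool emptyBySize(const FakeDatabaseEntry& entry) {
    return entry.reportedSize == 0;
}

// Proposed rule from the diff: ask the entry directly.
bool emptyByContent(const FakeDatabaseEntry& entry) {
    return entry.isEmpty();
}

// Usage: the two rules disagree only when a populated database reports size 0.
void demoEmptinessRules() {
    FakeDatabaseEntry inMemory{{"users"}, 0};  // has a collection, size 0
    FakeDatabaseEntry ghost{{}, 0};            // no collections at all

    assert(emptyBySize(inMemory));      // old rule: wrongly considered empty
    assert(!emptyByContent(inMemory));  // new rule: not empty
    assert(emptyBySize(ghost) && emptyByContent(ghost));  // both agree
}
```

Under these assumptions, a database with no collections is still reported as empty by both rules, so the filtering of contentless databases is preserved while the dependence on a non-zero storage size goes away.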

If this doesn't seem insane, let me know and I'll work up a patch and put it through testing.

Comment by Eric Milkie [ 20/Apr/16 ]

The code is in master/slave and I think it is simply attempting to skip databases with no content because they are "ghost" databases (since databases are implicitly created and used to be created just by trying to observe them, without any writes involved).
We can try just taking this logic out, and have master/slave replicate all databases returned by listDatabases, regardless of size.
