[SERVER-23935] Disable oplog sampling in queryable backup mode Created: 26/Apr/16 Updated: 02/May/18 Resolved: 19/Oct/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | 3.1.8 |
| Fix Version/s: | 3.4.15, 3.6.0-rc1 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Alex Etling | Assignee: | Neha Khatri |
| Resolution: | Done | Votes: | 0 |
| Labels: | neweng | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Backport Requested: | v3.4 | ||||
| Sprint: | Storage 2017-10-23 | ||||
| Participants: | |||||
| Case: | (copied to CRM) | ||||
| Description |
|
When running with --queryableBackupMode, WiredTigerKVEngine::initRsOplogBackgroundThread() should not start a thread, and therefore not perform oplog sampling. In addition, we should consider adding a separate configuration option to disable this oplog sampling without needing to be running with --queryableBackupMode.
Original Description
In version 3.1.8 mongo added a new feature: On startup WiredTiger now needs to read values from the oplog to figure out where to place the milestones:
This process is fine if all of the data is stored locally on disk. The problem is when you are using external storage (for us, Amazon EBS). The sampling from cold EBS is taking somewhere between 20 and 30 minutes leading to incredibly slow startup times. Is there a good way to speed up this process / bring startup times back to 3.1.7 levels for EBS backed instances? |
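The requested behavior can be illustrated with a toy model. This is a minimal Python sketch, not MongoDB's C++ code: `init_oplog_background_thread` and `sample_oplog` are hypothetical names standing in for WiredTigerKVEngine::initRsOplogBackgroundThread() and the sampling work it would start. It only shows the shape of the guard: in queryable backup mode no thread is created, so no sampling happens.

```python
import threading
from typing import Callable, Optional


def init_oplog_background_thread(
    queryable_backup_mode: bool,
    sample_oplog: Callable[[], None],
) -> Optional[threading.Thread]:
    """Hypothetical stand-in for the startup hook discussed in this ticket."""
    if queryable_backup_mode:
        # The oplog is never truncated in this mode, so sampling is skipped
        # by not starting the background thread at all.
        return None
    thread = threading.Thread(target=sample_oplog, daemon=True)
    thread.start()
    return thread
```

Calling the function with `queryable_backup_mode=True` returns `None` without invoking `sample_oplog`; with `False` it returns the started thread.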
| Comments |
| Comment by Githook User [ 02/May/18 ] |
|
Author: {'email': 'neha.khatri@mongodb.com', 'name': 'nehakhatri5', 'username': 'nehakhatri5'} Message: In queryable backup mode the oplog truncation would never occur. Hence oplog sampling (cherry picked from commit 1c96c3561dda50fc3ba6d98decef1c0d3c9f60df) |
| Comment by Asya Kamsky [ 18/Mar/18 ] |
|
bartle I've requested a backport in BACKPORT-1888 - it will be considered by the team during the next triage session. |
| Comment by David Bartley [ 16/Mar/18 ] |
|
Would it be possible to backport this to 3.4? I believe it should be a fairly trivial backport. |
| Comment by Neha Khatri [ 19/Oct/17 ] |
|
A code change has been pushed to master to prohibit oplog sampling in queryableBackupMode. |
| Comment by Githook User [ 19/Oct/17 ] |
|
Author: {'email': 'neha.khatri@mongodb.com', 'name': 'nehakhatri5'} Message: In queryable backup mode the oplog truncation would never occur. Hence oplog sampling |
| Comment by Alex Etling [ 10/May/16 ] |
|
Max, As far as documentation goes, I think this might be a semi-common use case. It is probably worth documenting the `--queryableBackupMode` option and how it can be used to prevent sampling. Alex |
| Comment by Max Hirschhorn [ 10/May/16 ] |
|
I wanted to give you an update with regard to some of the internal discussion we've been having around this ticket. While I'm not familiar enough with cold EBS to estimate how long ~1000 random walks in a WiredTiger B-tree should take, 20-30 minutes seems like an undesirably long time.
Given that your use case is read-only in order to verify the snapshot, I agree it makes sense not to have to sample from the oplog on startup because a truncation of the oplog shouldn't ever occur. We are therefore going to transform this ticket into a feature request so that WiredTigerKVEngine::initRsOplogBackgroundThread() skips starting a thread and thus skips sampling the oplog when the undocumented --queryableBackupMode option is specified. Let us know if you have any further questions or concerns. Thanks, |
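For a rough sense of scale, the figures in this comment imply a per-walk cost. The sketch below assumes, purely for illustration, that the roughly 1000 random walks run one after another and account for the whole 20 to 30 minute startup delay; `per_walk_seconds` is a hypothetical helper name.

```python
def per_walk_seconds(total_minutes: float, walks: int = 1000) -> float:
    """Average seconds per random walk if the walks dominate startup time."""
    return total_minutes * 60 / walks


for minutes in (20, 30):
    print(f"{minutes} min / 1000 walks = {per_walk_seconds(minutes):.1f} s per walk")
# 20 min / 1000 walks = 1.2 s per walk
# 30 min / 1000 walks = 1.8 s per walk
```

Under that assumption each walk costs on the order of a second or more on a cold volume, compared with a total slowdown of a second or two reported for an already-initialized volume.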
| Comment by Ramon Fernandez Marina [ 10/May/16 ] |
|
Thanks for your report paetling@gmail.com, we're dispatching it internally for consideration. |
| Comment by Travis Thieman [ 26/Apr/16 ] |
|
To clarify, this is an issue primarily when attempting to start a mongod off of an uninitialized EBS volume, e.g. one that was just restored from a snapshot and has not yet read its data files in from S3. Details here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-initialize.html On an already-initialized EBS volume, startup times are only a second or two slower than pre-3.1.7 speeds in our experience (50GB oplog). Our use case is attempting to verify that an EBS snapshot contains a valid Mongo data directory. To do this, we restore the snapshot to a volume, mount the volume, start a mongod against the volume's data directory, and attempt to query our collections. |
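The "start a mongod against the volume's data directory" step of this workflow, combined with the --queryableBackupMode option discussed elsewhere in this ticket, might look like the sketch below. It only builds the command line and does not launch anything. The helper name `build_verify_command`, the mount path, and the port are assumptions for illustration; per the Fix Version field, sampling is skipped in this mode only from 3.4.15 and 3.6.0-rc1 onward.

```python
import shlex
from typing import List


def build_verify_command(dbpath: str, port: int = 27017) -> List[str]:
    """Build a mongod command line for checking a restored snapshot.

    Hypothetical helper: the caller is expected to have already restored
    the snapshot to a volume and mounted it so that dbpath exists.
    """
    return [
        "mongod",
        "--dbpath", dbpath,
        "--port", str(port),
        "--queryableBackupMode",  # undocumented option named in this ticket
    ]


# Example with an assumed mount point for the restored volume.
cmd = build_verify_command("/mnt/restored/data")
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.Popen`, after which the collections can be queried with any MongoDB client to complete the verification.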