[SERVER-23935] Disable oplog sampling in queryable backup mode
Created: 26/Apr/16  Updated: 02/May/18  Resolved: 19/Oct/17

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 3.1.8
Fix Version/s: 3.4.15, 3.6.0-rc1

Type: Improvement
Priority: Major - P3
Reporter: Alex Etling
Assignee: Neha Khatri
Resolution: Done
Votes: 0
Labels: neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Backwards Compatibility: Fully Compatible
Backport Requested: v3.4
Sprint: Storage 2017-10-23
Participants:
Case:

Description

When running with --queryableBackupMode, WiredTigerKVEngine::initRsOplogBackgroundThread() should not start a thread, and therefore not perform oplog sampling.

In addition, we should consider adding a separate configuration option to disable this oplog sampling without needing to be running with --queryableBackupMode.
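The requested behavior can be modeled as a simple guard around thread startup. The following is a minimal Python sketch of that control flow, not the server's C++ code: `StorageOptions`, `init_oplog_background_thread`, and the `sample_oplog` callback are hypothetical names used only for illustration.

```python
import threading
from dataclasses import dataclass


@dataclass
class StorageOptions:
    # Hypothetical stand-in for the server's startup options.
    queryable_backup_mode: bool = False


def init_oplog_background_thread(options, sample_oplog):
    """Start the oplog background thread unless running in queryable backup mode.

    Returns the started thread, or None when startup is skipped.
    """
    if options.queryable_backup_mode:
        # The oplog is never truncated in this mode, so no thread is
        # started and no sampling takes place.
        return None
    thread = threading.Thread(target=sample_oplog, name="oplog-sampler", daemon=True)
    thread.start()
    return thread
```

With `queryable_backup_mode=True` the function returns `None` and `sample_oplog` is never called; otherwise the callback runs on the new thread.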

Original Description

In version 3.1.8, MongoDB added a new feature: SERVER-19551

On startup WiredTiger now needs to read values from the oplog to figure out where to place the milestones:

2016-04-26T15:42:22.066+0000 I STORAGE  [initandlisten] The size storer reports that the oplog contains 19279531 records totaling to 53948096450 bytes
2016-04-26T15:42:22.410+0000 I STORAGE  [initandlisten] Sampling from the oplog between Apr 23 22:02:46:71 and Apr 26 14:10:01:d1 to determine where to place markers for truncation
2016-04-26T15:42:22.410+0000 I STORAGE  [initandlisten] Taking 1004 samples and assuming that each section of oplog contains approximately 191863 records totaling to 536872169 bytes
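The figures in these log lines are consistent with one another and can be reproduced with a little arithmetic. The sketch below assumes a 512 MiB target size per oplog section and 10 samples per section. Neither constant appears in the log: both are inferred from the reported numbers, and the function name is hypothetical.

```python
def estimate_oplog_sampling(num_records, total_bytes,
                            min_bytes_per_section=512 * 1024 * 1024,
                            samples_per_section=10):
    """Reproduce the startup sampling estimates (constants are assumptions)."""
    # Records needed to fill one section at the average record size,
    # rounded up (integer ceiling division).
    records_per_section = -((-min_bytes_per_section * num_records) // total_bytes)
    # Bytes covered by that many average-sized records, rounded down.
    bytes_per_section = records_per_section * total_bytes // num_records
    # Total number of samples taken across the whole oplog, rounded down.
    num_samples = samples_per_section * num_records // records_per_section
    return {
        "records_per_section": records_per_section,
        "bytes_per_section": bytes_per_section,
        "num_samples": num_samples,
    }


# Values reported by the size storer in the log above.
print(estimate_oplog_sampling(19279531, 53948096450))
# {'records_per_section': 191863, 'bytes_per_section': 536872169, 'num_samples': 1004}
```

Under these assumptions the oplog splits into roughly 100 sections, and every one of the 1004 samples is a separate random read into the oplog.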

This process is fine if all of the data is stored locally on disk. The problem arises with external storage (for us, Amazon EBS). Sampling from cold EBS takes between 20 and 30 minutes, which leads to very slow startup times.

Is there a good way to speed up this process / bring startup times back to 3.1.7 levels for EBS backed instances?



Comments
Comment by Githook User [ 02/May/18 ]

Author: nehakhatri5 <neha.khatri@mongodb.com>

Message: SERVER-23935 Disable oplog sampling in queryable backup mode

In queryable backup mode the oplog truncation would never occur. Hence oplog sampling
is disabled in this mode.

(cherry picked from commit 1c96c3561dda50fc3ba6d98decef1c0d3c9f60df)
Branch: v3.4
https://github.com/mongodb/mongo/commit/ff63eee6bfba4d0cd4c1eaf1a95e5ada19f26afa

Comment by Asya Kamsky [ 18/Mar/18 ]

bartle, I've requested a backport in BACKPORT-1888. It will be considered by the team during the next triage session.

Comment by David Bartley [ 16/Mar/18 ]

Would it be possible to backport this to 3.4? I believe it should be a fairly trivial backport.

Comment by Neha Khatri [ 19/Oct/17 ]

A code change has been pushed to master to disable oplog sampling in queryableBackupMode.

Comment by Githook User [ 19/Oct/17 ]

Author: nehakhatri5 <neha.khatri@mongodb.com>

Message: SERVER-23935 Disable oplog sampling in queryable backup mode

In queryable backup mode the oplog truncation would never occur. Hence oplog sampling
is disabled in this mode.
Branch: master
https://github.com/mongodb/mongo/commit/1c96c3561dda50fc3ba6d98decef1c0d3c9f60df

Comment by Alex Etling [ 10/May/16 ]

Max,
This is awesome! I look forward to when this is implemented.

As far as documentation goes, I think this might be a semi-common use case. It is probably worth documenting the `--queryableBackupMode` option and how it can be used to prevent sampling.

Alex

Comment by Max Hirschhorn [ 10/May/16 ]

Hi paetling@gmail.com,

I wanted to give you an update with regard to some of the internal discussion we've been having around this ticket. While I'm not familiar enough with cold EBS to estimate how long ~1000 random walks in a WiredTiger B-tree should take, 20-30 minutes seems like an undesirably long time.

SERVER-20368 would have made it possible to avoid sampling the oplog when the mongod is started as a standalone node. The intention behind SERVER-20368 was to give users a way to perform an update that changed the size of an oplog entry, because that became explicitly forbidden as part of the design for SERVER-19551. However, SERVER-20529 made it an error to change the size of a document in any capped collection, so SERVER-20368 was thought to be unnecessary.

Given that your use case is read-only in order to verify the snapshot, I agree it makes sense not to have to sample from the oplog on startup because a truncation of the oplog shouldn't ever occur. We are therefore going to transform this ticket into a feature request so that WiredTigerKVEngine::initRsOplogBackgroundThread() skips starting a thread and thus skips sampling the oplog when the undocumented --queryableBackupMode option is specified (SERVER-593) in versions 3.3+. While I don't think we'd consider backporting this change to 3.2 under a different server parameter, it would be possible for you to use a pre-3.4 release (i.e. a 3.3 development version) to check that your snapshots are intact.

Let us know if you have any further questions or concerns.

Thanks,
Max

Comment by Ramon Fernandez Marina [ 10/May/16 ]

Thanks for your report, paetling@gmail.com. We're dispatching it internally for consideration.

Comment by Travis Thieman [ 26/Apr/16 ]

To clarify, this is an issue primarily when attempting to start a mongod off of an uninitialized EBS volume, e.g. one that was just restored from a snapshot and has not yet read its data files in from S3. Details here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-initialize.html

On an already-initialized EBS volume, startup times are only a second or two slower than pre-3.1.8 speeds in our experience (50GB oplog).

Our use case is attempting to verify that an EBS snapshot contains a valid Mongo data directory. To do this, we restore the snapshot to a volume, mount the volume, start a mongod against the volume's data directory, and attempt to query our collections.
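The "start a mongod" step of this workflow could look roughly like the sketch below, assuming a build that includes the change from this ticket so that `--queryableBackupMode` skips oplog sampling. The helper name, mount path, and port are hypothetical, and the sketch only assembles the command line: it has not been run against a live server.

```python
def build_verification_command(dbpath, port=27018):
    """Assemble a mongod command line for checking a restored snapshot."""
    return [
        "mongod",
        "--dbpath", dbpath,        # data directory on the mounted volume
        "--port", str(port),       # keep clear of any mongod already running
        "--queryableBackupMode",   # the option discussed in this ticket
    ]


# Hypothetical mount point for the volume restored from the snapshot.
command = build_verification_command("/mnt/restored-snapshot/data")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.Popen`, after which the collections can be queried with any client to confirm that the data directory is valid.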

Generated at Thu Feb 08 04:04:53 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.