[SERVER-29123] Why is ParallelBatchWriterMode used when Applying Oplogs Created: 11/May/17  Updated: 12/May/17  Resolved: 11/May/17

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Critical - P2
Reporter: deyukong Assignee: Andy Schwerin
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-20328 Allow secondary reads while applying ... Closed
Participants:

 Description   

In the function OpTime SyncTail::multiApply, ParallelBatchWriterMode is used.
If the primary is under heavy write load, read latency on secondaries becomes very unstable. In our production environment, it may be 400 to 1000 ms.
I know that ParallelBatchWriterMode is a global resource lock, which blocks every read operation on every collection.
My question is: why is it necessary to block reads on collections other than oplog.rs?
Why can't a lock on the local db satisfy the concurrency model?



 Comments   
Comment by deyukong [ 12/May/17 ]

To users who are not sensitive to the inconsistent view but are sensitive to read latency. It depends on the user.

Comment by Andy Schwerin [ 11/May/17 ]

Acceptable to whom?

Comment by deyukong [ 11/May/17 ]

Thanks. I had almost the same considerations in mind.
I think the inconsistent view is acceptable, but I still can't tell whether some dangerous side effect is hiding somewhere if I simply change the PBWM to a db lock or a table lock.

Comment by Andy Schwerin [ 11/May/17 ]

In order to perform writes with the same throughput as the primary, secondary nodes need to apply writes in parallel, just as primaries do. However, secondaries cannot tell from the information in the oplog what parallelizations will cause clients to always see a consistent view of the data. To work around this, the secondaries apply oplog entries in arbitrarily selected batches, using rules that ensure that at the end of a batch, the view over the data is consistent. However, during the batch application, the view may not be consistent, and so readers must be blocked. The ParallelBatchWriterMode lock is used to perform this synchronization.
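
The batch scheme described above can be illustrated with a toy model. This is only a sketch and not MongoDB's actual implementation: ToySecondary, OplogEntry, applyBatch, and read are hypothetical names, a std::shared_mutex stands in for the ParallelBatchWriterMode lock, and partitioning by key hash is an assumed rule for keeping writes to the same key in order.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical oplog entry: "set key to value".
struct OplogEntry {
    std::string key;
    int value;
};

// Toy secondary. _batchMutex plays the role of the batch-wide lock:
// the applier holds it exclusively, readers hold it shared.
class ToySecondary {
public:
    void applyBatch(const std::vector<OplogEntry>& batch, std::size_t numWriters) {
        assert(numWriters > 0);
        // Exclusive for the whole batch: readers wait until the batch boundary.
        std::unique_lock<std::shared_mutex> batchLock(_batchMutex);

        // Assumed rule: entries for the same key go to the same writer,
        // so their relative order is preserved.
        std::vector<std::vector<OplogEntry>> perWriter(numWriters);
        for (const auto& e : batch) {
            perWriter[std::hash<std::string>{}(e.key) % numWriters].push_back(e);
        }

        // Writers run in parallel; mid-batch the data may be inconsistent.
        std::vector<std::thread> writers;
        for (const auto& work : perWriter) {
            writers.emplace_back([this, &work] {
                for (const auto& e : work) {
                    std::lock_guard<std::mutex> dataLock(_dataMutex);
                    _data[e.key] = e.value;
                }
            });
        }
        for (auto& t : writers) {
            t.join();
        }
    }  // batchLock released here: the batch boundary.

    // Returns the value for key, or 0 if the key is absent.
    int read(const std::string& key) const {
        std::shared_lock<std::shared_mutex> batchLock(_batchMutex);
        std::lock_guard<std::mutex> dataLock(_dataMutex);
        auto it = _data.find(key);
        return it == _data.end() ? 0 : it->second;
    }

private:
    mutable std::shared_mutex _batchMutex;  // stand-in for the batch lock
    mutable std::mutex _dataMutex;          // protects the map itself
    std::map<std::string, int> _data;
};
```

In this model a call to read issued during applyBatch blocks until the batch finishes, which is the latency effect reported in the description.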

As you have noticed, when there is heavy write load, readers on secondaries can find themselves stuck waiting for batch boundaries to arrive in order to perform reads. This problem is observed in SERVER-21858. However, on storage engines that support MVCC, there is an alternative outlined in SERVER-20328: with some work, we could allow readers on secondaries to read from the view of the data at the end of the most recently completed batch. I would recommend that you watch and vote on SERVER-20328, if you are interested.
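
The idea of reading from the view at the end of the most recently completed batch can be sketched in the same toy style. This does not reflect how SERVER-20328 is or would be implemented: ToySnapshotSecondary, Snapshot, applyBatch, and currentSnapshot are hypothetical names, a full copy of a std::map stands in for a storage-engine snapshot, and a single applier thread is assumed.

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical immutable view of the data at a batch boundary.
using Snapshot = std::map<std::string, int>;

class ToySnapshotSecondary {
public:
    ToySnapshotSecondary() : _published(std::make_shared<const Snapshot>()) {}

    // Assumes one applier thread. The batch is applied to a private copy,
    // so readers never see a partially applied batch.
    void applyBatch(const std::vector<std::pair<std::string, int>>& batch) {
        Snapshot next = *currentSnapshot();  // private working copy
        for (const auto& [key, value] : batch) {
            next[key] = value;
        }
        auto committed = std::make_shared<const Snapshot>(std::move(next));

        // Publishing is a brief pointer swap at the batch boundary.
        std::lock_guard<std::mutex> lock(_publishMutex);
        _published = std::move(committed);
    }

    // Readers take the most recently published snapshot and keep using it,
    // even if later batches are published in the meantime.
    std::shared_ptr<const Snapshot> currentSnapshot() const {
        std::lock_guard<std::mutex> lock(_publishMutex);
        return _published;
    }

private:
    mutable std::mutex _publishMutex;
    std::shared_ptr<const Snapshot> _published;
};
```

Readers here wait only for the pointer swap, not for the whole batch, at the cost of reading data that may be one batch behind.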
