[SERVER-18190] Secondary reads may block replication Created: 23/Apr/15  Updated: 19/Sep/15  Resolved: 08/May/15

Status: Closed
Project: Core Server
Component/s: Concurrency, Querying
Affects Version/s: 3.0.2
Fix Version/s: 3.0.4, 3.1.3

Type: Bug Priority: Critical - P2
Reporter: Bruce Lucas (Inactive) Assignee: Geert Bosch
Resolution: Done Votes: 2
Labels: ET
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: HTML File gdbmon.html, PNG File secondary_reads.png
Issue Links:
Duplicate
is duplicated by SERVER-18200 Long running queries on secondary cau... Closed
is duplicated by SERVER-18325 MongoDb 3.0.2 background index creati... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Sprint: Quint Iteration 3
Participants:

 Description   
Issue Status as of Jun 09, 2015

ISSUE SUMMARY
Reading from secondary nodes in a replica set may block the application of replicated write operations, because long-running read operations may not yield appropriately.

USER IMPACT
High-volume read operations on secondary nodes may cause those nodes to experience increased replication lag, which in turn may cause reads from them to return out-of-date data.

In extreme cases the affected node may fall so far behind that it becomes "stale" and must be fully resynchronized. If enough nodes in a replica set become stale, availability may be impacted.
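
A quick way to watch for the replication lag described above (not part of the original report) is to compare member optimes from replSetGetStatus. The sketch below uses PyMongo; the connection string and the 30-second threshold are illustrative assumptions.

from datetime import timedelta
from pymongo import MongoClient

# Illustrative sketch: report per-secondary replication lag from replSetGetStatus.
# The host and the 30-second threshold are assumptions, not values from this ticket.
client = MongoClient("mongodb://localhost:27017")
status = client.admin.command("replSetGetStatus")

primary = next((m for m in status["members"] if m["stateStr"] == "PRIMARY"), None)
if primary is not None:
    for member in status["members"]:
        if member["stateStr"] != "SECONDARY":
            continue
        lag = primary["optimeDate"] - member["optimeDate"]
        print("%s lags by %ds" % (member["name"], lag.total_seconds()))
        if lag > timedelta(seconds=30):
            print("  warning: lag is growing; long secondary reads may be blocking the applier")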

WORKAROUNDS
The preferred workaround is to suspend all read operations on secondary nodes.

Alternatively, the oplog size can be increased on secondary nodes. This is only a suitable workaround if the nodes go through periods with no read traffic, so that replication can catch up.
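
As a driver-level illustration of the first workaround (keeping reads off the secondaries), an application can pin its read preference to the primary until it runs a fixed version. This sketch is not from the ticket; it uses PyMongo, and the URI, database, and collection names are assumptions.

from pymongo import MongoClient, ReadPreference

# Illustrative sketch: route all reads for this collection to the primary so a
# long scan cannot stall a secondary's replication applier. The URI and names
# are placeholders, not values from this ticket.
client = MongoClient("mongodb://host1:27017,host2:27017/?replicaSet=rs0")

coll = client.test.get_collection("c", read_preference=ReadPreference.PRIMARY)
doc = coll.find_one({"x": True})  # served by the primary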

AFFECTED VERSIONS
MongoDB 3.0.0 through 3.0.3.

FIX VERSION
The fix is included in the 3.0.4 production release.

Original description

  • Three table scans, each taking 5-10 seconds and returning no results, were run on the secondary against a collection of about 12M documents (marked A-B, C-D, E-F in the attached secondary_reads.png). At the same time, documents were inserted into the same collection on the primary, driving replication traffic.
  • During the table scans the replication rate falls to 0 and replication lag builds.
  • Graphs show straight lines between the beginning and end of the stalls, indicating that the serverStatus command that the data collection depends on was blocked as well.
  • The primary is not similarly affected by the same table scans.
  • Problem reproduces on both WiredTiger and MMAPv1 (see the reproduction sketch below).
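
As referenced in the last bullet, here is a rough driver-level approximation of that scenario (not the reporter's actual test): insert on the primary to drive replication traffic while an unindexed query forces a collection scan on a secondary. Hosts, document counts, and field names below are assumptions.

import threading
from pymongo import MongoClient, ReadPreference

# Rough reproduction sketch: writes on the primary plus an unindexed scan on a
# secondary. All connection details and sizes are placeholders.
client = MongoClient("mongodb://host1:27017,host2:27017/?replicaSet=rs0")
coll = client.test.c

def insert_load(n=100000):
    # Inserts go to the primary and generate oplog traffic for the secondaries.
    for i in range(n):
        coll.insert_one({"i": i})

def secondary_scan():
    # {"x": True} has no index, so this is a COLLSCAN; route it to a secondary.
    scan_coll = client.test.get_collection("c", read_preference=ReadPreference.SECONDARY)
    return list(scan_coll.find({"x": True}))  # matches nothing, scans everything

writer = threading.Thread(target=insert_load)
writer.start()
secondary_scan()  # on affected versions, replication lag climbs while this runs
writer.join()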


 Comments   
Comment by Ramon Fernandez Marina [ 11/Jun/15 ]

m.cuk, apologies for the inaccuracies; I'll update JIRA. 3.0.4 was delayed about a week, but the 3.0.4-rc0 release candidate contains a fix for this issue and is available for download. If you were affected by this bug, it would be very helpful if you could try 3.0.4-rc0 and confirm that your problem is indeed fixed.

Thanks,
Ramón.

Comment by Bruce Lucas (Inactive) [ 11/Jun/15 ]

A release candidate 3.0.4-rc0 is available for testing (only) in the "development releases" section of the download site. It is not ready for production use yet, but if this release candidate passes our tests it will become the production 3.0.4 release.

Comment by Matjaž Čuk [ 11/Jun/15 ]

So the JIRA versions page says:
3.0.4 09/Jun/15 Stable

Today is 11/Jun/15 and under downloads there is still only 3.0.3.

Comment by Githook User [ 13/May/15 ]

Author:

{u'username': u'GeertBosch', u'name': u'Geert Bosch', u'email': u'geert@mongodb.com'}

Message: SERVER-18190: Make ParallelBatchWriterMode use a LockManager managed lock

(cherry picked from commit 465ba933e8d6f5ad9173c4c806686b915bfffe1c)

Conflicts:
src/mongo/db/concurrency/lock_state.cpp
src/mongo/db/stats/fill_locker_info.cpp
src/mongo/db/stats/fill_locker_info.h
Branch: v3.0
https://github.com/mongodb/mongo/commit/1a4f1719af7b4959564df7c22d72ec03f3938a91

Comment by Ramon Fernandez Marina [ 11/May/15 ]

m.cuk, we're currently working on 3.0.3. Once we have a timeframe for 3.0.4 we'll update the JIRA versions page.

Comment by Matjaž Čuk [ 11/May/15 ]

Hi,

do you have any time estimate for when 3.0.4 will be released?

Comment by Githook User [ 08/May/15 ]

Author:

{u'username': u'GeertBosch', u'name': u'Geert Bosch', u'email': u'geert@mongodb.com'}

Message: SERVER-18190: Fix constant for fillLockerInfo array size
Branch: master
https://github.com/mongodb/mongo/commit/5f719f5266b64fe0e35fa38d842bbca2319720f3

Comment by Githook User [ 07/May/15 ]

Author:

{u'username': u'GeertBosch', u'name': u'Geert Bosch', u'email': u'geert@mongodb.com'}

Message: SERVER-18190: Make ParallelBatchWriterMode use a LockManager managed lock
Branch: master
https://github.com/mongodb/mongo/commit/465ba933e8d6f5ad9173c4c806686b915bfffe1c

Comment by Daniel Pasette (Inactive) [ 29/Apr/15 ]

The patch is in progress. The commit will show up as a comment on this ticket as usual.

Comment by David Murphy [ 29/Apr/15 ]

Is there a GitHub commit for this yet? We are trying to do some testing with 3.0, but this bug is a blocker for one of our tests. Even a manual patch that lets the test proceed would be appreciated until 3.0.4 is tagged on GitHub.

Thanks
David

Comment by Kaloian Manassiev [ 23/Apr/15 ]

From looking at the stacks, I think the problem is that yielding (QueryYield::yieldAllLocks) does not know about the parallel batch writer (PBWR) lock, which is acquired through RAII objects and is not managed by the lock manager. As a result, even though all other locks get yielded, the PBWR lock is still held.

I think the only way to fix this would be to move the PBWR lock under the lock manager, so that Locker::saveLockStateAndUnlock would release it as well.

This is definitely a regression from 2.6, because back then the yielding code went directly through the RAII objects on the context.

Comment by Bruce Lucas (Inactive) [ 23/Apr/15 ]

The log shows that the table scans are yielding, but that does not seem to be sufficient to avoid blocking replication.

2015-04-23T12:52:11.122-0400 I QUERY    [conn4] query test.c query: { x: true } planSummary: COLLSCAN ntoskip:0 nscanned:0 nscannedObjects:14490813 keyUpdates:0 writeConflicts:0 numYields:113210 nreturned:0 reslen:20 locks:{ Global: { acquireCount: { r: 113211 } }, Database: { acquireCount: { r: 113211 } }, Collection: { acquireCount: { r: 113211 } } } 10843ms

Comment by Eric Milkie [ 23/Apr/15 ]

I would have expected the table scans to yield to other operations, including the replication applier. The investigation may want to start by examining the yielding behavior there.
