[SERVER-5685] Have multiple threads applying oplog ops Created: 23/Apr/12  Updated: 09/Jul/13  Resolved: 21/Jun/12

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: 2.1.2

Type: New Feature
Priority: Major - P3
Reporter: Kristina Chodorow (Inactive)
Assignee: Eric Milkie
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-4392 split network thread out from worker ... Closed

Comments
Comment by auto [ 22/Jun/12 ]

Author: Eric Milkie <milkie@10gen.com> (2012-06-22T14:23:44-07:00)

Message: SERVER-5685 don't block for 1 second in peek() calls

This was causing delays when waiting on writes on the primary since
we didn't apply single ops immediately as they came in on the secondary.
Branch: master
https://github.com/mongodb/mongo/commit/e5b586a0fd0978802560e2319da38e08e3bab7df

Comment by auto [ 20/Jun/12 ]

Author: Eric Milkie <milkie@10gen.com> (2012-06-20T09:03:54-07:00)

Message: SERVER-5685 do auth correctly for oplog applying
Branch: master
https://github.com/mongodb/mongo/commit/ad6c8701bc53afde4212c3eb5f9e1466181296a0

Comment by auto [ 19/Jun/12 ]

Author: Eric Milkie <milkie@10gen.com> (2012-06-19T12:07:58-07:00)

Message: SERVER-5685 check end-batch-early status for all calls to tryPopAndWaitForMore

By not checking the return status, it was possible to build a replication oplog batch
that contained both a $cmd and other operations, which would interleave them incorrectly.
Branch: master
https://github.com/mongodb/mongo/commit/7d932daed48e51cea98129be39d2ae296aa27b3f
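The fix described in this commit can be illustrated with a small Python sketch. It is not the server code: `try_pop_into` is a hypothetical stand-in for a pop helper that reports an end-batch-early status, and the sketch assumes that a command is an oplog entry with op type "c". The point is only that the caller has to honor the returned status on every call, otherwise a command can land in the same batch as ordinary operations.

```python
from collections import deque


def try_pop_into(queue: deque, batch: list) -> bool:
    """Move at most one op from queue into batch.

    Returns True when the batch must end now (hypothetical helper).
    """
    if not queue:
        return True  # nothing left to take
    if queue[0].get("op") == "c":  # assumption: "c" marks a command
        if not batch:
            batch.append(queue.popleft())  # a command forms a batch by itself
        return True  # otherwise leave the command for the next batch
    batch.append(queue.popleft())
    return False


def next_batch(queue: deque, limit: int = 128) -> list:
    batch = []
    while len(batch) < limit:
        # The status is checked on every call; ignoring it is the bug
        # this commit message describes.
        if try_pop_into(queue, batch):
            break
    return batch
```

With a queue holding two inserts, a command, and one more insert, successive calls to `next_batch` return the two inserts, then the command alone, then the last insert.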

Comment by auto [ 19/Jun/12 ]

Author: Eric Milkie <milkie@10gen.com> (2012-06-19T06:50:18-07:00)

Message: SERVER-5685 multiple threads prefetch and apply ops on secondaries

On a secondary, ops are now pulled off the network in one thread and
placed in a deque. Another thread pulls off the ops in batches. These
batches are presented to a prefetch thread pool (which prefetches all
the pages for index traversal and data), and a writer thread pool
(where operations are applied to the database). Each op is assigned a
thread based on a hash of its namespace, in order to maintain write
order within a given document. During the write phase, all readers
are blocked (even if a given write hits a yieldpoint).
Write concern is updated on the primary by using a separate query
cursor.
Branch: master
https://github.com/mongodb/mongo/commit/bdded78e9f2ba6c703811a7e20e0b90adfc0d5b3
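The namespace-hash assignment described in this commit message can be sketched in Python as follows. This is an illustration under stated assumptions, not the server implementation: the pool size of 4, the use of CRC32 as the hash, and the names `writer_index`, `partition`, and `apply_batch` are all hypothetical. Because every op for a namespace lands in the same writer's list, and each list is applied sequentially, ops on one namespace keep their oplog order even though namespaces are applied in parallel.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

NUM_WRITERS = 4  # hypothetical pool size, chosen only for illustration


def writer_index(ns: str, num_writers: int = NUM_WRITERS) -> int:
    # CRC32 is stable across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(ns.encode("utf-8")) % num_writers


def partition(batch, num_writers: int = NUM_WRITERS):
    """Split a batch into one ordered list of ops per writer."""
    queues = [[] for _ in range(num_writers)]
    for op in batch:
        queues[writer_index(op["ns"], num_writers)].append(op)
    return queues


def apply_batch(batch, apply_op, num_writers: int = NUM_WRITERS):
    """Apply each writer's list on its own thread, in list order."""
    queues = partition(batch, num_writers)
    with ThreadPoolExecutor(max_workers=num_writers) as pool:
        futures = [
            pool.submit(lambda q=q: [apply_op(op) for op in q]) for q in queues
        ]
        for future in futures:
            future.result()  # re-raise any failure from a writer
```

Two different namespaces may hash to the same writer; that costs parallelism but not correctness, since the shared list is still applied in order.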

Comment by auto [ 19/Jun/12 ]

Author: Eric Milkie <milkie@10gen.com> (2012-06-19T06:49:05-07:00)

Message: SERVER-5685 display byte size of bg buffer for log level 2
Branch: master
https://github.com/mongodb/mongo/commit/d1a4e07a69a04448852e5141ca19b23310929d22

Comment by auto [ 13/Jun/12 ]

Author: Eric Milkie <milkie@10gen.com> (2012-06-13T14:49:53-07:00)

Message: SERVER-5685 move waitForMore() to Interface because SyncTail will need it
Branch: master
https://github.com/mongodb/mongo/commit/0729e02da009429683b791949e7c2a2912f2428c

Comment by auto [ 13/Jun/12 ]

Author: Eric Milkie <milkie@10gen.com> (2012-06-13T12:43:37-07:00)

Message: SERVER-5685 add waitForMore() helper to bgsync
Branch: master
https://github.com/mongodb/mongo/commit/89fc01bee31cf4812a737123a755ed79452ed2db

Comment by auto [ 13/Jun/12 ]

Author: Eric Milkie <milkie@10gen.com> (2012-06-13T08:01:37-07:00)

Message: SERVER-5685 modifications to prefetch

Comment by auto [ 13/Jun/12 ]

Author: Eric Milkie <milkie@10gen.com> (2012-06-13T09:57:43-07:00)

Message: SERVER-5685 add peek method to threadsafe queue
Branch: master
https://github.com/mongodb/mongo/commit/4383b3de114404a485d3ad91110ea0fbff2c7790
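For readers unfamiliar with the idea, a peek on a thread-safe queue returns the head item without removing it, so a consumer can inspect the next op before deciding whether to take it. The Python sketch below is a generic illustration and makes no claim about the server's queue class; `PeekableQueue` and its method names are hypothetical. It offers both a non-blocking peek and one that waits up to a caller-chosen timeout.

```python
import threading
from collections import deque


class PeekableQueue:
    """Minimal thread-safe FIFO with peek (illustrative only)."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()

    def push(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify_all()  # wake every waiting peeker

    def peek(self):
        """Return the head without removing it, or None if empty. Never blocks."""
        with self._cond:
            return self._items[0] if self._items else None

    def blocking_peek(self, timeout):
        """Wait up to timeout seconds for an item, then behave like peek()."""
        with self._cond:
            self._cond.wait_for(lambda: bool(self._items), timeout=timeout)
            return self._items[0] if self._items else None

    def pop(self):
        """Remove and return the head, or None if empty."""
        with self._cond:
            return self._items.popleft() if self._items else None
```

The choice of timeout matters for latency: a consumer that waits a long time in a blocking peek will be slow to act on ops it has already collected.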

Comment by auto [ 06/Jun/12 ]

Author: Dwight (dwight) <dwight@10gen.com>

Message: SERVER-5685 Implementation of a special writers-only mode which could be used for speeding up replication application on secondaries.
Branch: master
https://github.com/mongodb/mongo/commit/f60f9a32257c9e9c08b204ae6a405932f1964108

Comment by auto [ 01/Jun/12 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: prefetch code for support of SERVER-5685
Branch: master
https://github.com/mongodb/mongo/commit/77649a165ee800c7b519b91461084b7a619d279e

Comment by Kristina Chodorow (Inactive) [ 29/May/12 ]

@Scott: the primary is no longer keeping track of sync by where the secondary has getMore'd to. The secondary keeps a second connection open, which it updates whenever lastOpTimeWritten is updated.

Comment by Scott Hernandez (Inactive) [ 09/May/12 ]

How will this interact with oplog queries upstream? When do the queries get sent for more, before or during the creation of the batches?

These questions of course matter for the replication state from the primary's point of view.

Comment by Eric Milkie [ 09/May/12 ]

Summary of implementation plan:
On a replica set secondary, we create a pool of writer threads (32 threads in the pool). We divide a region of the oplog into batches: a batch is 128 operations or a db command, whichever comes first. Each oplog item in the batch is assigned a writer thread from the pool based on a hash of its collection. Each writer thread pretouches the records involved in each of its operations, using read locks. At the end of the pretouching phase, a barrier synchronizes all writer threads. At this point, a special replication lock is taken, which blocks any readers. Then all threads grab w locks and apply the operations in their share of the batch. If all threads succeed, we release the special reader-blocking lock and update the primary as to our replication progress (write concern). If any thread fails, we restart the writing phase from the beginning while continuing to hold the special lock.
Once a batch has been applied successfully, we yield to readers and then begin again with a new batch of operations from the oplog.
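The two-phase flow in this plan (pretouch, synchronize, then write with readers blocked and retry on failure) can be sketched in Python. Everything here is an illustrative assumption rather than the planned server code: `run_phase` and `apply_batch_in_phases` are hypothetical names, joining the threads stands in for the barrier, a plain `threading.Lock` stands in for the special replication lock, and `max_attempts` is added only so the sketch cannot loop forever. Retrying the write phase also assumes that reapplying an op is safe.

```python
import threading


def run_phase(shares, fn):
    """Run fn over each share in its own thread; return True if no call raised."""
    failures = []

    def work(ops):
        try:
            for op in ops:
                fn(op)
        except Exception as exc:
            failures.append(exc)

    threads = [threading.Thread(target=work, args=(ops,)) for ops in shares]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # joining every thread plays the role of the barrier
    return not failures


def apply_batch_in_phases(shares, pretouch, apply_op, reader_block, max_attempts=3):
    """shares holds one list of ops per writer, already assigned by hash."""
    run_phase(shares, pretouch)  # phase 1: pretouch, readers still allowed
    with reader_block:  # phase 2: readers are blocked from here on
        for _ in range(max_attempts):
            if run_phase(shares, apply_op):
                return True  # lock released on exit; caller reports progress
            # on failure, restart the write phase while still holding the lock
    return False
```

A caller would report replication progress to the primary only after this function returns True, and then move on to the next batch.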
