[SERVER-19122] Out-of-order _id index inserts on secondary contribute to replica lag under WiredTiger Created: 24/Jun/15  Updated: 06/Dec/22  Resolved: 02/Apr/20

Status: Closed
Project: Core Server
Component/s: Replication, WiredTiger
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Bruce Lucas (Inactive) Assignee: Backlog - Storage Execution Team
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on WT-1979 Lower performance of out-of-order ins... Closed
is depended on by SERVER-18908 Secondaries unable to keep up with pr... Closed
Related
Assigned Teams:
Storage Execution
Operating System: ALL
Sprint: QuInt A (10/12/15), QuInt B (11/02/15)
Participants:

 Description   

In some use cases inserts into the _id index are done in monotonically increasing _id order on the primary, and that is a case that WT optimizes for. However because of the way oplog entries are divided among the worker threads in SyncTail::fillWriterVectors on the secondary, the entries in the _id index may not be inserted in order on the secondary, and this is a less optimal path under WT. This can occur for example if multiple collections are being inserted into, one thread per collection, using a monotonic _id, such as ObjectId; in this case inserts into the _id index are in order in each WT index table on the primary, but not on the secondary. (This issue does not occur with multiple threads inserting into a single collection because the _id index inserts are not in order on either the primary or the secondary, so parity is maintained).

We can control whether inserts are ordered or disordered on the primary by either

  • inserting into either 1 collection with 8 threads (disordered), or 8 separate collections with 8 threads (ordered)
  • using either monotonic _ids (ordered) or random _ids (disordered)

Inserts are always disordered on the secondary, so by observing secondary performance relative to primary while varying ordering on the primary as above we can measure the relative impact of the disordered inserts on the secondary. Secondary processing rate relative to primary was measured by comparing number of entries inserted into oplog on secondary with number on primary at a particular point in time during the run:

                                        secondary/primary   relative
                                                            performance
8 collections, monotonic _ids		15273776/20179477 = 0.76           inserts ordered on primary, disordered on secondary
8 collections, random ids		17189861/20104144 = 0.86           inserts disordered on both primary and secondary
1 collection, monotonic _ids		17123525/20187676 = 0.85           inserts disordered on both primary and secondary

Note that performance of primary relative to secondary is reduced by either using random _ids, or by inserting into a single collection with multiple threads, both of which disorder the inserts into the _id index on the primary, to match the disordered inserts on the secondary.

Finally, we can achieve parity between the primary and the secondary by both

  • incorporating the potential improvement of parallelizing oplog inserts (both bullets below)
  • eliminating the issue described in this ticket by disordering the inserts on the primary by using a single collection (second bullet below)

                                            secondary/primary   relative
                                                                performance
    8 collections, monotonic _ids		18699167/20148811 = 0.93
    1 collection, monotonic _ids		20056367/20060385 = 0.99
    

It's unclear whether this issue can be addressed in the mongod layer, or whether an improvement in WT for the out-of-order insertion case is needed.



 Comments   
Comment by Daniel Gottlieb (Inactive) [ 20/Oct/17 ]

milkie marking as 3.7 Desired - conversation at nyc storage meeting was that we will likely be doing some work on this soon. give a shout if that's incorrect.

Comment by Martin Bligh [ 28/Aug/15 ]

Observation from today: Part of the "out of orderness" is oplog optimes on primary do not strictly match ordering of the _id

Another interesting experiment ... run primary flat out for 60s. Get about 200-210K/s on primary, secondary is only getting about 140K/s.
After the primary completes, insert rate on secondary goes up to 200-210K/s.
Will investigate further if this is really due to out of order, or maybe some form of contention on the oplog handoff.

Comment by Martin Bligh [ 28/Aug/15 ]

Tried turning off the oplog commits on the secondary (just force commenting out in code). Even without oplog, secondary only goes half the speed of the primary.
Despite the primary struggling against _logOp locking issues, if I cut those out of profile, we still get the below, which I believes just reinforces what Bruce was pointing out

Specifically __wt_search_insert seems to be all added overhead - Michael, is this just straight overhead of not doing an append-only insert?

PRIMARY

-   3.83%  mongod  mongod               [.] __wt_row_search                                                                                                        
   - __wt_row_search                                                                                                                                               
      - 88.06% __wt_btcur_insert                                                                                                                                   
         - __curfile_insert                                                                                                                                        
            + 63.69% mongo::WiredTigerRecordStore::insertRecord(mongo::OperationContext*, char const*, int, bool)                                                  
            + 36.31% mongo::WiredTigerIndexUnique::_insert(__wt_cursor*, mongo::BSONObj const&, mongo::RecordId const&, bool)                                      
      + 9.76% __split_multi_inmem                                                                                                                                  
      + 1.81% __curfile_insert                                                                                                                                     
-   3.35%  mongod  mongod               [.] __wt_row_modify                                                                                                        
   - __wt_row_modify                                                                                                                                               
      - 81.71% __wt_btcur_insert                                                                                                                                   
         - __curfile_insert                                                                                                                                        
            + 69.25% mongo::WiredTigerRecordStore::insertRecord(mongo::OperationContext*, char const*, int, bool)                                                  
            + 30.75% mongo::WiredTigerIndexUnique::_insert(__wt_cursor*, mongo::BSONObj const&, mongo::RecordId const&, bool)                                      
      + 8.12% __split_multi_inmem                                                                                                                                  
      + 6.94% __wt_btcur_range_truncate                                                                                                                            
      + 3.08% __curfile_insert

SECONDARY, no oplog

-   3.26%  mongod  mongod               [.] __wt_search_insert                                                                                                     
   - __wt_search_insert                                                                                                                                            
      - 98.71% __wt_row_search                                                                                                                                     
           __wt_btcur_insert                                                                                                                                       
         - __curfile_insert                                                                                                                                        
            + 85.65% mongo::WiredTigerIndexUnique::_insert(__wt_cursor*, mongo::BSONObj const&, mongo::RecordId const&, bool)                                      
            + 14.35% mongo::WiredTigerRecordStore::insertRecord(mongo::OperationContext*, char const*, int, bool)                                                  
      + 1.29% __wt_btcur_insert                                                                                                                                    
-   2.54%  mongod  mongod               [.] __wt_row_search                                                                                                        
   - __wt_row_search                                                                                                                                               
      - 97.02% __wt_btcur_insert                                                                                                                                   
         - __curfile_insert                                                                                                                                        
            + 55.14% mongo::WiredTigerRecordStore::insertRecord(mongo::OperationContext*, char const*, int, bool)                                                  
            + 44.84% mongo::WiredTigerIndexUnique::_insert(__wt_cursor*, mongo::BSONObj const&, mongo::RecordId const&, bool)                                      
      + 2.96% __curfile_insert                                                                                                                                     
-   2.36%  mongod  mongod               [.] __wt_row_modify                                                                                                        
   - __wt_row_modify                                                                                                                                               
      - 96.46% __wt_btcur_insert                                                                                                                                   
         - __curfile_insert                                                                                                                                        
            + 58.04% mongo::WiredTigerRecordStore::insertRecord(mongo::OperationContext*, char const*, int, bool)                                                  
            + 41.96% mongo::WiredTigerIndexUnique::_insert(__wt_cursor*, mongo::BSONObj const&, mongo::RecordId const&, bool)

Comment by Bruce Lucas (Inactive) [ 24/Jun/15 ]

Created WT-1979 to investigate whether this can be addressed in WT.

Generated at Thu Feb 08 03:49:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.