[SERVER-19766] Initial sync index build is sequential and single-threaded Created: 04/Aug/15  Updated: 17/Sep/15  Resolved: 17/Sep/15

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Jon Rangel (Inactive) Assignee: Scott Hernandez (Inactive)
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File rs2.log    
Issue Links:
Related
related to SERVER-19775 Do not apply the first oplog entry du... Closed
related to SERVER-676 use multiple cores for index sort-phase Closed
Operating System: ALL
Participants:

 Description   

According to a comment in SERVER-9135, indexes (other than the _id index) should now be built in parallel during initial sync.

However, in testing with 3.0.5 I see that all indexes on a collection are built sequentially. Furthermore, the speed of each index build is bound by a single CPU core.

mongod log file from repro attached.



 Comments   
Comment by Scott Hernandez (Inactive) [ 17/Sep/15 ]

Jon, thanks for confirming (in a separate email) that the indexes are built in parallel per collection, as the system is designed. We have also improved the system so that it no longer applies the first/last operation in the oplog separately, which was the cause of what you initially saw; see SERVER-19775.

Comment by Roy Reznik [ 12/Sep/15 ]

I am experiencing the exact same issue with MongoDB 3.0.6.
The index build phase of initial sync is sequential and VERY slow (probably bound to one core, as mentioned above).

Comment by Scott Hernandez (Inactive) [ 05/Aug/15 ]

If you insert a document at the end of the repro (after creating the indexes), or create an index on another collection, then you will see the indexes built at the same time.
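
A minimal sketch of the first variation, for the mongo shell, assuming the POCDB.POCCOLL namespace and the fld6/fld0 indexes from the repro steps. The function name and the marker document are hypothetical; any insert issued after the index builds should serve the same purpose.

```javascript
// Sketch only: create the two secondary indexes, then insert one more
// document so that index creation is no longer the newest oplog entry.
function createIndexesThenInsert(db) {
    var coll = db.getSiblingDB("POCDB").getCollection("POCCOLL");
    coll.ensureIndex({fld6: 1});
    coll.ensureIndex({fld0: 1});
    // Hypothetical marker document, not part of the original repro.
    return coll.insert({marker: "after-index-build"});
}
```

In the shell on the primary this would be called as `createIndexesThenInsert(db)` before wiping and restarting the secondary.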

Comment by Jon Rangel (Inactive) [ 05/Aug/15 ]

Hi Scott,

The repro steps are:

  1. Start a vanilla 2 data node + 1 arbiter replica set
  2. Load some data using POCDriver:

    java -jar POCDriver.jar
    

  3. Create a couple of secondary indexes:

    db.getSiblingDB("POCDB").POCCOLL.ensureIndex({fld6:1})
    db.getSiblingDB("POCDB").POCCOLL.ensureIndex({fld0:1})
    

  4. Shut down the secondary
  5. Remove all contents of the secondary's dbpath
  6. Restart the secondary

Comment by Scott Hernandez (Inactive) [ 05/Aug/15 ]

This happens because the last op in the source oplog, which is also the first op applied during the initial sync, was the create index command, so it ran before the "bulk index" phase happened.
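
One way to check this on the sync source is to look at the newest oplog entry before starting the initial sync. The following is a rough mongo shell sketch; the helper name is hypothetical, and because the shape of an index-creation entry depends on the server version, it only returns the document for manual inspection.

```javascript
// Sketch only: return the newest entry in local.oplog.rs, or null if empty.
function lastOplogEntry(db) {
    var cursor = db.getSiblingDB("local").getCollection("oplog.rs")
        .find()
        .sort({$natural: -1})  // reverse insertion order: newest first
        .limit(1);
    return cursor.hasNext() ? cursor.next() : null;
}
```

Running `printjson(lastOplogEntry(db))` on the primary should show whether the index creation is still the most recent operation.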

If you can provide your reproduction script we can alter it slightly to verify this.

Comment by Jon Rangel (Inactive) [ 05/Aug/15 ]

Log file attached. It covers the period from startup of mongod with an empty dbpath to completion of initial sync.
