[SERVER-14737] Initial sync uses background index building
Created: 31/Jul/14  Updated: 20/Apr/22  Resolved: 13/Aug/14

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 2.6.3
Fix Version/s: 2.7.5

Type: Bug
Priority: Major - P3
Reporter: Andrew Ryder (Inactive)
Assignee: Mathias Stearn
Resolution: Done
Votes: 0
Labels: cap-ticket-needed
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-21100 InitialSync finishes before indexes f... Closed
related to SERVER-65818 [5.0] Initial sync uses hybrid index ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:
  1. Start with a 3-node replica set and populate a database with a couple of million records (the exact number doesn't matter).
  2. Create a custom index specifying background:true.
  3. Wait for the index build to complete.
  4. Kill one node and delete the contents of its dbpath.
  5. Restart the dead node and observe the initial sync: the node uses a background index build for the custom index added in step 2, yet still waits for that build to complete.
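
Step 2 of the reproduction could be scripted roughly as follows. This is a minimal sketch, not part of the original report: it assumes a PyMongo-style collection object whose `create_index` method accepts a `background` keyword, and it reuses the key pattern `{ server: 1, cpu: 1, ts: 1 }` that appears in the log excerpt of this ticket. The helper name `create_background_index` is hypothetical.

```python
# Key pattern copied from the log excerpt in this ticket (test.ts collection).
INDEX_KEYS = [("server", 1), ("cpu", 1), ("ts", 1)]


def create_background_index(collection):
    """Request a background build of the compound index (step 2).

    `collection` is assumed to behave like a PyMongo Collection, i.e. to
    expose create_index(keys, **options). The return value is whatever the
    driver returns, which for PyMongo is the index name.
    """
    return collection.create_index(INDEX_KEYS, background=True)
```

With a real driver, the function would be passed the `ts` collection of the `test` database on the primary; steps 4 and 5 (wiping the dbpath and restarting the node) still have to be done by hand.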

 Description   

When a node performs initial sync in 2.6.3, it builds each index using the method (foreground or background) specified by the original index build command, even though the node does not become a viable SECONDARY until the index builds are complete.

E.g.:

2014-07-31T13:22:29.077+1000 [rsSync] replSet initial sync clone all databases
2014-07-31T13:22:29.089+1000 [rsSync] replSet initial sync cloning db: test
...
2014-07-31T13:22:36.059+1000 [rsSync] build index on: test.ts properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "test.ts" }
2014-07-31T13:22:36.059+1000 [rsSync] 	 building index using bulk method
2014-07-31T13:22:42.870+1000 [rsSync] build index done.  scanned 2000000 total records. 6.81 secs
2014-07-31T13:22:43.030+1000 [rsSync] replSet initial sync cloning db: admin
2014-07-31T13:22:43.045+1000 [rsSync] replSet initial sync data copy, starting syncup
2014-07-31T13:22:43.045+1000 [rsSync] oplog sync 1 of 3
2014-07-31T13:22:43.382+1000 [rsSync] oplog sync 2 of 3
2014-07-31T13:22:43.382+1000 [rsSync] replSet initial sync building indexes
2014-07-31T13:22:43.382+1000 [rsSync] replSet initial sync cloning indexes for : test
2014-07-31T13:22:43.384+1000 [rsSync] build index on: test.ts properties: { v: 1, key: { server: 1.0, cpu: 1.0, ts: 1.0 }, name: "server_1_cpu_1_ts_1", ns: "test.ts", background: true }
2014-07-31T13:22:43.384+1000 [rsSync] 	 building index in background
2014-07-31T13:22:43.878+1000 [rsBackgroundSync] replSet syncing to: localhost:27118
2014-07-31T13:22:43.879+1000 [rsBackgroundSync] replset setting syncSourceFeedback to localhost:27118
2014-07-31T13:22:46.000+1000 [rsSync] 		Index Build(background): 390700/2000000	19%
2014-07-31T13:22:49.000+1000 [rsSync] 		Index Build(background): 808100/2000000	40%
...

I would expect the node to either build indexes using the foreground process or allow the node to enter SECONDARY status.
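
To check a mongod log for this behavior, something like the following sketch may help. It assumes the 2.6-style log line formats shown in the excerpt above ("build index on: ... properties: ..." followed by "building index in background" or "building index using bulk method", both on the `[rsSync]` thread); other versions may log differently. The function name `background_builds_by_rssync` is hypothetical.

```python
import re

# Patterns are based only on the log lines quoted in this ticket.
BUILD_START = re.compile(
    r'\[rsSync\] build index on: (\S+) properties: .*name: "([^"]+)"'
)
BUILD_MODE = re.compile(
    r'\[rsSync\]\s+building index (in background|using bulk method)'
)


def background_builds_by_rssync(log_lines):
    """Return (namespace, index name) pairs that rsSync built in the background."""
    found = []
    pending = None  # last "build index on" line whose build mode is not yet known
    for line in log_lines:
        start = BUILD_START.search(line)
        if start:
            pending = (start.group(1), start.group(2))
            continue
        mode = BUILD_MODE.search(line)
        if mode and pending:
            if mode.group(1) == "in background":
                found.append(pending)
            pending = None
    return found
```

Run over the excerpt above, this would report the `server_1_cpu_1_ts_1` index on `test.ts` and skip the `_id_` index, which was built with the bulk method.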



 Comments   
Comment by Githook User [ 13/Aug/14 ]

Author: Mathias Stearn (RedBeard0531) <mathias@10gen.com>

Message: SERVER-13951 Split index building in to UnitOfWork-sized stages

All index builds now go through the MultiIndexBuilder as its API was already
close to ideal. The following tickets have also been addressed by this commit:

SERVER-14710 Remove dropDups
SERVER-12309 Cloner build indexes in parallel
SERVER-14737 Initial sync uses bg index building
SERVER-9135 fast index build for initial sync
SERVER-2747 can't kill index in phase 2
SERVER-8917 check error code rather than assuming all errors are dups
SERVER-14820 compact enforces unique while claiming not to
SERVER-14746 IndexRebuilder should be foreground and fail fatally
Branch: master
https://github.com/mongodb/mongo/commit/00913e47de5aced5267e44e82ac9e976bbaac089

Comment by Eric Milkie [ 31/Jul/14 ]

The correct behavior for initial sync will be to build all indexes in the foreground and in a bulk operation that reduces multiple scans across the data.
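
As a purely conceptual illustration of "a bulk operation that reduces multiple scans across the data", the sketch below builds several indexes from a single pass over the documents and then sorts each key list once. It is not MongoDB's MultiIndexBuilder or its actual algorithm; the function name, the in-memory dictionaries, and the list-based "index" are all assumptions made for the example.

```python
def build_indexes_single_scan(documents, key_patterns):
    """Build every requested index from one pass over the documents.

    documents    -- iterable of dicts standing in for a collection
    key_patterns -- {index_name: [field, ...]} for each index to build
    Returns {index_name: sorted list of (key tuple, document position)}.
    Missing fields become None, so documents with missing fields may not
    sort cleanly in Python 3; the sketch assumes complete documents.
    """
    entries = {name: [] for name in key_patterns}
    for position, doc in enumerate(documents):  # the only scan of the data
        for name, fields in key_patterns.items():
            key = tuple(doc.get(field) for field in fields)
            entries[name].append((key, position))
    for keys in entries.values():
        keys.sort()  # bulk phase: sort once instead of inserting key by key
    return entries
```

Building N indexes this way reads the data once rather than N times, which is the saving the comment above refers to.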

Comment by Eric Milkie [ 31/Jul/14 ]

I believe it may be true that the cloner builds indexes with {background:true} as background indexes. The cloner is used in initial syncing and to implement the various clone/copy collection/DB commands.
I'm pretty sure the cloner waits for each index build to complete (foreground or background) before proceeding on to the next index, so there should be no overlapping.
However, in the case of initial sync, if an index build is started after the initial sync has begun, such an index build is applied using the oplog and thus background index builds may overlap with other operations, just as in normal replication.
