[SERVER-16922] Slow index build times on secondaries with Wired Tiger Created: 17/Jan/15 Updated: 03/Dec/21 Resolved: 22/Jan/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, WiredTiger |
| Affects Version/s: | None |
| Fix Version/s: | 3.0.0-rc6 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | John Page | Assignee: | Atul Kachru |
| Resolution: | Done | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||
| Issue Links: |
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Participants: | |||||
| Description |
|
Testing migration to WiredTiger from MMAPv1 in 2.6 by adding a WT replica.

Servers: no load on the source (primary) server.
Data copying phase: creation of the _id index (_id values are standard ObjectIds).

At start: 2015-01-16T18:23:17.006+0000 I - [rsSync] Index Build: 123300/522771027 0%
Currently: 2015-01-17T19:09:58.004+0000 I - [rsSync] Index Build: 150625100/522771027 28%

Not only has it taken over 24 hours to reach 28%, but each 1% is observably taking about 4 minutes longer than the previous one. At 396 minutes for the final 1%, the predicted total is 396/2 * 100 = 19,800 minutes, roughly 330 hours, to build one index. By comparison, I tested 2.6 on the same hardware and this step took 7 hours 36 minutes.

Is this the case for all WT index creations? Is build time polynomial (and greater than n log n, I think) in the number of records? Does this affect only monotonically increasing keys or all indexes? If I had 3 or 4 secondary indexes on this collection, would they be built sequentially or in parallel? |
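A minimal mongo-shell sketch of the extrapolation above, assuming the time per 1% of the build keeps growing linearly (the 396-minute figure is the reporter's estimate for the final 1%, not a measured value):

// Back-of-the-envelope projection, assuming time per 1% grows linearly
// and the final 1% would take ~396 minutes.
var minutesForFinalPercent = 396;
var projectedMinutes = (minutesForFinalPercent / 2) * 100;  // average step * 100 steps = 19,800 minutes
var projectedHours = projectedMinutes / 60;                 // roughly 330 hours
print("Projected _id index build time: ~" + projectedHours + " hours");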
| Comments |
| Comment by Mark Callaghan [ 23/Jan/15 ] | |||||||||||||||||||||||||||||||
|
I get the problem using these files to load the database and then creating the indexes via the following:

db.purchases_index.ensureIndex( {"price":1, "customerid":1}, {"background":false});
db.purchases_index.ensureIndex( {"price":1, "customerid":1}, {"background":true});

WiredTiger was used via:

And mongo.conf had: |
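Aside: a minimal mongo-shell sketch (not from this comment) for watching such a build while it runs; the msg field reported by db.currentOp() carries the same "Index Build: x/y n%" counter that appears in the rsSync log lines quoted in the description:

// Print any in-progress index builds and how far along they are.
db.currentOp().inprog.forEach(function (op) {
    if (op.msg && /Index Build/.test(op.msg)) {
        printjson({ opid: op.opid, msg: op.msg, secs_running: op.secs_running });
    }
});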
| Comment by Mark Callaghan [ 23/Jan/15 ] | |||||||||||||||||||||||||||||||
|
With rc4 I get good performance for background:false index builds and lousy performance for background:true. With rc5 I get lousy performance for background:false, and it runs for so long that I have yet to try background:true. The call stack above looks familiar. I didn't need a secondary to reproduce it; I just ran with a 4G block cache and a 100G database. The indexes should be ~2G each when done. |
| Comment by Mark Benvenuto [ 22/Jan/15 ] | |||||||||||||||||||||||||||||||
|
Fixed with: https://github.com/mongodb/mongo/commit/876f5f475bf9242c098e6de641aa7ee0982803c2

CAP: Do we need a perf test around index builds on secondaries where the secondary uses WiredTiger? |
| Comment by Mark Benvenuto [ 22/Jan/15 ] | |||||||||||||||||||||||||||||||
|
Thanks to john.page for the excellent repro. Following his instructions, I was able to reproduce the issue on an EC2 C2.2XL instance. Fundamentally, we believe the issue was how WT handled splits. The good news is that I can no longer repro this with the latest WT import from yesterday (https://github.com/mongodb/mongo/commit/876f5f475bf9242c098e6de641aa7ee0982803c2). I believe it is the result of a fix by keith.bostic in bt_split.c. In both cases, pre- and post-fix, the maximum tree depth was 3.

Details:

perf record -F 99 --call-graph dwarf -o perf6.data -p `pgrep mongod` -- sleep 30
| |||||||||||||||||||||||||||||||
| Comment by John Page [ 21/Jan/15 ] | |||||||||||||||||||||||||||||||
|
Insert generator | |||||||||||||||||||||||||||||||
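The attached generator itself is not reproduced in this export. Purely as an illustration, a hypothetical mongo-shell loader for purchase-like documents matching the {price, customerid} index tested above (field values and batch size are made up):

// Hypothetical loader sketch - not the attached generator.
var totalDocs = 1000000;  // scale towards the 185M/540M document sets mentioned below
var batch = [];
for (var i = 0; i < totalDocs; i++) {
    batch.push({
        customerid: Math.floor(Math.random() * 1000000),
        price: Math.round(Math.random() * 100000) / 100
    });
    if (batch.length === 1000) {
        db.purchases_index.insert(batch);  // the shell's insert() also accepts an array of documents
        batch = [];
    }
}
if (batch.length > 0) { db.purchases_index.insert(batch); }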
| Comment by John Page [ 21/Jan/15 ] | |||||||||||||||||||||||||||||||
|
I tested with 540M and 185M documents, which is not a large set. Any decent |
| Comment by John Page [ 19/Jan/15 ] | |||||||||||||||||||||||||||||||
|
I'm seeing nice, fast index build times when testing general index building, so I am still working on reproducing the original issue. |
| Comment by John Page [ 17/Jan/15 ] | |||||||||||||||||||||||||||||||
|
I have halted the instances but may have the logs if the root volume is EBS, this is |
| Comment by Scott Hernandez (Inactive) [ 17/Jan/15 ] | |||||||||||||||||||||||||||||||
|
Please include the version and logs for the server (which should include the startup stanza). | |||||||||||||||||||||||||||||||
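Aside: besides the startup stanza in the log, the version and storage engine of a running server can be read from a mongo shell, e.g.:

print("version: " + db.version());           // server version string
printjson(db.serverBuildInfo());             // build details (git hash, allocator, etc.)
printjson(db.serverStatus().storageEngine);  // storage engine section (present on 3.0+)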
| Comment by John Page [ 17/Jan/15 ] | |||||||||||||||||||||||||||||||
|
My bad, was supposed to be a bug |