[SERVER-17014] foreground index build blocks database reads and writes
Created: 22/Jan/15  Updated: 17/Jan/19  Resolved: 17/Jan/19

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 2.8.0-rc5
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Mark Callaghan
Assignee: Louis Williams
Resolution: Done
Votes: 6
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-37270 Remove foreground index build functio... Closed
Operating System: ALL
Steps To Reproduce:

1) start an index create that takes 10+ seconds for a collection in database "a"
2) wait a few seconds and then get thread stacks using PMP – http://poormansprofiler.org/

Sprint: Storage NYC 2019-01-28
Participants:
Case:
Story Points: 0

 Description   

Some discussion is at https://groups.google.com/forum/#!topic/mongodb-dev/_1IrogzovEQ. When I create an index with background:false, many (all?) operations in the db are blocked, even for engines like WiredTiger that don't require a per-db writer lock. The URL above shows thread stacks where background jobs (TTLMonitor, ClientCursorMonitor) get blocked on a per-db lock by a background:false index create. I assume bad things can happen when TTL enforcement is unable to run for a long time.

This creates other problems, as "show collections", db.$foo.getIndexes(), and queries against other collections in the same database are blocked for the duration of the index create.
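The blocking described above could also be observed from a client by timing a metadata call while the index build runs. The following is only a sketch under assumptions: it uses the third-party pymongo driver, a server at a hypothetical local URI, and a hypothetical collection "big" with a field "x" in database "a". It is not the PMP-based method used in this report, and on servers that include SERVER-37270 the background option no longer selects a foreground build.

```python
import threading
import time


def time_call(fn):
    """Return how many seconds a call to fn() takes."""
    start = time.monotonic()
    fn()
    return time.monotonic() - start


def reproduce(uri="mongodb://localhost:27017", wait_s=2.0):
    """Start a foreground index build, then time a listing in the same db.

    The URI, database "a", collection "big" and field "x" are assumptions
    made for illustration; the collection must be large enough that the
    build takes 10+ seconds.
    """
    from pymongo import MongoClient  # third-party driver, imported lazily

    db = MongoClient(uri)["a"]
    builder = threading.Thread(
        target=lambda: db["big"].create_index("x", background=False)
    )
    builder.start()
    time.sleep(wait_s)  # give the build time to take its lock
    # Roughly what "show collections" does in the shell.
    latency = time_call(db.list_collection_names)
    builder.join()
    return latency
```

If the listing is blocked by the build, reproduce() would be expected to return a latency close to the remaining build time rather than a few milliseconds.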

While background:true is the workaround, background index creation can take more time.



 Comments   
Comment by Louis Williams [ 17/Jan/19 ]

As of SERVER-37270, all indexes are built in the background. Index builds no longer block reads or writes for their entire duration. Instead there are very brief periods at completion where index builds block writes.

Comment by Charles Sarrazin (Inactive) [ 22/Mar/18 ]

I guess the only thing I see which could go wrong with using the oplog to permit writes during an ongoing index creation would be unique indexes. In that case, by the time the build catches up, the collection might already contain duplicate documents, preventing the index from being created.

The problem here is that we might have an acknowledged write sent to the client, appearing in the oplog, but which would normally fail because of the unique constraint. In this case, we would either need to drop the index, or purge the oplog entries which would no longer be valid (which would actually be pretty bad for consistency, as the effect would be more or less similar to a rollback).
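The unique-index concern can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not server code: find_unique_violations is a hypothetical helper, documents are plain dicts keyed by id, and the list of acknowledged inserts stands in for oplog entries written while the scan was running.

```python
def find_unique_violations(snapshot, acknowledged_inserts, field):
    """Build a unique index from a snapshot, then replay logged inserts.

    Returns (index, violations). Each violation is a tuple
    (key, existing_doc_id, conflicting_doc_id) for a write that was
    already acknowledged but breaks the unique constraint.
    """
    index = {}
    # Assumption: the scanned snapshot itself has no duplicate keys.
    for doc_id, doc in snapshot.items():
        index[doc[field]] = doc_id

    violations = []
    for doc_id, doc in acknowledged_inserts:
        key = doc[field]
        if key in index:
            violations.append((key, index[key], doc_id))
        else:
            index[key] = doc_id
    return index, violations


snapshot = {1: {"email": "a@example.com"}}
acknowledged = [
    (2, {"email": "b@example.com"}),
    (3, {"email": "a@example.com"}),  # duplicate of document 1
]
index, violations = find_unique_violations(snapshot, acknowledged, "email")
print(violations)
```

In this model the conflict only surfaces during replay, after the client has already been told the insert succeeded, which is the situation the comment describes.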

Comment by Eric Milkie [ 23/Jan/15 ]

We could improve foreground index builds in the following manner:
We could convert the DB X lock to Collection IX before doing the collection scan, and then convert back to DB X lock at the end of the build, similar to how it is already done for background index builds. There is no need to hold a DB X lock while scanning the collection; making this change will permit reads to the collection during the index build, but writes will still block.
TTL could then be improved to use trylock with a timeout for each collection where deletes need to happen.
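The try-lock-with-timeout idea for TTL can be sketched as follows. This is a simplified illustration, not the server's lock manager: a plain mutex per collection stands in for MongoDB's hierarchical intent locks, and ttl_pass, the collection names, and the timeout value are all hypothetical.

```python
import threading


def ttl_pass(collection_locks, delete_expired, timeout_s=0.05):
    """Run one TTL pass, skipping collections whose lock is busy.

    Returns the names of the collections that were skipped.
    """
    skipped = []
    for name, lock in collection_locks.items():
        # Try-lock with a timeout instead of blocking indefinitely.
        if not lock.acquire(timeout=timeout_s):
            skipped.append(name)
            continue
        try:
            delete_expired(name)
        finally:
            lock.release()
    return skipped


locks = {"a.events": threading.Lock(), "a.sessions": threading.Lock()}
cleaned = []

# Simulate an index build holding the lock on a.events.
locks["a.events"].acquire()
skipped = ttl_pass(locks, cleaned.append)
locks["a.events"].release()

print(skipped, cleaned)
```

With this shape, a long index build on one collection delays TTL deletes only for that collection; the skipped collections can be retried on the next pass.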

Longer term, we could improve foreground index builds even more by making use of the oplog to permit writes while scanning the collection, similar to how initial sync works. After the collection scan and btree build are complete, we catch up by applying all the index changes for the writes we see in the oplog.
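The scan-then-catch-up approach can be modeled in memory as a two-phase build. This is a minimal sketch, assuming a non-unique index and a simplified write log whose entries carry the full document (real oplog entries have a different shape); build_index_with_catch_up and the sample data are hypothetical.

```python
def build_index_with_catch_up(snapshot, writes_during_scan, field):
    """Build an index from a snapshot, then replay writes logged meanwhile.

    snapshot: {doc_id: doc} as seen by the collection scan.
    writes_during_scan: list of (op, doc_id, doc) with op "insert" or
    "delete"; assumed to carry the indexed field of the affected document.
    """
    index = {}

    # Phase 1: collection scan and index build; writes are not blocked.
    for doc_id, doc in snapshot.items():
        index.setdefault(doc[field], set()).add(doc_id)

    # Phase 2: catch up by applying the logged writes in order.
    for op, doc_id, doc in writes_during_scan:
        key = doc[field]
        if op == "insert":
            index.setdefault(key, set()).add(doc_id)
        elif op == "delete":
            ids = index.get(key, set())
            ids.discard(doc_id)
            if not ids:
                index.pop(key, None)
    return index


snapshot = {1: {"user": "ann"}, 2: {"user": "bob"}}
log = [("insert", 3, {"user": "cat"}), ("delete", 1, {"user": "ann"})]
index = build_index_with_catch_up(snapshot, log, "user")
print(index)
```

Only the final catch-up would need to exclude writers, and only for as long as it takes to drain the remaining log entries.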
