[SERVER-38121] multikey index ops in a transaction can cause secondaries to hang Created: 13/Nov/18 Updated: 28/Nov/18 Resolved: 19/Nov/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 4.1.5 |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Randolph Tan | Assignee: | Siyuan Zhou |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Description |
|
If a replication batch contains an operation that triggers setIndexIsMultikey, that operation can block indefinitely trying to acquire the database X lock if a prepared transaction is holding a lock on the database. |
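The conflict described above can be sketched with the standard multi-granularity lock compatibility rules (IS, IX, S, X). This is a minimal illustration, not the server's lock manager: `COMPATIBLE_WITH`, `can_grant`, and the assumption that the prepared transaction holds the database lock in IX mode are all hypothetical names and choices made for the example.

```python
from typing import Iterable

# Illustrative only: standard multi-granularity lock compatibility.
# For each requested mode, the set of already-held modes it can coexist with.
COMPATIBLE_WITH = {
    "IS": {"IS", "IX", "S"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "X": set(),  # exclusive: conflicts with every other holder
}


def can_grant(requested: str, held_modes: Iterable[str]) -> bool:
    """Return True if `requested` is compatible with every mode already held."""
    return all(held in COMPATIBLE_WITH[requested] for held in held_modes)


# Assumption for this sketch: the prepared transaction holds the database lock in IX mode.
held_by_prepared_txn = ["IX"]

# A database X request must wait for as long as the prepared transaction holds its lock.
print(can_grant("X", held_by_prepared_txn))

# A database IX request would not conflict with the prepared transaction.
print(can_grant("IX", held_by_prepared_txn))
```

Under these rules an X request waits behind any existing holder, while an IX request can coexist with another IX holder. This is why the lock mode taken for the multikey write matters here.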
| Comments |
| Comment by Judah Schvimer [ 28/Nov/18 ] |
|
In that case, multikey writes probably do require the DB X lock. The "finer grained locking for DDL ops" project can address that if it wants, but |
| Comment by Geert Bosch [ 27/Nov/18 ] |
|
We rely on the DB X lock to ensure that collection catalog data is not changed concurrently with readers accessing it. If the data you're changing is otherwise protected against concurrent access, you shouldn't need it. |
| Comment by Judah Schvimer [ 20/Nov/18 ] |
|
siyuan.zhou, I agree a database IX lock would be sufficient. This write is to the catalog record for the collection, so I believe a collection X lock is required; geert.bosch may be able to answer better. |
| Comment by Siyuan Zhou [ 20/Nov/18 ] |
|
judah.schvimer, I believe the code Randolph is referring to is this line. However, I don't see why an X lock on the database is necessary. Can we acquire the database lock in IX mode instead? Moreover, it seems the lock mode of setMultikey on the primary is no different from that of a normal insert, so I assume it's IX on both database and collection. Can we make the secondary the same as the primary since we are already here? Nevertheless, |
| Comment by Siyuan Zhou [ 19/Nov/18 ] |
|
I believe this will be solved by |
| Comment by Tess Avitabile (Inactive) [ 19/Nov/18 ] |
|
siyuan.zhou, we hope that this will go away if we yield prepared transactions' locks on secondaries. Can you please confirm (and test) this in your implementation? |
| Comment by Randolph Tan [ 13/Nov/18 ] |
|
Sample oplog entry seen in practice:
|