[SERVER-16133] Deadlock detected with oplog and multikey index inserts Created: 13/Nov/14  Updated: 10/Dec/14  Resolved: 14/Nov/14

Status: Closed
Project: Core Server
Component/s: Concurrency, Replication
Affects Version/s: 2.8.0-rc0
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: David Percy Assignee: Kaloian Manassiev
Resolution: Done Votes: 0
Labels: 28qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: HTML File deadlock-log    
Operating System: ALL
Participants:

 Description   

We ran a workload of 30 client threads doing inserts into a collection with a multikey index. Each inserted document had a 10-element array as the multikey-indexed field.

We're seeing many "Deadlock found..." and "WriteConflictException while doing oplog application" log messages. Neither of these messages shows up when we don't use replication or when we remove the multikey index.

Log excerpt (full log attached):

 m31001| 2014-11-13T15:49:44.467-0500 I -        [repl writer worker 12] Deadlock found: Locker 82 waits for resource {17591420565866845252: Collection, 2198927570733355656, db0.coll0} held by [94, 93, 92, 91, 90, 89, 88, 87, 85, 86, 84, 83, ]
 m31001| Locker 83 waits for resource {17591420565866845252: Collection, 2198927570733355656, db0.coll0} held by [94, 93, 92, 91, 90, 89, 88, 87, 85, 86, 84, 82, ]
 m31001| Locker 84 waits for resource {17591420565866845252: Collection, 2198927570733355656, db0.coll0} held by [94, 93, 92, 91, 90, 89, 88, 87, 85, 86, 83, 82, ]
 m31001| Locker 85 waits for resource {17591420565866845252: Collection, 2198927570733355656, db0.coll0} held by [94, 93, 92, 91, 90, 89, 88, 87, 86, 84, 83, 82, ]
 m31001| Locker 86 waits for resource {17591420565866845252: Collection, 2198927570733355656, db0.coll0} held by [94, 93, 92, 91, 90, 89, 88, 87, 85, 84, 83, 82, ]
 m31001| Locker 87 waits for resource {17591420565866845252: Collection, 2198927570733355656, db0.coll0} held by [94, 93, 92, 91, 90, 89, 88, 85, 86, 84, 83, 82, ]
 m31001| Locker 88 waits for resource {17591420565866845252: Collection, 2198927570733355656, db0.coll0} held by [94, 93, 92, 91, 90, 89, 87, 85, 86, 84, 83, 82, ]
 m31001| Locker 89 waits for resource {17591420565866845252: Collection, 2198927570733355656, db0.coll0} held by [94, 93, 92, 91, 90, 88, 87, 85, 86, 84, 83, 82, ]
 m31001| Locker 90 waits for resource {17591420565866845252: Collection, 2198927570733355656, db0.coll0} held by [94, 93, 92, 91, 89, 88, 87, 85, 86, 84, 83, 82, ]
 m31001| Locker 91 waits for resource {17591420565866845252: Collection, 2198927570733355656, db0.coll0} held by [94, 93, 92, 90, 89, 88, 87, 85, 86, 84, 83, 82, ]
 m31001| Locker 92 waits for resource {17591420565866845252: Collection, 2198927570733355656, db0.coll0} held by [94, 93, 91, 90, 89, 88, 87, 85, 86, 84, 83, 82, ]
 m31001| Locker 93 waits for resource {17591420565866845252: Collection, 2198927570733355656, db0.coll0} held by [94, 92, 91, 90, 89, 88, 87, 85, 86, 84, 83, 82, ]
 m31001| Locker 94 waits for resource {17591420565866845252: Collection, 2198927570733355656, db0.coll0} held by [93, 92, 91, 90, 89, 88, 87, 85, 86, 84, 83, 82, ]
 m31001| 2014-11-13T15:49:44.468-0500 I REPLSETS [repl writer worker 12] WriteConflictException while doing oplog application on: db0.coll0, retrying.



 Comments   
Comment by Githook User [ 24/Nov/14 ]

Author:

{u'username': u'kaloianm', u'name': u'Kaloian Manassiev', u'email': u'kaloian.manassiev@mongodb.com'}

Message: Revert "SERVER-16133 Use RESOURCE_METADATA X lock to set the multiKey index value"

This reverts commit 58e49b2726807b7e7c2f0309607ffa454befbff6.
Branch: master
https://github.com/mongodb/mongo/commit/0ad070c47135f205c638b47e217d46d7da9fd329

Comment by Githook User [ 24/Nov/14 ]

Author:

{u'username': u'kaloianm', u'name': u'Kaloian Manassiev', u'email': u'kaloian.manassiev@mongodb.com'}

Message: SERVER-16133 Use RESOURCE_METADATA X lock to set the multiKey index value

This avoids the need to do deadlock detection and as a nice side effect
would prevent the scary deadlock messages from showing in the log.

Also, is a matter of correctness, we should not be setting the _isMultiKey
field to TRUE until the transaction commits.
Branch: master
https://github.com/mongodb/mongo/commit/58e49b2726807b7e7c2f0309607ffa454befbff6

Comment by Kaloian Manassiev [ 14/Nov/14 ]

This is an informational message only. It indicates that a deadlock was discovered and handled successfully by the server.

What happens in this particular case is that there are multiple concurrent inserts for the first time in a multi-key index, which need to set this property. The property may only be set under an exclusive collection lock, so all these inserts try to upgrade from IX to X. This is a deadlock since nobody can succeed, so they all yield the lock and one tries again with the X-lock directly.

The message is from the retry and is expected occasionally much like WriteConflict is.

We are working on improving the message though so it is more descriptive.

Generated at Thu Feb 08 03:40:04 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.