[SERVER-50665] Index creation stuck with "Index Build: draining writes received during build" Created: 01/Sep/20  Updated: 06/Sep/20  Resolved: 06/Sep/20

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 4.4.0
Fix Version/s: None

Type: Question Priority: Minor - P4
Reporter: Ivan Shcholokov Assignee: Dmitry Agranat
Resolution: Duplicate Votes: 0
Labels: Indexing, indexes
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

CentOS Linux release 8.2.2004, db version v4.4.0


Issue Links:
Duplicate
is duplicated by SERVER-48516 at startup, confirm replica set node ... Closed
Participants:

 Description   

Good day,

I have 4 rows in the test collection:

{{{ "_id" : ObjectId("5f4ce50e19b13337216dd477"), "test" : 1 }{ "_id" : ObjectId("5f4ce50e19b13337216dd478"), "test" : 2 }{ "_id" : ObjectId("5f4ce50e19b13337216dd479"), "test" : 3 }{ "_id" : ObjectId("5f4ce50e19b13337216dd47a"), "test" : 4 }}}

After running db.test.createIndex({test:1},{background:1}); to create an index, it just hangs. It was hanging for at least a few hours. Here is what I found in the db.currentOp() about this operation:
 

{{{"type" : "op","host" : "HOSTNAME:27017","desc" : "IndexBuildsCoordinatorMongod-13","active" : true,"currentOpTime" : "2020-08-31T12:11:13.159+00:00","opid" : 8721867,"secs_running" : NumberLong(20),"microsecs_running" : NumberLong(20888590),"op" : "command","ns" : "test.test","command" : {"createIndexes" : "test","indexes" : [{"v" : 2,"key" : {"test" : 1},"name" : "test_1","background" : 1}],"lsid" : {"id" : UUID("07b43083-8ab9-4bcb-8768-919a3f27655f")},"$clusterTime" : {"clusterTime" : Timestamp(1598875647, 409),"signature" : {"hash" : BinData(0,"+/YcdPyQriT8RL1LtFUhxe2BtCE="),"keyId" : NumberLong("6861636045532823556")}},"$db" : "test"},"msg" : "Index Build: draining writes received during build","numYields" : 0,"locks" : {},"waitingForLock" : false,"lockStats" : {"ReplicationStateTransition" : {"acquireCount" : {"w" : NumberLong(6)}},"Global" : {"acquireCount" : {"r" : NumberLong(1),"w" : NumberLong(4)}},"Database" : {"acquireCount" : {"r" : NumberLong(1),"w" : NumberLong(4)}},"Collection" : {"acquireCount" : {"r" : NumberLong(1),"w" : NumberLong(3),"W" : NumberLong(1)}},"Mutex" : {"acquireCount" : {"r" : NumberLong(4)}}},"waitingForFlowControl" : false,"flowControlStats" : {"acquireCount" : NumberLong(3),"timeAcquiringMicros" : NumberLong(1)}}}}

This Index Build: draining writes received during build makes no sense to me since there was no read/writes to the test collection during index creation.

Also index creation hangs only in non-empty collection. Index creates successfully in empty collection.

I'm running MongoDB server as a single-node replica set.

I haven't found any mentions of this message in the google, so I assume it's not a bug, but an issue on my side.

What might be an issue in this case? I'm out of ideas.

PS: apologies if I selected wrong Project for this issue/question.



 Comments   
Comment by Dmitry Agranat [ 06/Sep/20 ]

Glad to hear this has resolved the reported issue bobgubko13@gmail.com, I will go ahead and close this case.

Comment by Ivan Shcholokov [ 03/Sep/20 ]

You are totally right, adding keyfile helped and createIndex works now. Dima, thanks for your time and help!

Comment by Dmitry Agranat [ 03/Sep/20 ]

Thanks for the clarification bobgubko13@gmail.com, it is always helpful to understand where you see reported issue. This issue might be related to SERVER-48516. The node can't communicate with itself so it will hang trying to commit the index build.

Comment by Ivan Shcholokov [ 02/Sep/20 ]

It's weird, because DB logs were not cleared or altered after initial setup. What log file should contain info about this operation — mongod.log or metric data?

Here is what I see when I run db.currentOp({msg:/drain/i}) (sorry for a long paste):

 

{
 "inprog" : [
 {
 "type" : "op",
 "host" : "HOST:27017",
 "desc" : "IndexBuildsCoordinatorMongod-15",
 "active" : true,
 "currentOpTime" : "2020-09-02T13:52:48.262+00:00",
 "opid" : 9246841,
 "secs_running" : NumberLong(164853),
 "microsecs_running" : NumberLong("164853680304"),
 "op" : "command",
 "ns" : "test.ttt",
 "command" : {
 "createIndexes" : "ttt",
 "indexes" : [
 {
 "v" : 2,
 "key" : {
 "aa" : 1
 },
 "name" : "aa",
 "background" : true
 }
 ],
 "$db" : "test"
 },
 "msg" : "Index Build: draining writes received during build",
 "numYields" : 0,
 "locks" : {
 
},
 "waitingForLock" : false,
 "lockStats" : {
 "ReplicationStateTransition" : {
 "acquireCount" : {
 "w" : NumberLong(6)
 }
 },
 "Global" : {
 "acquireCount" : {
 "r" : NumberLong(1),
 "w" : NumberLong(4)
 }
 },
 "Database" : {
 "acquireCount" : {
 "r" : NumberLong(1),
 "w" : NumberLong(4)
 }
 },
 "Collection" : {
 "acquireCount" : {
 "r" : NumberLong(1),
 "w" : NumberLong(3),
 "W" : NumberLong(1)
 }
 },
 "Mutex" : {
 "acquireCount" : {
 "r" : NumberLong(4)
 }
 }
 },
 "waitingForFlowControl" : false,
 "flowControlStats" : {
 "acquireCount" : NumberLong(3),
 "timeAcquiringMicros" : NumberLong(2)
 }
 },
 {
 "type" : "op",
 "host" : "HOST:27017",
 "desc" : "IndexBuildsCoordinatorMongod-16",
 "active" : true,
 "currentOpTime" : "2020-09-02T13:52:48.263+00:00",
 "opid" : 15762563,
 "secs_running" : NumberLong(127),
 "microsecs_running" : NumberLong(127132107),
 "op" : "command",
 "ns" : "test.ttt",
 "command" : {
 "createIndexes" : "ttt",
 "indexes" : [
 {
 "v" : 2,
 "key" : {
 "q" : 1
 },
 "name" : "q",
 "background" : true
 }
 ],
 "$db" : "test"
 },
 "msg" : "Index Build: draining writes received during build",
 "numYields" : 0,
 "locks" : {
 
},
 "waitingForLock" : false,
 "lockStats" : {
 "ReplicationStateTransition" : {
 "acquireCount" : {
 "w" : NumberLong(6)
 }
 },
 "Global" : {
 "acquireCount" : {
 "r" : NumberLong(1),
 "w" : NumberLong(4)
 }
 },
 "Database" : {
 "acquireCount" : {
 "r" : NumberLong(1),
 "w" : NumberLong(4)
 }
 },
 "Collection" : {
 "acquireCount" : {
 "r" : NumberLong(1),
 "w" : NumberLong(3),
 "W" : NumberLong(1)
 }
 },
 "Mutex" : {
 "acquireCount" : {
 "r" : NumberLong(4)
 }
 }
 },
 "waitingForFlowControl" : false,
 "flowControlStats" : {
 "acquireCount" : NumberLong(3),
 "timeAcquiringMicros" : NumberLong(4)
 }
 },
 {
 "type" : "op",
 "host" : "HOST:27017",
 "desc" : "IndexBuildsCoordinatorMongod-14",
 "active" : true,
 "currentOpTime" : "2020-09-02T13:52:48.263+00:00",
 "opid" : 9229866,
 "secs_running" : NumberLong(165241),
 "microsecs_running" : NumberLong("165241328641"),
 "op" : "command",
 "ns" : "LIVE_DB_NAME.test",
 "command" : {
 "createIndexes" : "test",
 "indexes" : [
 {
 "v" : 2,
 "key" : {
 "test2" : 1
 },
 "name" : "test2",
 "background" : true
 }
 ],
 "$db" : "LIVE_DB_NAME"
 },
 "msg" : "Index Build: draining writes received during build",
 "numYields" : 0,
 "locks" : {
 
},
 "waitingForLock" : false,
 "lockStats" : {
 "ReplicationStateTransition" : {
 "acquireCount" : {
 "w" : NumberLong(6)
 }
 },
 "Global" : {
 "acquireCount" : {
 "r" : NumberLong(1),
 "w" : NumberLong(4)
 }
 },
 "Database" : {
 "acquireCount" : {
 "r" : NumberLong(1),
 "w" : NumberLong(4)
 }
 },
 "Collection" : {
 "acquireCount" : {
 "r" : NumberLong(1),
 "w" : NumberLong(3),
 "W" : NumberLong(1)
 }
 },
 "Mutex" : {
 "acquireCount" : {
 "r" : NumberLong(4)
 }
 }
 },
 "waitingForFlowControl" : false,
 "flowControlStats" : {
 "acquireCount" : NumberLong(3),
 "timeAcquiringMicros" : NumberLong(2)
 }
 }
 ],
 "ok" : 1,
 "$clusterTime" : {
 "clusterTime" : Timestamp(1599054768, 2),
 "signature" : {
 "hash" : BinData(0,"lixpXYpCBX3K64aoFEfNX2pc/v0="),
 "keyId" : NumberLong("6861636045532823556")
 }
 },
 "operationTime" : Timestamp(1599054768, 2)
}

 

PS: Apologies for a not formatted paste, I cannot format it correctly — visual editor is acting weird.

Comment by Dmitry Agranat [ 02/Sep/20 ]

Hi bobgubko13@gmail.com, I was not able to find any operation with the Index Build: draining writes received during build messages in the uploaded logs. Could you please check again and reattach the logs containing this message?

Comment by Ivan Shcholokov [ 01/Sep/20 ]

Hi Dmitry,

Thanks for replying. Just uploaded two tars. mongod.log is pretty huge since it contains oplog from a live DB, I hope it's not a problem.

Comment by Dmitry Agranat [ 01/Sep/20 ]

Hi bobgubko13@gmail.com,

Would you please archive (tar or zip) the mongod.log files and the $dbpath/diagnostic.data directory (the contents are described here) covering the time of the reported issue and upload them to this support uploader?

Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time.

Thanks,
Dima

Generated at Thu Feb 08 05:23:15 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.