[SERVER-17794] Indexing progress status Created: 30/Mar/15  Updated: 14/Apr/16  Resolved: 01/Apr/15

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: None
Fix Version/s: None

Type: Question Priority: Minor - P4
Reporter: vadyalex Assignee: Ramon Fernandez Marina
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

I am facing such problem.

I do nightly aggregation jobs that are creating new data and push it to production replica set.
New data should completely replace currently available data.

To achieve that I push new data to new collection, say stage_my_collection, run indexing in background (to not disturb whole replicaset cluster since it is fully operational) as soon as indexing is done I do rename-drop to make collection available for our services:

db.stage_my_collection.renameCollection( "my_collection", true )

The services only read data from target collection but they read it from primary as well as slaves.

The problem is I do not have good mechanism to track indexing progress. db.collection.getIndexes() returns indexes as available even though indexing process is in progress.
As soon as primary node completes background indexing slaves starts it asynchronously.

To track progress I made a simple program that connects to each node in replicaset and checks current operation. But the information was misleading.
Then I tried to check the log output by issuing *

{ getLog: "global" }

* on admin database.

That provides comprehensive information. So far so good.

But there is a problem.
The log entry contains current progress but does not have any information which index is building:

"2015-03-28T20:06:25.032+0000 [conn1858] \t\tIndex Build(background): 1981700/2071558\t95%",
"2015-03-28T20:06:28.022+0000 [conn1858] \t\tIndex Build(background): 1988600/2071558\t95%",
"2015-03-28T20:06:31.068+0000 [conn1858] \t\tIndex Build(background): 1994800/2071558\t96%",
"2015-03-28T20:06:34.045+0000 [conn1858] \t\tIndex Build(background): 2001200/2071558\t96%",
"2015-03-28T20:06:37.049+0000 [conn1858] \t\tIndex Build(background): 2006600/2071558\t96%",
"2015-03-28T20:06:40.016+0000 [conn1858] \t\tIndex Build(background): 2013100/2071558\t97%",
"2015-03-28T20:06:43.021+0000 [conn1858] \t\tIndex Build(background): 2021000/2071558\t97%",
"2015-03-28T20:06:55.004+0000 [conn1858] \t\tIndex Build(background): 2053400/2071558\t99%",
"2015-03-28T20:06:58.006+0000 [conn1858] \t\tIndex Build(background): 2067300/2071558\t99%",
"2015-03-28T20:06:58.823+0000 [conn1858] build index done. scanned 2071558 total records. 844.623 secs",
"2015-03-28T20:06:58.826+0000 [conn1858] command my_database.$cmd command: createIndexes { createIndexes: \"stage_my_collection\", indexes: [ { name: \"index_1_loc_2dsphere\", ns: \"my_database.stage_my_collection\", background: true, key:

Unknown macro: { index}

} ] } keyUpdates:0 numYields:7133 locks(micros) r:35 w:1643440483 reslen:113 844627ms",

But I do have fail signal if any:

"2015-03-28T19:26:23.301+0000 [conn1726] index build failed. spec: { v: 1, key:

Unknown macro: { index}

, name: \"index_1_loc_2dsphere\", ns: \"my_database.stage_my_collection\", background: true, 2dsphereIndexVersion: 2 } error: 11601 operation was interrupted",
"2015-03-28T19:26:23.303+0000 [conn1726] command my_database.$cmd command: createIndexes { createIndexes: \"stage_my_collection\", indexes: [ { name: \"index_1_loc_2dsphere\", ns: \"my_database.stage_my_collection\", background: true, key:

Unknown macro: { index}

} ] } keyUpdates:0 numYields:16 locks(micros) r:50 w:3467543 reslen:140 1765ms",

Is there a better way for me to track indexing progress for each index?
Is it possible to add index uid (databaseName.collectionName.indexName) to the log output in future versions?



 Comments   
Comment by Ramon Fernandez Marina [ 01/Apr/15 ]

For the progress one can use db.currentOp(), even when there are multiple indexes being built in the background. I created a collection with 20M documents of this shape:

> db.foo.findOne()
{
        "_id" : ObjectId("551974fae7f8cb41736eb1dc"),
        "s" : "yQJolF7PvfKJnGcbrupx",
        "n" : 20
}

I then started two index builds in the background, and this is what I see:

> db.currentOp()
{
        "inprog" : [
                {
                        "desc" : "conn20",
                        "threadId" : "0x4333d40",
                        "connectionId" : 20,
                        "opid" : 40020109,
                        "active" : true,
                        "secs_running" : 12,
                        "microsecs_running" : NumberLong(12635820),
                        "op" : "query",
                        "ns" : "test.$cmd",
                        "query" : {
                                "createIndexes" : "foo",
                                "indexes" : [
                                        {
                                                "key" : {
                                                        "s" : 1,
                                                        "x" : 1
                                                },
                                                "name" : "s_1_x_1",
                                                "background" : true
                                        }
                                ]
                        },
                        "client" : "127.0.0.1:33612",
                        "msg" : "Index Build (background) Index Build (background): 1571509/20000000 7%",
                        "progress" : {
                                "done" : 1571509,
                                "total" : 20000000
                        },
                        "numYields" : 12278,
                        "locks" : {
                                "Global" : "w",
                                "Database" : "w",
                                "Collection" : "w"
                        },
                        "waitingForLock" : false,
                        "lockStats" : {
                                "Global" : {
                                        "acquireCount" : {
                                                "w" : NumberLong(12279)
                                        }
                                },
                                "Database" : {
                                        "acquireCount" : {
                                                "w" : NumberLong(12279),
                                                "W" : NumberLong(1)
                                        },
                                        "acquireWaitCount" : {
                                                "w" : NumberLong(1)
                                        },
                                        "timeAcquiringMicros" : {
                                                "w" : NumberLong(44926)
                                        }
                                },
                                "Collection" : {
                                        "acquireCount" : {
                                                "w" : NumberLong(12279)
                                        }
                                }
                        }
                },
                {
                        "desc" : "conn19",
                        "threadId" : "0x4334560",
                        "connectionId" : 19,
                        "opid" : 40020111,
                        "active" : true,
                        "secs_running" : 11,
                        "microsecs_running" : NumberLong(11594372),
                        "op" : "query",
                        "ns" : "test.$cmd",
                        "query" : {
                                "createIndexes" : "foo",
                                "indexes" : [
                                        {
                                                "key" : {
                                                        "s" : 1
                                                },
                                                "name" : "s_1",
                                                "background" : true
                                        }
                                ]
                        },
                        "client" : "127.0.0.1:33571",
                        "msg" : "Index Build (background) Index Build (background): 1438508/20000000 7%",
                        "progress" : {
                                "done" : 1438508,
                                "total" : 20000000
                        },
                        "numYields" : 11238,
                        "locks" : {
                                "Global" : "w",
                                "Database" : "w",
                                "Collection" : "w"
                        },
                        "waitingForLock" : false,
                        "lockStats" : {
                                "Global" : {
                                        "acquireCount" : {
                                                "w" : NumberLong(11239)
                                        }
                                },
                                "Database" : {
                                        "acquireCount" : {
                                                "w" : NumberLong(11239),
                                                "W" : NumberLong(1)
                                        },
                                        "acquireWaitCount" : {
                                                "W" : NumberLong(1)
                                        },
                                        "timeAcquiringMicros" : {
                                                "W" : NumberLong(438)
                                        }
                                },
                                "Collection" : {
                                        "acquireCount" : {
                                                "w" : NumberLong(11239)
                                        }
                                }
                        }
                }
        ]
}

Index build startup times and failures will be recorded in the logs, so while it may be inconvenient to look in two places I think all the information is there. There's currently no mechanism that centralizes this information for a replica set or a sharded cluster – one must check on each and every node.

Please note that system.indexes is an internal system collection, and any changes to it by users may trigger unexpected behavior so I highly recommend you don't use it directly.

Since this report is not about a bug in the MongoDB server I'm going to resolve this ticket. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag, where your question will reach a larger audience. A question like this involving more discussion would be best posted on the mongodb-user group.

Regards,
Ramón.

Comment by vadyalex [ 31/Mar/15 ]

I have been digging through documentation thinking a bit about my root problem (synchronizing indexing completion across whole replica set) I found an interesting observation: one can create index by do insert of the index description document into collection.system.indexes collection; if insert operation has WriteConsern(3), corresponding to the number of members in the replicaset, program thread will be blocked until all three instances acknowledge indexing process completion; This is perfect. But the indexing process goes in foreground which blocks all other operation on the whole replica set during indexing.
Adding background: true to the index description will block program thread until indexing progress is complete and acknowledged only by primary node, ignoring WriteConsern(3) directive.

Is it by design? Or is it kind of bug?

Comment by vadyalex [ 30/Mar/15 ]

Ramon Fernandez,

If I understand correctly since MongoDB 2.6 it is possible to build several indexes in the background then how to differentiate which progress corresponds to which progress status?

When I meant misleading I probably mean that it is not suitable to track whether indexing process is complete or failed.
Plus there are two different operation types on the primary as createIndexes and replicaset insert into system.indexes on slaves.

Comment by Ramon Fernandez Marina [ 30/Mar/15 ]

vadyalex, the index name is logged at the beginning and end of the operation; for example:

2015-03-30T12:15:52.861-0400 I INDEX    [conn3] build index on: test.foo properties: { v: 1, key: { s: 1.0, n: 1.0 }, name: "s_1_n_1", ns: "test.foo", background: true }
2015-03-30T12:15:55.003-0400 I -        [conn3]   Index Build (background): 263600/20000000 1%
2015-03-30T12:15:58.003-0400 I -        [conn3]   Index Build (background): 617200/20000000 3%
2015-03-30T12:16:01.003-0400 I -        [conn3]   Index Build (background): 988200/20000000 4%
...
2015-03-30T12:18:52.003-0400 I -        [conn3]   Index Build (background): 19432900/20000000 97%
2015-03-30T12:18:55.003-0400 I -        [conn3]   Index Build (background): 19764400/20000000 98%
2015-03-30T12:18:57.130-0400 I INDEX    [conn3] build index done.  scanned 20000000 total records. 184 secs
2015-03-30T12:18:57.130-0400 I COMMAND  [conn3] command test.$cmd command: createIndexes { createIndexes: "foo", indexes: [ { key: { s: 1.0, n: 1.0 }, name: "s_1_n_1", background: true } ] } keyUpdates:0 writeConflicts:0 numYields:156328 reslen:113 locks:{} 184276ms

Also db.currentOp() does show progress information:

> db.currentOp()
{ 
        "inprog" : [
                {
                        "desc" : "conn3",
                        "threadId" : "0x4333520",
                        "connectionId" : 3,
                        "opid" : 40020067,
                        "active" : true,
                        "secs_running" : 12,
                        "microsecs_running" : NumberLong(12147884),
                        "op" : "query",
                        "ns" : "test.$cmd",
                        "query" : {
                                "createIndexes" : "foo",
                                "indexes" : [
                                        {
                                                "key" : {
                                                        "s" : 1,
                                                        "n" : 1
                                                },
                                                "name" : "s_1_n_1",
                                                "background" : true
                                        }
                                ]
                        },
                        "client" : "127.0.0.1:58032",
                        "msg" : "Index Build (background) Index Build (background): 1489382/20000000 7%",
                        "progress" : {
                                "done" : 1489382,
                                "total" : 20000000
                        },
...

When you say above that "the information was misleading", what issues were you encountering?

Generated at Thu Feb 08 03:45:35 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.