[SERVER-4174] Replicaset resync skips building of _id index in certain circumstances Created: 29/Oct/11  Updated: 15/Aug/12  Resolved: 09/Mar/12

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.0.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Y. Wayne Huang Assignee: Kristina Chodorow (Inactive)
Resolution: Cannot Reproduce Votes: 0
Labels: indexing, replicaset, replication
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

ubuntu 10.04 x86_64; 4 x replicaset shards, each replicaset with two participating members and one arbiter


Issue Links:
Related
related to SERVER-5040 Cloner can fail to create unique inde... Closed
Operating System: Linux
Participants:

 Description   

We are in the process of resyncing to address fragmentation and to upgrade all indexes to the newer v1 format. This process worked fine for 3 of our 4 shards. The 4th shard is the primary for the namespace and, because we have not yet fully tested MR output to a sharded collection, it also stores the output of all MR tasks. None of the indexes on this shard have been upgraded to v1. In 4 resync attempts (remove all data on the secondary), we have noticed that the { _id: 1 } index on some collections is not built during the resync process, and we hypothesize that there is some corruption or incompatibility in the index that appears to leave it functional for queries but somehow prevents it from being rebuilt during resync attempts. We believe that this impacts replay of the oplog, causing it to seemingly hang (optimeDate does not advance; perhaps it is working but extremely slowly). Shutting down the node shows the in-progress operation being interrupted:

Fri Oct 28 16:26:33 [rsSync] replSet syncTail: 11600 interrupted at shutdown, syncing: { ts: Timestamp 1319826732000|30, h: 1028901388370155163, op: "u", ns: "mydb.mycoll", o2: { _id: { ... } }, o: { _id: { ... }, value: { ...data... } } }

The timestamp matches the optimeDate. We have seen this affect both inserts and updates. It appears to affect only collections that are the target of MR output. Additionally, it may only affect collections whose MR results are re-reduced/re-finalized, but we are less confident about this assessment.
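A quick way to confirm whether the _id index survives a given resync attempt is to check the index list on the secondary from the mongo shell. A sketch, using the collection names from this report (adjust as needed):

SECONDARY> rs.slaveOk()
SECONDARY> db.mycoll.getIndexes()          // should include an { _id: 1 } index named "_id_"
SECONDARY> db.mycoll.stats().indexSizes    // no "_id_" key here means the index was never built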



 Comments   
Comment by Kristina Chodorow (Inactive) [ 09/Mar/12 ]

Suspected issue, as the _id index is created after the clone.

Comment by Kristina Chodorow (Inactive) [ 09/Mar/12 ]

I'm sorry it took me so long to get back to you. We're a bit overloaded and I lost track of things. I'll link this to the bug I think caused the problem and mark as Cannot Reproduce.

Comment by Y. Wayne Huang [ 09/Mar/12 ]

I'm sorry, but I no longer remember enough details about this issue to provide any further information. If you are going to mark a bug as 'debugging with submitter', I suggest that the follow-up not come months after the issue is reported.

Comment by Kristina Chodorow (Inactive) [ 09/Mar/12 ]

This might be the same as SERVER-5040. It would be very helpful to have the logs from startup to eventual _id index creation.

Comment by Kristina Chodorow (Inactive) [ 19/Jan/12 ]

Sorry for the delay!

Can you send the output of running:

> db.mycoll.getIndexes()
> db.mycoll2.getIndexes()
> db.system.namespaces.find()

Also, if you could try an initial sync with -vvvvv and paste the logs, it would be very helpful.
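For reference, each index should also appear in those outputs as its own entry; a sketch of narrowing the check down to the collection in question (assuming the usual <db>.<collection>.$<indexName> naming in system.namespaces, and the collection names from this report):

> db.system.namespaces.find({ name: /^mydb\.mycoll\.\$/ })   // one entry per index on mydb.mycoll; "mydb.mycoll.$_id_" should be among them
> db.system.indexes.find({ ns: "mydb.mycoll" })              // index definitions, including the index version ("v") field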

Comment by Eliot Horowitz (Inactive) [ 09/Nov/11 ]

Kristina has been out, but should continue working on this tomorrow.

Comment by Y. Wayne Huang [ 08/Nov/11 ]

any updates?

Comment by Y. Wayne Huang [ 01/Nov/11 ]

updates?

Comment by Y. Wayne Huang [ 31/Oct/11 ]

serverStatus:

{
    "host" : "mongo-s1-02:27017",
    "version" : "2.0.1",
    "process" : "mongod",
    "uptime" : 134268,
    "uptimeEstimate" : 133273,
    "localTime" : ISODate("2011-10-31T17:22:48.849Z"),
    "globalLock" : {
        "totalTime" : 134267874816,
        "lockTime" : 10835941188,
        "ratio" : 0.08070390033989529,
        "currentQueue" : { "total" : 0, "readers" : 0, "writers" : 0 },
        "activeClients" : { "total" : 0, "readers" : 0, "writers" : 0 }
    },
    "mem" : { "bits" : 64, "resident" : 4942, "virtual" : 140325, "supported" : true, "mapped" : 69872, "mappedWithJournal" : 139744 },
    "connections" : { "current" : 11, "available" : 15989 },
    "extra_info" : { "note" : "fields vary by platform", "heap_usage_bytes" : 359936, "page_faults" : 3165756 },
    "indexCounters" : {
        "btree" : { "accesses" : 149272, "hits" : 148953, "misses" : 319, "resets" : 0, "missRatio" : 0.0021370384264965967 }
    },
    "backgroundFlushing" : { "flushes" : 2237, "total_ms" : 1084727, "average_ms" : 484.9025480554314, "last_ms" : 43, "last_finished" : ISODate("2011-10-31T17:22:01.262Z") },
    "cursors" : { "totalOpen" : 0, "clientCursors_size" : 0, "timedOut" : 0 },
    "network" : { "bytesIn" : 27763509, "bytesOut" : 94335276, "numRequests" : 296096 },
    "repl" : { "setName" : "shard1", "ismaster" : false, "secondary" : true, "hosts" : [ "192.168.43.204:27017", "192.168.42.230:27017" ], "arbiters" : [ "192.168.44.208:27101" ], "primary" : "192.168.42.230:27017", "me" : "192.168.43.204:27017" },
    "opcountersRepl" : { "insert" : 203926, "query" : 0, "update" : 542635, "delete" : 602680, "getmore" : 0, "command" : 2292 },
    "opcounters" : { "insert" : 0, "query" : 3, "update" : 0, "delete" : 0, "getmore" : 0, "command" : 296099 },
    "asserts" : { "regular" : 0, "warning" : 0, "msg" : 0, "user" : 71, "rollovers" : 0 },
    "writeBacksQueued" : false,
    "dur" : {
        "commits" : 28,
        "journaledMB" : 0.073728,
        "writeToDataFilesMB" : 0.023417,
        "compression" : 0.9999864368159069,
        "commitsInWriteLock" : 0,
        "earlyCommits" : 0,
        "timeMs" : { "dt" : 3037, "prepLogBuffer" : 0, "writeToJournal" : 173, "writeToDataFiles" : 0, "remapPrivateView" : 0 }
    },
    "ok" : 1
}

Comment by Y. Wayne Huang [ 31/Oct/11 ]

Okay, after a bit more debugging it looks like the missing { _id: 1 } indexes do eventually get built, but only a few hours after the sync is complete and the recovering node has already become a secondary. Isn't this problematic, because replaying update operations from the oplog will result in a full collection scan without this index? (See the explain sketch at the end of this comment.) Here is a portion of the log during a full resync (remove all secondary data + restart) showing one of the affected collections:

Sat Oct 29 21:11:39 [rsSync] 413286 objects cloned so far from collection mydb_0.mycoll_0
Sat Oct 29 21:11:39 [rsSync] clone mydb_0.mycoll_0 413311
Sat Oct 29 21:12:39 [rsSync] clone mydb_0.mycoll_0 878591
Sat Oct 29 21:12:44 [rsSync] 879630 objects cloned so far from collection mydb_0.mycoll_0
Sat Oct 29 21:13:42 [rsSync] clone mydb_0.mycoll_0 1428351
Sat Oct 29 21:13:47 [rsSync] 1444653 objects cloned so far from collection mydb_0.mycoll_0
Sat Oct 29 21:14:43 [rsSync] clone mydb_0.mycoll_0 2042367
Sat Oct 29 21:14:53 [rsSync] 2107720 objects cloned so far from collection mydb_0.mycoll_0
Sat Oct 29 21:15:49 [rsSync] clone mydb_0.mycoll_0 2504447
Sat Oct 29 21:15:55 [rsSync] 2514501 objects cloned so far from collection mydb_0.mycoll_0
Sat Oct 29 21:16:49 [rsSync] clone mydb_0.mycoll_0 2836863
Sat Oct 29 21:16:56 [rsSync] 2861465 objects cloned so far from collection mydb_0.mycoll_0
Sat Oct 29 21:17:49 [rsSync] clone mydb_0.mycoll_0 3152127
Sat Oct 29 21:17:57 [rsSync] 3218462 objects cloned so far from collection mydb_0.mycoll_0
Sat Oct 29 21:18:51 [rsSync] clone mydb_0.mycoll_0 3582591
Sat Oct 29 21:18:58 [rsSync] 3655342 objects cloned so far from collection mydb_0.mycoll_0
Sat Oct 29 21:19:51 [rsSync] clone mydb_0.mycoll_0 4004223
Sat Oct 29 21:19:59 [rsSync] 4109113 objects cloned so far from collection mydb_0.mycoll_0
Sat Oct 29 21:20:51 [rsSync] clone mydb_0.mycoll_0 4671487
Sat Oct 29 21:21:01 [rsSync] 4717824 objects cloned so far from collection mydb_0.mycoll_0
Sat Oct 29 21:21:51 [rsSync] clone mydb_0.mycoll_0 5184511
Sat Oct 29 21:22:02 [rsSync] 5200678 objects cloned so far from collection mydb_0.mycoll_0
Sat Oct 29 21:22:51 [rsSync] clone mydb_0.mycoll_0 5612415
Sat Oct 29 21:23:03 [rsSync] 5686707 objects cloned so far from collection mydb_0.mycoll_0
Sat Oct 29 21:23:52 [rsSync] clone mydb_0.mycoll_0 6208511
Sat Oct 29 21:24:06 [rsSync] 6279289 objects cloned so far from collection mydb_0.mycoll_0
Sat Oct 29 22:37:09 [rsSync] build index mydb_0.mycoll_0 { _id.hid: 1.0 }
...
Sat Oct 29 22:48:24 [rsSync] replSet SECONDARY
...
Sun Oct 30 02:01:41 [rsSync] build index mydb_0.mycoll_0 { _id: 1 }

For what it's worth, our initial assessment about a corrupt index on the primary is probably incorrect. We tested the behavior after rebuilding indexes in the affected collections and the same behavior occurs (_id index is built a few hours later).
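As mentioned above, one way to gauge what the missing index costs while the node is in this state is to explain an _id lookup directly on the secondary. A sketch, run against the collection from the log above after the node has entered SECONDARY:

SECONDARY> rs.slaveOk()
SECONDARY> use mydb_0
SECONDARY> var someId = db.mycoll_0.findOne()._id        // grab any existing _id value
SECONDARY> db.mycoll_0.find({ _id: someId }).explain()   // "cursor" : "BasicCursor" = full collection scan

Once the { _id: 1 } index exists, the same explain should report "BtreeCursor _id_" instead; until then, each replayed oplog update against this collection presumably pays that scan.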

Comment by Eliot Horowitz (Inactive) [ 29/Oct/11 ]

Can you attach all logs from the secondary that is failing?

Comment by Scott Hernandez (Inactive) [ 29/Oct/11 ]

Can you post db.serverStatus() on the secondaries please?

Comment by Y. Wayne Huang [ 29/Oct/11 ]

On the two affected collections, here is .stats() from the primary:

PRIMARY> db.mycoll.stats()
{
    "ns" : "mydb.mycoll",
    "count" : 6682237,
    "size" : 10865763964,
    "avgObjSize" : 1626.0668342053716,
    "storageSize" : 16758695904,
    "numExtents" : 38,
    "nindexes" : 2,
    "lastExtentSize" : 2146426864,
    "paddingFactor" : 1.4199999999281745,
    "flags" : 1,
    "totalIndexSize" : 1078650752,
    "indexSizes" : { "_id.hid_1" : 389104576, "_id_" : 689546176 },
    "ok" : 1
}
PRIMARY> db.mycoll2.stats()
{
    "ns" : "mydb.mycoll2",
    "count" : 2411182,
    "size" : 22009995704,
    "avgObjSize" : 9128.301266349865,
    "storageSize" : 65847755120,
    "numExtents" : 60,
    "nindexes" : 5,
    "lastExtentSize" : 1991168256,
    "paddingFactor" : 1.2999999999867473,
    "flags" : 0,
    "totalIndexSize" : 785901648,
    "indexSizes" : { "_id_" : 234536736, "gid_1_mid_1" : 140439152, "id_1" : 122697232, "sid_1" : 107547104, "gid_1_mid_1_id_-1" : 180681424 },
    "ok" : 1
}
