Priority: Blocker - P1
Resolution: Cannot Reproduce
Affects Version/s: 2.0.1
Fix Version/s: None
Environment: Linux 2.6.32-35-server, Ubuntu 10.04, MongoDB 2.0.1, replica set with 3 nodes, NUMA, 2x Xeon E5620, 24 GB RAM
What we want to do:
Upgrade all indexes in our replica set to the 2.0 index format (v:1).
How we do this:
We start with one secondary: shut it down, change its port, remove the replica set parameter, and start it with the repair command to rebuild all indexes.
After the repair, we restore the original configuration and reconnect the node to the replica set. We wait until the secondary is up to date before proceeding with the next one.
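The two start-up phases described above can be sketched as command lines. This is only an illustration: the helper names, the data path, the ports, and the set name are hypothetical, and it assumes the node is configured through command-line flags (`--dbpath`, `--port`, `--repair`, `--replSet`) rather than a config file.

```python
def repair_command(dbpath, temp_port):
    """Phase 1: standalone repair run on a temporary port, without --replSet."""
    return ["mongod", "--dbpath", dbpath, "--port", str(temp_port), "--repair"]


def rejoin_command(dbpath, port, repl_set):
    """Phase 2: normal start on the original port as a replica set member."""
    return ["mongod", "--dbpath", dbpath, "--port", str(port), "--replSet", repl_set]
```

For example, `" ".join(repair_command("/data/db", 37017))` yields the string to run for the repair phase. Using a different port during the repair keeps the other members and clients from reaching the node while it runs standalone.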
What is the problem:
The secondary is not able to catch up with the master. It has a single process running at 100% CPU usage while I/O is almost idle (CPU bound).
It slowly falls further and further behind, even though all hosts have the same hardware.
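To put a number on "falling behind", one option is to compare optimes between members. The sketch below assumes a list of member documents shaped like the `members` array returned by `replSetGetStatus` (fields `name`, `stateStr`, `optimeDate`); the function name, host names, and timestamps are made up for illustration and are not taken from the affected cluster.

```python
from datetime import datetime


def replication_lag_seconds(members):
    """Return {secondary name: seconds behind the primary's optimeDate}."""
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Illustrative input only (hypothetical hosts and times).
sample_members = [
    {"name": "host-a:27017", "stateStr": "PRIMARY",
     "optimeDate": datetime(2011, 11, 16, 6, 32, 56)},
    {"name": "host-b:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2011, 11, 16, 5, 32, 56)},
    {"name": "host-c:27017", "stateStr": "SECONDARY",
     "optimeDate": datetime(2011, 11, 16, 6, 32, 54)},
]
```

Sampling this value at intervals would show whether the lag is growing steadily, as described above, or only spiking.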
What we suspect:
We have a database in which some indexes are on the old version and some on the new. Once a secondary upgrades its indexes, all of them are on the latest version, and we suspect this blocks the replay/resync of the oplog from the master, which still has the mixed-version indexes.
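A quick way to see which indexes are on which version is to group the index documents by their `v` field. This is a minimal sketch, assuming index documents shaped like those in a database's `system.indexes` collection (`ns`, `name`, `v`); treating a missing `v` as the old format is an assumption, and the function name and sample documents are hypothetical.

```python
def group_indexes_by_version(index_docs):
    """Group (namespace, index name) pairs by index format version."""
    groups = {}
    for doc in index_docs:
        # Assumption: an index document without "v" is in the old format (0).
        version = doc.get("v", 0)
        groups.setdefault(version, []).append((doc["ns"], doc["name"]))
    return groups


# Illustrative input only (hypothetical namespace and index names).
sample_indexes = [
    {"ns": "app.users", "name": "_id_", "v": 0},
    {"ns": "app.users", "name": "email_1", "v": 1},
    {"ns": "app.events", "name": "_id_"},
]
```

Running this against each member would make it possible to compare the index versions on the repaired secondary with those on the master.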
We downgraded the indexes again with an older mongod binary (1.8.4). Once that was finished, we connected the secondary to the replica set again; it replayed the oplog without a problem and is now back in sync.
All hosts have the mongod binary version 2.0.1.
I've attached iostat and mongostat output. The host with the problem is mn01.
There is no error message in mongod.log, only a recurring message about the cursor:
Wed Nov 16 06:32:56 [rsSync] repl: old cursor isDead, will initiate a new one