Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-15049

Database is read-overloaded after upgrading to 2.6.4

    • Type: Icon: Bug Bug
    • Resolution: Duplicate
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: 2.6.4
    • Component/s: Performance, Querying
    • Labels:
      None
    • Environment:
      Ubuntu 12.04 kernel 3.13.0-32.57~precise1-generic 3.13.11.4
      Intel SSD S3500
    • Linux

      Hello, everyone!
      We have a standard replica set configuration: primary + 2 secondaries.
      After upgrading from 2.6.3 to 2.6.4 strange things happen: new primary gets read-overloaded immediatly after rs.stepdown() from the previous one.
      We have solved the problem temporarily using another rs.stepdown(), but after several hours she's returned. It happened several times, until we downgraded mongodb to 2.6.3.

      Additional information:
      Here's one of the problem queries:

      { 
      	count: "tokens", 
      	query: { _id: -2648398055870562304, $or: [ { t_apn: { $exists: true, $ne: [] } }, { t_gcm: { $exists: true, $ne: [] } } ] }, 
      	limit: 1 
      }
      

      currentOp:

      {
             "opid" : 538972133,
             "active" : true,
             "secs_running" : 12,
             "microsecs_running" : NumberLong(12730587),
             "op" : "query",
             "ns" : "d2_apn.tokens",
             "query" : {
                     "count" : "tokens",
                     "query" : {
                             "_id" : NumberLong("-5091952697439944704"),
                             "$or" : [
                                     {
                                             "t_apn" : {
                                                     "$exists" : true,
                                                     "$ne" : [ ]
                                             }
                                     },
                                     {
                                             "t_gcm" : {
                                                     "$exists" : true,
                                                     "$ne" : [ ]
                                             }
                                     }
                             ]
                     },
                     "limit" : 1
             },
             "planSummary" : "IXSCAN { t_apn: 1 }, IXSCAN { t_gcm: 1 }, IXSCAN { _id: 1 }",
             "client" : "10.3.1.143:63290",
             "desc" : "conn1538217",
             "threadId" : "0x7f5e626b5700",
             "connectionId" : 1538217,
             "locks" : {
                     "^" : "r",
                     "^d2_apn" : "R"
             },
             "waitingForLock" : false,
             "numYields" : 537,
             "lockStats" : {
                     "timeLockedMicros" : {
                             "r" : NumberLong(13578592),
                             "w" : NumberLong(0)
                     },
                     "timeAcquiringMicros" : {
                             "r" : NumberLong(989936),
                             "w" : NumberLong(0)
                     }
             }
      }
      

      dbtop:

      NS             | totalReads  | Writes | Queries | GetMores | Inserts | Updates | Removes
      d2_apn.tokens  | 128 9305.4% | 128 9305.4% | 0 0% | 0 0% | 0 0% | 0 0% | 0 0% | 0 0%
      

      explain:

      db.tokens.find({ _id: -2648398055870562304, $or: [ { t_apn: { $exists: true, $ne: [] } }, { t_gcm: { $exists: true, $ne: [] } } ] }).limit(1).explain()
      {
      	"cursor" : "BtreeCursor _id_",
      	"isMultiKey" : false,
      	"n" : 0,
      	"nscannedObjects" : 1,
      	"nscanned" : 1,
      	"nscannedObjectsAllPlans" : 2,
      	"nscannedAllPlans" : 17,
      	"scanAndOrder" : false,
       	"indexOnly" : false,
       	"nYields" : 0,
       	"nChunkSkips" : 0,
       	"millis" : 1,
       	"indexBounds" : {
       	"_id" : [
       		[
       			-2648398055870562300,
       			-2648398055870562300
       		]
       	]
       	},
       	"server" : "drive2-db1-1:27017",
       	"filterSet" : false
      }
      

      getIndexes:

      [
      	{
       		"v" : 1,
       		"key" : {
       			"_id" : 1
       		},
       		"ns" : "d2_apn.tokens",
       		"name" : "_id_"
       	},
       	{
       		"v" : 1,
       		"key" : {
       			"t_apn" : 1
       		},
       		"ns" : "d2_apn.tokens",
       		"name" : "t_apn_1"
       	},
       	{
       		"v" : 1,
       		"key" : {
       		"t_gcm" : 1
       		},
       		"ns" : "d2_apn.tokens",
       		"name" : "t_gcm_1"
       	}
      ]
      

      stats:

      {
             "ns" : "d2_apn.tokens",
             "count" : 94291,
             "size" : 11569160,
             "avgObjSize" : 122,
             "storageSize" : 22507520,
             "numExtents" : 7,
             "nindexes" : 3,
             "lastExtentSize" : 11325440,
             "paddingFactor" : 1.027000000000001,
             "systemFlags" : 1,
             "userFlags" : 0,
             "totalIndexSize" : 19499760,
             "indexSizes" : {
                     "_id_" : 3875424,
                     "t_apn_1" : 6197408,
                     "t_gcm_1" : 9426928
             },
             "ok" : 1
      }
      

      I've attached some charts from replica set: DB wait times, cpu utilization and load, and disk utilization.
      Important: We've updated mongoDB to version 2.6.4 in ~13-00 on august 18. First rs.stepdown was in ~16-00 and then strange spikes came back in ~08-00, august 20 and in ~10-00 august 21, each with manual rs.stepdown to solved it.

      If you want some additional info and charts – just ask.
      Thanks.

        1. drive2-db1_17-22.09.14.png
          683 kB
          vergil
        2. mongodb-drive2-db1-1.log.final.gz
          126.78 MB
          vergil
        3. mongodb-drive2-db1-2.log.final.gz
          97.66 MB
          vergil
        4. mongodb-drive2-db1-3.log.final.gz
          77.91 MB
          vergil

            Assignee:
            ramon.fernandez@mongodb.com Ramon Fernandez Marina
            Reporter:
            vergil@drive.net Alexander Ivanes
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

              Created:
              Updated:
              Resolved:

                Error rendering 'slack.nextup.jira:slack-integration-plus'. Please contact your Jira administrators.