  1. Core Server
  2. SERVER-5286

DBClientBase::findN: transport error for query: { mapreduce.shardedfinish: {....} }


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Works as Designed
    • Affects Version/s: 2.0.3
    • Fix Version/s: None
    • Component/s: MapReduce, Sharding
    • Labels:
    • Environment:
      ubuntu 10.04 x86_64
    • Operating System:
      Linux

      Description

      We are running an unauthenticated sharded setup with 4 shards, each a replica set of 2 members and an arbiter. All nodes are running 2.0.3. Periodically, mapReduce jobs fail with:

      Array
      (
      [assertion] => DBClientBase::findN: transport error: mongo-s1-01:27011 query: {
        mapreduce.shardedfinish: {
          mapreduce: "PageView",
          map: CodeWScope( function() { ... }, {}),
          reduce: CodeWScope( function(k, vals) { ... }, {}),
          query: { date: { $gt: 1330944522 }, content_type: { $in: [ "text/html; charset=utf-8", "text/html" ] } },
          out: "mr.PageView.1331545722.6649.700620"
        },
        shardedOutputCollection: "tmp.mrs.PageView_1331545723_8",
        shards: {
          shard2/mongo-s2-02:27022,mongo-s2-01:27021: { result: "tmp.mrs.PageView_1331545723_8", timeMillis: 7056, counts: { input: 146527, emit: 35517, reduce: 4562, output: 1391 }, ok: 1.0 },
          shard3/mongo-s3-01:27017,mongo-s3-02:27017: { result: "tmp.mrs.PageView_1331545723_8", timeMillis: 2127, counts: { input: 37213, emit: 34834, reduce: 4460, output: 1351 }, ok: 1.0 },
          shard4/mongo-s4-02:27017,mongo-s4-03:27017,mongo-s4-01:27017: { result: "tmp.mrs.PageView_1331545723_8", timeMillis: 3031, counts: { input: 58947, emit: 51909, reduce: 5721, output: 2096 }, ok: 1.0 }
        },
        shardCounts: {
          shard2/mongo-s2-02:27022,mongo-s2-01:27021: { input: 146527, emit: 35517, reduce: 4562, output: 1391 },
          shard3/mongo-s3-01:27017,mongo-s3-02:27017: { input: 37213, emit: 34834, reduce: 4460, output: 1351 },
          shard4/mongo-s4-02:27017,mongo-s4-03:27017,mongo-s4-01:27017: { input: 58947, emit: 51909, reduce: 5721, output: 2096 }
        },
        counts: { emit: 122260, input: 242687, output: 4838, reduce: 14743 }
      }
      [assertionCode] => 10276
      [errmsg] => db assertion failure
      [ok] => 0
      )
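
One way to read the error payload is to check whether the per-shard numbers under shardCounts add up to the aggregate counts field. The sketch below does only that arithmetic, with the values copied from the message above. The short keys (shard2, shard3, shard4) and the helper name sum_counts are illustrative and not part of the server output. If the totals match, that is consistent with each listed shard having finished its part (each reports ok: 1.0) and with the failure arising when the mapreduce.shardedfinish query was sent to mongo-s1-01:27011. It does not by itself identify the cause.

```python
# Per-shard counts copied from the shardCounts field of the error message.
# The short shard names are abbreviations chosen for this sketch.
shard_counts = {
    "shard2": {"input": 146527, "emit": 35517, "reduce": 4562, "output": 1391},
    "shard3": {"input": 37213, "emit": 34834, "reduce": 4460, "output": 1351},
    "shard4": {"input": 58947, "emit": 51909, "reduce": 5721, "output": 2096},
}

# Aggregate counts copied from the top-level counts field.
reported_totals = {"emit": 122260, "input": 242687, "output": 4838, "reduce": 14743}


def sum_counts(per_shard):
    """Add up each counter (input, emit, reduce, output) across all shards."""
    totals = {}
    for counts in per_shard.values():
        for key, value in counts.items():
            totals[key] = totals.get(key, 0) + value
    return totals


computed_totals = sum_counts(shard_counts)
print(computed_totals == reported_totals)  # prints True
```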

      This happens periodically. Running flushRouterConfig ahead of the MR job does not resolve the issue, and neither does bouncing mongos. In back-to-back runs, about 4 out of every 5 attempts fail, despite no changes on our end.
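
Since some back-to-back runs succeed, a client-side retry is one possible stopgap. The following is a minimal sketch under stated assumptions, not a fix taken from this ticket: run_with_retry, make_flaky_job, and the shape of the result dictionary (ok, assertionCode, errmsg, mirroring the array above) are hypothetical, and no real driver call is shown. The ticket was closed as "Works as Designed" and does not state a root cause, so retrying only masks the symptom.

```python
import time

TRANSPORT_ERROR_CODE = 10276  # assertionCode taken from the report


def run_with_retry(run_job, max_attempts=5, delay_seconds=0.0):
    """Call run_job() until it reports ok, or until attempts run out.

    run_job is a hypothetical callable that returns a dict shaped like the
    array in the report: {"ok": 0 or 1, "assertionCode": ..., "errmsg": ...}.
    Returns (attempt_number, result) on success.
    """
    last = None
    for attempt in range(1, max_attempts + 1):
        last = run_job()
        if last.get("ok"):
            return attempt, last
        if last.get("assertionCode") != TRANSPORT_ERROR_CODE:
            break  # a different failure: do not hide it behind retries
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    raise RuntimeError(f"mapReduce failed: {last}")


def make_flaky_job(failures):
    """Build a stand-in job that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def job():
        state["calls"] += 1
        if state["calls"] <= failures:
            return {"ok": 0, "assertionCode": 10276, "errmsg": "db assertion failure"}
        return {"ok": 1}

    return job


attempts, result = run_with_retry(make_flaky_job(4))
print(attempts, result)  # prints 5 {'ok': 1}
```

The stand-in job fails four times before succeeding, matching the roughly 4-in-5 failure rate described above. In a real client, run_job would wrap whatever call issues the mapReduce command.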

              People

              Assignee:
              renctan Randolph Tan
              Reporter:
              wayne530 Y. Wayne Huang
              Participants:
              Votes:
              2
              Watchers:
              6
