Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-24114

getMore on oplog does not adhere to maxTimeMS

    • Type: Icon: Bug Bug
    • Resolution: Won't Fix
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: Querying, Replication
    • Labels:
      None
    • Replication
    • ALL
    • 0

      In a patch build, a getMore that the fetcher executed on the primary was given a maxTimeMS of 5 seconds. After 10 seconds the command failed with a user assertion that led to an ExceededTimeLimit error. maxTimeMS failures on a getMore should lead to an empty batch and no error.

      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:15.600+0000 d20010| 2016-05-05T22:17:15.501+0000 I COMMAND  [conn10] command local.oplog.rs command: find { find: "oplog.rs", filter: { ts: { $gte: Timestamp 1462486622000|1 } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, term: 1 } planSummary: COLLSCAN cursorid:27336043991 keysExamined:0 docsExamined:4 numYields:0 nreturned:4 reslen:708 locks:{ Global: { acquireCount: { r: 2 } }, MMAPV1Journal: { acquireCount: { r: 1 } }, Database: { acquireCount: { r: 1 } }, oplog: { acquireCount: { R: 1 } } } protocol:op_command 0ms
      ...
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.219+0000 d20010| 2016-05-05T22:17:26.219+0000 D -        [conn10] User Assertion: 50:operation exceeded time limit
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.221+0000 d20010| 2016-05-05T22:17:26.219+0000 I COMMAND  [conn10] command local.oplog.rs command: getMore { getMore: 27336043991, collection: "oplog.rs", maxTimeMS: 5000, term: 1, lastKnownCommittedOpTime: { ts: Timestamp 1462486635000|2, t: 1 } } planSummary: COLLSCAN cursorid:27336043991 keysExamined:0 docsExamined:0 numYields:0 nreturned:0 reslen:292 locks:{ Global: { acquireCount: { r: 4 } }, MMAPV1Journal: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 2 } }, oplog: { acquireCount: { R: 2 } } } protocol:op_command 10002ms
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.222+0000 d20011| 2016-05-05T22:17:26.219+0000 D ASIO     [NetworkInterfaceASIO-BGSync-0] Request 47 timed out, adjusted timeout after getting connection from pool was 10000ms, op was id: 3, states: [ UNINITIALIZED, IN_PROGRESS ], start_time: 2016-05-05T22:17:16.217+0000, request: RemoteCommand 47 -- target:ip-10-168-148-139:20010 db:local expDate:2016-05-05T22:17:26.217+0000 cmd:{ getMore: 27336043991, collection: "oplog.rs", maxTimeMS: 5000, term: 1, lastKnownCommittedOpTime: { ts: Timestamp 1462486635000|2, t: 1 } }
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.223+0000 d20011| 2016-05-05T22:17:26.220+0000 D ASIO     [NetworkInterfaceASIO-BGSync-0] Operation timing out; original request was: RemoteCommand 47 -- target:ip-10-168-148-139:20010 db:local expDate:2016-05-05T22:17:26.217+0000 cmd:{ getMore: 27336043991, collection: "oplog.rs", maxTimeMS: 5000, term: 1, lastKnownCommittedOpTime: { ts: Timestamp 1462486635000|2, t: 1 } }
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.224+0000 d20011| 2016-05-05T22:17:26.220+0000 D ASIO     [NetworkInterfaceASIO-BGSync-0] Failed to execute command: RemoteCommand 47 -- target:ip-10-168-148-139:20010 db:local expDate:2016-05-05T22:17:26.217+0000 cmd:{ getMore: 27336043991, collection: "oplog.rs", maxTimeMS: 5000, term: 1, lastKnownCommittedOpTime: { ts: Timestamp 1462486635000|2, t: 1 } } reason: ExceededTimeLimit: Operation timed out, request was RemoteCommand 47 -- target:ip-10-168-148-139:20010 db:local expDate:2016-05-05T22:17:26.217+0000 cmd:{ getMore: 27336043991, collection: "oplog.rs", maxTimeMS: 5000, term: 1, lastKnownCommittedOpTime: { ts: Timestamp 1462486635000|2, t: 1 } }
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.226+0000 d20011| 2016-05-05T22:17:26.220+0000 D ASIO     [NetworkInterfaceASIO-BGSync-0] Request 47 finished with response: ExceededTimeLimit: Operation timed out, request was RemoteCommand 47 -- target:ip-10-168-148-139:20010 db:local expDate:2016-05-05T22:17:26.217+0000 cmd:{ getMore: 27336043991, collection: "oplog.rs", maxTimeMS: 5000, term: 1, lastKnownCommittedOpTime: { ts: Timestamp 1462486635000|2, t: 1 } }
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.227+0000 d20011| 2016-05-05T22:17:26.220+0000 D EXECUTOR [NetworkInterfaceASIO-BGSync-0] Received remote response: ExceededTimeLimit: Operation timed out, request was RemoteCommand 47 -- target:ip-10-168-148-139:20010 db:local expDate:2016-05-05T22:17:26.217+0000 cmd:{ getMore: 27336043991, collection: "oplog.rs", maxTimeMS: 5000, term: 1, lastKnownCommittedOpTime: { ts: Timestamp 1462486635000|2, t: 1 } }
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.228+0000 d20011| 2016-05-05T22:17:26.220+0000 D REPL     [NetworkInterfaceASIO-SyncSourceFeedback-0] Reporter sending slave oplog progress to upstream updater ip-10-168-148-139:20010: { replSetUpdatePosition: 1, optimes: [ { durableOpTime: { ts: Timestamp 1462486622000|1, t: -1 }, appliedOpTime: { ts: Timestamp 1462486622000|1, t: -1 }, memberId: 0, cfgver: 1 }, { durableOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, appliedOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, memberId: 1, cfgver: 1 }, { durableOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, appliedOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, memberId: 2, cfgver: 1 } ] }
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.229+0000 d20011| 2016-05-05T22:17:26.220+0000 D EXECUTOR [NetworkInterfaceASIO-SyncSourceFeedback-0] Scheduling remote command request: RemoteCommand 63 -- target:ip-10-168-148-139:20010 db:admin cmd:{ replSetUpdatePosition: 1, optimes: [ { durableOpTime: { ts: Timestamp 1462486622000|1, t: -1 }, appliedOpTime: { ts: Timestamp 1462486622000|1, t: -1 }, memberId: 0, cfgver: 1 }, { durableOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, appliedOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, memberId: 1, cfgver: 1 }, { durableOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, appliedOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, memberId: 2, cfgver: 1 } ] }
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.230+0000 d20011| 2016-05-05T22:17:26.220+0000 D ASIO     [NetworkInterfaceASIO-SyncSourceFeedback-0] startCommand: RemoteCommand 63 -- target:ip-10-168-148-139:20010 db:admin cmd:{ replSetUpdatePosition: 1, optimes: [ { durableOpTime: { ts: Timestamp 1462486622000|1, t: -1 }, appliedOpTime: { ts: Timestamp 1462486622000|1, t: -1 }, memberId: 0, cfgver: 1 }, { durableOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, appliedOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, memberId: 1, cfgver: 1 }, { durableOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, appliedOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, memberId: 2, cfgver: 1 } ] }
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.231+0000 d20011| 2016-05-05T22:17:26.220+0000 D ASIO     [NetworkInterfaceASIO-SyncSourceFeedback-0] Initiating asynchronous command: RemoteCommand 63 -- target:ip-10-168-148-139:20010 db:admin cmd:{ replSetUpdatePosition: 1, optimes: [ { durableOpTime: { ts: Timestamp 1462486622000|1, t: -1 }, appliedOpTime: { ts: Timestamp 1462486622000|1, t: -1 }, memberId: 0, cfgver: 1 }, { durableOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, appliedOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, memberId: 1, cfgver: 1 }, { durableOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, appliedOpTime: { ts: Timestamp 1462486635000|2, t: 1 }, memberId: 2, cfgver: 1 } ] }
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.232+0000 d20011| 2016-05-05T22:17:26.220+0000 D ASIO     [NetworkInterfaceASIO-SyncSourceFeedback-0] Starting asynchronous command 63 on host ip-10-168-148-139:20010
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.233+0000 d20011| 2016-05-05T22:17:26.220+0000 D EXECUTOR [rsBackgroundSync-0] Executing a task on behalf of pool rsBackgroundSync
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.233+0000 d20011| 2016-05-05T22:17:26.220+0000 D REPL     [rsBackgroundSync-0] Error returned from oplog query: ExceededTimeLimit: Operation timed out, request was RemoteCommand 47 -- target:ip-10-168-148-139:20010 db:local expDate:2016-05-05T22:17:26.217+0000 cmd:{ getMore: 27336043991, collection: "oplog.rs", maxTimeMS: 5000, term: 1, lastKnownCommittedOpTime: { ts: Timestamp 1462486635000|2, t: 1 } }
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.234+0000 d20011| 2016-05-05T22:17:26.220+0000 D EXECUTOR [rsBackgroundSync-0] waiting for work; I am one of 1 thread(s); the minimum number of threads is 1
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.234+0000 d20011| 2016-05-05T22:17:26.221+0000 D ASIO     [NetworkInterfaceASIO-SyncSourceFeedback-0] Request 63 finished with response: { ok: 1.0 }
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.234+0000 d20011| 2016-05-05T22:17:26.221+0000 D EXECUTOR [NetworkInterfaceASIO-SyncSourceFeedback-0] Received remote response: RemoteResponse --  cmd:{ ok: 1.0 }
      ...
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.238+0000 d20010| 2016-05-05T22:17:26.220+0000 D NETWORK  [conn10] Socket recv() conn closed? 10.168.148.139:60818
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.238+0000 d20010| 2016-05-05T22:17:26.220+0000 D NETWORK  [conn10] SocketException: remote: 10.168.148.139:60818 error: 9001 socket exception [CLOSED] server [10.168.148.139:60818]
      [js_test:maxSyncSourceLagSecs] 2016-05-05T22:17:26.238+0000 d20010| 2016-05-05T22:17:26.220+0000 I NETWORK  [conn10] end connection 10.168.148.139:60818 (5 connections now open)
      

        1. getMoreMaxTimeMS.txt
          1.24 MB
          Judah Schvimer

            Assignee:
            backlog-server-repl [DO NOT USE] Backlog - Replication Team
            Reporter:
            judah.schvimer@mongodb.com Judah Schvimer
            Votes:
            2 Vote for this issue
            Watchers:
            20 Start watching this issue

              Created:
              Updated:
              Resolved: