Core Server / SERVER-45812

oplog contains a lot of no-ops

    • Type: Question
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Labels: None

      Good evening,

      I am investigating a reduction in our production oplog duration (the number of hours it covers) and found that it contains a lot of "no-op" entries. In some collections the number of no-ops almost exactly matches the number of updates.

      We are using a replica set and Node.js ("mongoose": "^5.7.12" -> "mongodb": "3.4.1").

      Here is what I found out: a per-collection breakdown of oplog entries by operation type (count and bytes per op).

      first collection:

      {
        ns: 'some-collection',
        size: '416 MB',
        count: 367828,
        bytes: 435722363,
        u: { count: 147101, bytes: 60384983 },
        n: { count: 147100, bytes: 250329515 },
        i: { count: 73627, bytes: 125007865 }
      }
      

      second collection:

      {
        ns: 'another-collection',
        size: '97 MB',
        count: 16206,
        bytes: 101701684,
        i: { count: 4478, bytes: 36079928 },
        n: { count: 5864, bytes: 45363722 },
        u: { count: 5688, bytes: 20198018 },
        d: { count: 176, bytes: 60016 }
      }
      

      But in general, based on this data, I would say ~90% of our collections are affected.
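
      For reference, here is a minimal sketch (legacy mongo shell) of how a breakdown like the one above can be produced. The namespace "mydb.some-collection" is a placeholder, and this shows only the general approach, not necessarily the exact script used for the numbers above.

      // Tally oplog entries for one namespace by operation type ('i', 'u', 'n', 'd'),
      // summing both the number of entries and the BSON size of each entry.
      var stats = {};
      db.getSiblingDB("local").oplog.rs.find({ ns: "mydb.some-collection" }).forEach(function (entry) {
        var s = stats[entry.op] || { count: 0, bytes: 0 };
        s.count += 1;
        s.bytes += Object.bsonsize(entry); // size of the whole oplog entry in bytes
        stats[entry.op] = s;
      });
      printjson(stats);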

      Two example oplog entries are below: the first is an insert ('i') and the second is a no-op ('n') carrying the same document.

      first doc:

      {
        ts: { '$timestamp': { t: 1580140296, i: 23 } },
        t: { '$numberLong': '351' },
        h: { '$numberLong': '0' },
        v: 2,
        op: 'i',
        ns: 'some-collection',
        ui: { '$binary': '4OfAJbeuSyi4ysxD67tynw==', '$type': '04' },
        wall: { '$date': '2020-01-27T15:51:36.266+0000' },
        lsid: {
          id: { '$binary': 'gHG8d87RRvqFc3gQ/FOVAA==', '$type': '04' },
          uid: {
            '$binary': 'DMlgA0xGyqJ4q18jqvv/FOTp8G5cBUXlcHgsCEaquQ8=',
            '$type': '00'
          }
        },
        txnNumber: { '$numberLong': '4648' },
        stmtId: 0,
        prevOpTime: { ts: { '$timestamp': [Object] }, t: { '$numberLong': '-1' } },
        o: {
          _id: { '$oid': '5e2f0708433ba41c3d6a4234' },
          eventId: { '$oid': '5e2f07072f80192f4c054c78' },
          __v: 0,
          contractId: null,
          createdAt: { '$date': '2020-01-27T15:51:36.264+0000' },
          events: [],
          modifiedAt: { '$date': '2020-01-27T15:51:36.264+0000' },
          state: 'processing',
          templateKey: 'some-templateKey',
          type: 'email',
          userId: 123
        }
      }
      

      second doc:

      {
        ts: { '$timestamp': { t: 1580140297, i: 6 } },
        t: { '$numberLong': '351' },
        h: { '$numberLong': '0' },
        v: 2,
        op: 'n',
        ns: 'some-collection',
        ui: { '$binary': '4OfAJbeuSyi4ysxD67tynw==', '$type': '04' },
        wall: { '$date': '2020-01-27T15:51:37.159+0000' },
        lsid: {
          id: { '$binary': 'kKGJlo31Qy+pEd0+XEJU7A==', '$type': '04' },
          uid: {
            '$binary': 'DMlgA0xGyqJ4q18jqvv/FOTp8G5cBUXlcHgsCEaquQ8=',
            '$type': '00'
          }
        },
        txnNumber: { '$numberLong': '50210' },
        stmtId: 0,
        prevOpTime: { ts: { '$timestamp': [Object] }, t: { '$numberLong': '-1' } },
        o: {
          _id: { '$oid': '5e2f0708433ba41c3d6a4234' },
          eventId: { '$oid': '5e2f07072f80192f4c054c78' },
          __v: 0,
          contractId: null,
          createdAt: { '$date': '2020-01-27T15:51:36.264+0000' },
          events: [],
          modifiedAt: { '$date': '2020-01-27T15:51:36.264+0000' },
          state: 'processing',
          templateKey: 'some-templateKey',
          type: 'email',
          userId: 123
        }
      }
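
      As a side note, the pairing above can be seen directly by listing every oplog entry that carries this document _id (a sketch; the ObjectId is the one from the two example entries, and an unindexed query like this will scan the oplog):

      // List all oplog entries referencing the same document _id, oldest first,
      // projecting only the fields needed to compare the 'i' and 'n' entries.
      db.getSiblingDB("local").oplog.rs.find(
        { "o._id": ObjectId("5e2f0708433ba41c3d6a4234") },
        { ts: 1, op: 1, ns: 1, lsid: 1, txnNumber: 1, wall: 1 }
      ).sort({ ts: 1 });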
      

      Some useful info:

      • The application is not idle in general; we have many different processes running in parallel.
      • The current oplog (995 MB) covers a window that jumps between 9 and 2 hours.
      • We recently upgraded to MongoDB 4.2 and did not disable retryable writes; db.serverStatus().transactions.retriedCommandsCount is 0 (see the snippet after this list).
      • We have the same issue not only in production but also in the test environment.
      • Several instances of the service are deployed in production (additionally in cluster mode), but based on the logs this is not an application concurrency issue: a single thread produced the matching oplog entries (they even have the same _id).
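
      For reference, figures like these can be read off a replica-set member with the standard shell helpers (a sketch of the usual way to check them, not necessarily how the numbers above were collected):

      rs.printReplicationInfo();                           // oplog size and the time window it currently covers
      db.getSiblingDB("local").oplog.rs.stats().maxSize;   // configured oplog size in bytes
      db.serverStatus().transactions.retriedCommandsCount; // retries of retryable writes observed by this server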

      It would be nice to understand what the real issue is here, because our oplog duration has dropped to such a low number of hours.

            Assignee: Andy Schwerin (schwerin@mongodb.com)
            Reporter: Maksim Kondratyuk (maksim.kondratyuk@gmail.com)
            Votes: 0
            Watchers: 7
