  1. PHP Driver: Extension
  2. PHPC-1523

findAndModify generates 50-100x more oplog traffic with retryWrites=true


Details

    • Bug
    • Status: Closed
    • Major - P3
    • Resolution: Works as Designed
    • 1.6.0, 1.6.1
    • Ubuntu 18 / x64
      Mongo server 4.0.14

    Description

      We noticed the oplogs on some clusters were consuming orders of magnitude more GB/hour despite similar traffic and lower scale. All clusters run the same mongod version on the same host OS platform and similar hardware, but the impacted clusters were all running PHP extension 1.6.0, while the one cluster that was fine was still on 1.5.5.

      As an experiment we upgraded the known-good cluster to 1.6.1 and the change in the oplog was dramatic: we went from under 1 GB/hr to close to 100 GB/hr before we hastily downgraded. See the attached screen capture from the monitoring software.

      Convinced we had found the difference, we downgraded the 1.6.0 clusters as well and saw the average GB/hr slowly start to dwindle. I think it will take quite some time for that average to settle, so posting a picture won't be as exciting just yet.

      Preliminary investigation suggests that the oplog was being flooded with "no-op" records, each containing the entire updated document:

      {
          "ts" : Timestamp(1578321176, 5387),
          "t" : NumberLong(7),
          "h" : NumberLong("6027878811455725704"),
          "v" : 2,
          "op" : "n",
          "ns" : "lotus.accounts",
          "ui" : UUID("30e2da0f-6bbc-49e0-8578-f681ce6a6bfa"),
          "wall" : ISODate("2020-01-06T14:32:56.689Z"),
          "lsid" : {
              "id" : UUID("53985818-7a52-4ba3-8545-dc00bb6d240b"),
              "uid" : BinData(0,"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=")
          },
          "txnNumber" : NumberLong(312101),
          "stmtId" : 0,
          "prevOpTime" : {
              "ts" : Timestamp(0, 0),
              "t" : NumberLong(-1)
          },
          "o" : { ..... gigantic json block ..... }
      }
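
To check how much of an oplog sample is made up of records like the one above, entries can be tallied by their "op" code. The following is only a minimal sketch, not part of the original report: summarizeOplog and approxBytes are hypothetical helper names, the sample entries are made up, and size is approximated by JSON text length rather than real BSON size, so the byte counts are useful only for relative comparison.

```javascript
// Approximate an entry's size by its JSON text length (an assumption:
// the real oplog stores BSON, so treat these numbers as relative only).
function approxBytes(entry) {
  return JSON.stringify(entry).length;
}

// Tally count and approximate bytes per "op" code, and count how many
// entries carry session fields (lsid and txnNumber), as the record above does.
function summarizeOplog(entries) {
  const summary = {};
  for (const entry of entries) {
    if (!summary[entry.op]) {
      summary[entry.op] = { count: 0, bytes: 0, withSession: 0 };
    }
    const bucket = summary[entry.op];
    bucket.count += 1;
    bucket.bytes += approxBytes(entry);
    if (entry.lsid !== undefined && entry.txnNumber !== undefined) {
      bucket.withSession += 1;
    }
  }
  return summary;
}

// Made-up sample entries shaped like the record above, with plain values.
const sample = [
  {
    op: "n", ns: "lotus.accounts",
    lsid: { id: "session-a" }, txnNumber: 1, stmtId: 0,
    o: { _id: 1, blob: "x".repeat(1000) }, // stands in for the full document
  },
  {
    op: "u", ns: "lotus.accounts",
    lsid: { id: "session-a" }, txnNumber: 1, stmtId: 0,
    o: { $set: { balance: 5 } },
  },
  { op: "i", ns: "lotus.events", o: { _id: 2 } },
];

const summary = summarizeOplog(sample);
console.log(summary);
```

If the "n" bucket dominates the byte total and most of its entries carry session fields, that would be consistent with the flooding described above. Against a real deployment the entries would have to be read from the oplog first, which this sketch deliberately leaves out.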
      

      I can't send you the oplog itself because it has client data in it, but since I have 300 GB of backlog that should be written at a few hundred MB/hr, I might be able to fish for clues before it expires.

      Attachments

        1. 1.6.1.png (18 kB)
        2. PS4.png (22 kB)


            People

              jmikola@mongodb.com Jeremy Mikola
              glen.miner Glen Miner
