- Type: Question
- Resolution: Works as Designed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
- Labels: None
Good evening,
I am investigating a reduction in our production oplog window (covered hours) and found that the oplog contains a lot of no-op ("n") entries. In some collections their count closely matches the number of updates.
We are using a replica set and Node.js ("mongoose": "^5.7.12" -> "mongodb": "3.4.1").
Here is what I found:
first collection:
{ ns: 'some-collection', size: '416 MB', count: 367828, bytes: 435722363, u: { count: 147101, bytes: 60384983 }, n: { count: 147100, bytes: 250329515 }, i: { count: 73627, bytes: 125007865 } }
second collection:
{ ns: 'another-collection', size: '97 MB', count: 16206, bytes: 101701684, i: { count: 4478, bytes: 36079928 }, n: { count: 5864, bytes: 45363722 }, u: { count: 5688, bytes: 20198018 }, d: { count: 176, bytes: 60016 } }
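For reference, a per-namespace, per-op-type breakdown like the above can be produced by scanning the oplog and tallying entries. A minimal Node-style sketch (the driver connection is omitted, and the BSON size is approximated with the JSON string length; in the shell one would use Object.bsonsize(doc)):

```javascript
// Tally oplog entries per namespace and per op type ('i', 'u', 'd', 'n').
// Each entry is assumed to have { ns, op } fields as in the samples below.
function tallyOplog(entries) {
  const stats = {};
  for (const doc of entries) {
    const bytes = JSON.stringify(doc).length; // rough size proxy
    const s = (stats[doc.ns] ??= { ns: doc.ns, count: 0, bytes: 0 });
    s.count += 1;
    s.bytes += bytes;
    const op = (s[doc.op] ??= { count: 0, bytes: 0 });
    op.count += 1;
    op.bytes += bytes;
  }
  return stats;
}

// Example: two inserts and one no-op on the same namespace.
const sample = [
  { ns: 'some-collection', op: 'i', o: { _id: 1 } },
  { ns: 'some-collection', op: 'i', o: { _id: 2 } },
  { ns: 'some-collection', op: 'n', o: { _id: 1 } },
];
const stats = tallyOplog(sample);
console.log(stats['some-collection'].i.count); // 2
console.log(stats['some-collection'].n.count); // 1
```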
In general, I would say ~90% of our collections are affected, based on this data.
Two example oplog entries are below.
first doc:
{ ts: { '$timestamp': { t: 1580140296, i: 23 } }, t: { '$numberLong': '351' }, h: { '$numberLong': '0' }, v: 2, op: 'i', ns: 'some-collection', ui: { '$binary': '4OfAJbeuSyi4ysxD67tynw==', '$type': '04' }, wall: { '$date': '2020-01-27T15:51:36.266+0000' }, lsid: { id: { '$binary': 'gHG8d87RRvqFc3gQ/FOVAA==', '$type': '04' }, uid: { '$binary': 'DMlgA0xGyqJ4q18jqvv/FOTp8G5cBUXlcHgsCEaquQ8=', '$type': '00' } }, txnNumber: { '$numberLong': '4648' }, stmtId: 0, prevOpTime: { ts: { '$timestamp': [Object] }, t: { '$numberLong': '-1' } }, o: { _id: { '$oid': '5e2f0708433ba41c3d6a4234' }, eventId: { '$oid': '5e2f07072f80192f4c054c78' }, __v: 0, contractId: null, createdAt: { '$date': '2020-01-27T15:51:36.264+0000' }, events: [], modifiedAt: { '$date': '2020-01-27T15:51:36.264+0000' }, state: 'processing', templateKey: 'some-templateKey', type: 'email', userId: 123 } }
second doc:
{ ts: { '$timestamp': { t: 1580140297, i: 6 } }, t: { '$numberLong': '351' }, h: { '$numberLong': '0' }, v: 2, op: 'n', ns: 'some-collection', ui: { '$binary': '4OfAJbeuSyi4ysxD67tynw==', '$type': '04' }, wall: { '$date': '2020-01-27T15:51:37.159+0000' }, lsid: { id: { '$binary': 'kKGJlo31Qy+pEd0+XEJU7A==', '$type': '04' }, uid: { '$binary': 'DMlgA0xGyqJ4q18jqvv/FOTp8G5cBUXlcHgsCEaquQ8=', '$type': '00' } }, txnNumber: { '$numberLong': '50210' }, stmtId: 0, prevOpTime: { ts: { '$timestamp': [Object] }, t: { '$numberLong': '-1' } }, o: { _id: { '$oid': '5e2f0708433ba41c3d6a4234' }, eventId: { '$oid': '5e2f07072f80192f4c054c78' }, __v: 0, contractId: null, createdAt: { '$date': '2020-01-27T15:51:36.264+0000' }, events: [], modifiedAt: { '$date': '2020-01-27T15:51:36.264+0000' }, state: 'processing', templateKey: 'some-templateKey', type: 'email', userId: 123 } }
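Note that the second ('n') entry embeds the same document in its o field as the first ('i') entry (same _id), but was written under a different lsid and txnNumber. A hypothetical helper (for illustration only; real oplog documents carry BSON types) that pairs such no-ops with the insert they duplicate, matching on o._id:

```javascript
// Find no-op ('n') oplog entries whose embedded document duplicates an
// earlier insert to the same namespace, keyed on ns + o._id.
function findDuplicatedNoops(entries) {
  const written = new Map(); // "ns:_id" -> original insert entry
  const dupes = [];
  for (const doc of entries) {
    const key = `${doc.ns}:${JSON.stringify(doc.o._id)}`;
    if (doc.op === 'i') {
      written.set(key, doc);
    } else if (doc.op === 'n' && written.has(key)) {
      dupes.push({ noop: doc, original: written.get(key) });
    }
  }
  return dupes;
}

// The two oplog documents above reduce to this shape:
const entries = [
  { op: 'i', ns: 'some-collection', txnNumber: '4648',
    o: { _id: '5e2f0708433ba41c3d6a4234', state: 'processing' } },
  { op: 'n', ns: 'some-collection', txnNumber: '50210',
    o: { _id: '5e2f0708433ba41c3d6a4234', state: 'processing' } },
];
const dupes = findDuplicatedNoops(entries);
console.log(dupes.length);            // 1
console.log(dupes[0].noop.txnNumber); // '50210' (different session/txn)
```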
Some useful info:
The application is not generally idle; we have many different processes running in parallel. The current oplog window (995 MB oplog size) fluctuates between 9 and 2 hours. We recently upgraded to MongoDB 4.2 and did not disable retryable writes. db.serverStatus().transactions.retriedCommandsCount = 0. We see the same issue not only in production but also in the test environment. Several instances of the service are deployed in production (additionally in cluster mode), but based on the logs this is not an application concurrency issue: it was really a single thread that produced identical oplog entries (they even have the same _id).
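The oplog window quoted above can be derived from the first and last oplog entries' timestamps (this is essentially what db.getReplicationInfo() reports as timeDiffHours). A trivial sketch, using the t field of the BSON $timestamp (epoch seconds):

```javascript
// Oplog window in hours, given the first and last entries' timestamps
// as plain epoch seconds (the `t` field of the BSON $timestamp).
function oplogWindowHours(firstTs, lastTs) {
  return (lastTs - firstTs) / 3600;
}

// Example using the sample entry's ts (t: 1580140296) as the first entry
// and a hypothetical last entry ~9 hours later:
console.log(oplogWindowHours(1580140296, 1580140296 + 9 * 3600)); // 9
```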
It would be nice to understand what the real issue is here, because the oplog window has dropped to such a low number.