[SERVER-67068] Entries in oplog of admin.$cmd increase in size Created: 07/Jun/22  Updated: 06/Dec/22  Resolved: 05/Jul/22

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 4.2.17
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Vladimir Beliakov Assignee: Backlog - Triage Team
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 16.04
XFS
Kernel - 4.4.0-1128-aws #142-Ubuntu SMP Fri Apr 16 12:42:33 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Transparent Huge Pages disabled
AWS m5.xlarge (4 vCPU / 16 GB)
SSD gp3, 450 GB
mongodb-org-server - 4.2.17


Attachments: Zip Archive diagnostic.data.zip    
Assigned Teams:
Server Triage
Operating System: ALL
Steps To Reproduce:

Unfortunately, I don't know how to reproduce the issue, but it's got to be related to using transactions.

Participants:

 Description   

At some point, on one of our shards, the oplog entries for admin.$cmd grew in size, which caused the oplog window to shrink. The number of entries did not appear to increase, only their size.
I suspect this was related to transactions, since the entries were for the collections we use transactions with.
This kept happening until we changed the primary replica on that shard. Right after that, the oplog went back to normal.
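
A minimal sketch of how such a size change could be confirmed, assuming the oplog entries have already been exported as dictionaries. The `size_bytes` field is hypothetical (it could be computed per document, for example with `Object.bsonsize()` in the mongo shell), and the sample values are illustrative only, not taken from the incident. The function groups entries for one namespace by hour and reports the count and average size, so a jump in size without a jump in count stands out.

```python
from collections import defaultdict
from datetime import datetime, timezone


def summarize_oplog_sizes(entries, ns="admin.$cmd"):
    """Group entries for one namespace by hour.

    Returns {hour: (entry_count, average_size_in_bytes)}, ordered by hour.
    Each entry is assumed to be a dict with "ns", "wall" (a datetime)
    and "size_bytes" (hypothetical, precomputed document size).
    """
    buckets = defaultdict(list)
    for entry in entries:
        if entry["ns"] != ns:
            continue  # skip entries for other namespaces
        hour = entry["wall"].replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(entry["size_bytes"])
    return {
        hour: (len(sizes), sum(sizes) / len(sizes))
        for hour, sizes in sorted(buckets.items())
    }


def utc(hour, minute):
    """Helper for building sample timestamps on 2022-06-03 (UTC)."""
    return datetime(2022, 6, 3, hour, minute, tzinfo=timezone.utc)


# Illustrative sample data only.
sample = [
    {"ns": "admin.$cmd", "wall": utc(7, 10), "size_bytes": 2_000},
    {"ns": "admin.$cmd", "wall": utc(7, 40), "size_bytes": 4_000},
    {"ns": "admin.$cmd", "wall": utc(8, 35), "size_bytes": 900_000},
    {"ns": "admin.$cmd", "wall": utc(8, 50), "size_bytes": 1_100_000},
    {"ns": "test.coll", "wall": utc(8, 55), "size_bytes": 500},
]

summary = summarize_oplog_sizes(sample)
for hour, (count, avg) in summary.items():
    print(hour.isoformat(), count, round(avg))
```

In this sample the entry count per hour stays at two while the average size grows sharply, which matches the pattern described above.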

 

Our cluster configuration:

  • sharded cluster with 10 shards
  • four replicas in each shard
  • about 400 GB of data in storage size per shard

Replica server configuration:

  • Ubuntu 16.04
  • XFS
  • Kernel - 4.4.0-1128-aws #142-Ubuntu SMP Fri Apr 16 12:42:33 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
  • Transparent Huge Pages disabled
  • AWS m5.xlarge (4 vCPU / 16 GB)
  • SSD gp3, 450 GB
  • mongodb-org-server - 4.2.17

I'm attaching diagnostic.data from the primary where the incident happened.

Incident time:
beginning - 03.06.2022 08:30:00 UTC
end - 03.06.2022 15:45:00 UTC



 Comments   
Comment by Edwin Zhou [ 05/Jul/22 ]

Hi vladimirred456@gmail.com,

Thank you for following up that the issue has been resolved on your end. I will now close this issue as requested.

Best,
Edwin

Comment by Vladimir Beliakov [ 10/Jun/22 ]

We found that the problem was on our side: we were running the same large transaction over and over again.

Please close the issue.
Sorry for the inconvenience.
