Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: None

Operating System: ALL
CAR Domain/s: None

Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

We have a replica set consisting of a primary, a secondary, and an arbiter. Our database sizes are 20TB+. The first time our secondary fell too far behind, our oplog was sized to cover about 16 hours. After resyncing the secondary and extending the oplog to 54 hours, we went another 3 weeks before we hit the same issue again over the weekend.

What specific things should we look for as the culprit? From the documentation, it looks like disk I/O or network issues could cause this. I'm not seeing any indication of either so far, but I want to cover all of our bases before looking into that any further.
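One concrete thing to track is how close the secondary's lag gets to the oplog window (54 hours in our case). The following is a minimal sketch, not a definitive monitoring setup. It assumes a status document shaped like the output of the `replSetGetStatus` command (a `members` list with `name`, `stateStr`, and `optimeDate`). The helper names `secondary_lag` and `at_risk`, the `safety` threshold, and the sample host names and timestamps are all hypothetical and only there for illustration.

```python
from datetime import datetime, timedelta


def secondary_lag(status):
    """Return {member name: lag behind the primary} for each SECONDARY member.

    `status` is assumed to look like replSetGetStatus output: a dict with a
    "members" list whose entries carry "name", "stateStr" and "optimeDate".
    """
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: primary["optimeDate"] - m["optimeDate"]
        for m in members
        if m["stateStr"] == "SECONDARY"  # arbiters hold no data, so skip them
    }


def at_risk(status, oplog_window, safety=0.5):
    """List secondaries whose lag exceeds `safety` times the oplog window.

    `safety` is an arbitrary illustrative threshold, not a MongoDB setting.
    """
    threshold = oplog_window * safety
    return [name for name, lag in secondary_lag(status).items() if lag > threshold]


# Hypothetical sample data: a secondary that is 30 hours behind the primary.
sample_status = {
    "members": [
        {"name": "db1:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2021, 11, 1, 12, 0, 0)},
        {"name": "db2:27017", "stateStr": "SECONDARY",
         "optimeDate": datetime(2021, 10, 31, 6, 0, 0)},
        {"name": "arb:27017", "stateStr": "ARBITER"},
    ]
}

lags = secondary_lag(sample_status)
flagged = at_risk(sample_status, oplog_window=timedelta(hours=54))
```

Logging these values at a regular interval would show whether the lag builds up gradually (pointing at sustained write load or slow apply on the secondary) or jumps suddenly (pointing at a stall or outage).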

Would sharding the database help with issues like this? Our database keeps growing, and we need to start evaluating whether sharding can solve this issue.

I can upload logs too if necessary. Thanks

Assignee: Edwin Zhou
Reporter: Neil Allen
Participants: Edwin Zhou, Neil Allen
Votes: 0
Watchers: 4

Created: Nov 01 2021 05:34:56 PM UTC
Updated: Nov 04 2021 09:02:45 PM UTC
Resolved: Nov 04 2021 09:02:45 PM UTC
