[SERVER-33652] Restarting oplog query due to error: Restarting oplog query due to error: OperationFailed: GetMore command executor error Created: 05/Mar/18  Updated: 02/Apr/18  Resolved: 05/Mar/18

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.4.2
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: kangwanu Assignee: Ramon Fernandez Marina
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-32827 Initial sync can fail when syncing a ... Backlog
Participants:

 Description   

The secondary logs "Restarting oplog query due to error", and afterwards it stops with "***aborting after fassert() failure".

Log from the secondary member. There are very many of these entries:

2018-03-01T07:49:14.167Z I REPL     [replication-7490] Restarting oplog query due to error: OperationFailed: GetMore command executor error: CappedPositionLost: CollectionScan died due to position in capped collection being deleted. Last seen record id: RecordId(6527632502100591112). Last fetched optime (with hash): { ts: Timestamp 1519832877000|519, t: 4 }[-5691563086952961010]. Restarts remaining: 3
2018-03-01T07:49:14.199Z I REPL     [replication-7490] Scheduled new oplog query Fetcher source: xxxxxxxxxx:27018 database: local query:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp 1519832877000|519 } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, term: 4 } query metadata: { $replData: 1, $ssm: { $secondaryOk: true } } active: 1 timeout: 10000ms shutting down?: 0 first: 1 firstCommandScheduler: RemoteCommandRetryScheduler request: RemoteCommand 370471624 -- target:xxxxxxxxxx:27018 db:local cmd:{ find: "oplog.rs", filter: { ts: { $gte: Timestamp 1519832877000|519 } }, tailable: true, oplogReplay: true, awaitData: true, maxTimeMS: 60000, term: 4 } active: 1 callbackHandle.valid: 1 callbackHandle.cancelled: 0 attempt: 1 retryPolicy: RetryPolicyImpl maxAttempts: 1 maxTimeMillis: -1ms
2018-03-01T07:49:14.258Z I REPL     [rsBackgroundSync] Starting rollback due to OplogStartMissing: our last op time fetched: { ts: Timestamp 1519832877000|519, t: 4 }. source's GTE: { ts: Timestamp 1519832981000|612, t: 4 } hashes: (-5691563086952961010/-1342693826324896861)
2018-03-01T07:49:14.268Z I REPL     [rsBackgroundSync] Waiting for all operations from { ts: Timestamp 1519831270000|141, t: 4 } until { ts: Timestamp 1519832877000|519, t: 4 } to be applied before starting rollback.
 
 
2018-03-02T07:10:04.447Z F REPL     [rsBackgroundSync] Unable to complete rollback. A full resync may be needed: UnrecoverableRollbackError: need to rollback, but unable to determine common point between local and remote oplog: NoMatchingDocument: RS100 reached beginning of remote oplog [1] @ 18752
2018-03-02T07:10:04.459Z I -        [rsBackgroundSync] Fatal Assertion 28723 at src/mongo/db/repl/bgsync.cpp 676
2018-03-02T07:10:04.459Z I -        [rsBackgroundSync] 



 Comments   
Comment by kangwanu [ 06/Mar/18 ]

Hi Ramon,
I should increase the oplog size, right? By how much should I increase it?
Current oplog settings:
configured oplog size: 10240MB
log length start to end: 141580secs (39.33hrs)
oplog first event time: Sun Mar 04 2018 19:30:30 GMT+0700 (ICT)
oplog last event time: Tue Mar 06 2018 10:50:10 GMT+0700 (ICT)
now: Tue Mar 06 2018 10:50:11 GMT+0700 (ICT)
thanks.
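
A rough way to answer the sizing question from the figures above is to scale the oplog linearly. This is only a sketch: it assumes the oplog is already full (so the 10240MB spans exactly the reported 141580 seconds) and that the write rate stays steady. The function name and the 72-hour target are hypothetical choices for illustration, not values from this ticket.

```python
def oplog_size_for_window(current_size_mb, current_window_secs, target_window_hours):
    """Estimate the oplog size (MB) needed to cover target_window_hours.

    Assumes the oplog is full and the write rate is constant, so size
    scales linearly with the time window. Both are assumptions.
    """
    if current_window_secs <= 0:
        raise ValueError("current_window_secs must be positive")
    mb_per_hour = current_size_mb / (current_window_secs / 3600.0)
    return mb_per_hour * target_window_hours


# Figures reported in the comment above.
configured_mb = 10240
window_secs = 141580

# Hypothetical target: keep three days of operations in the oplog.
needed_mb = oplog_size_for_window(configured_mb, window_secs, 72)
print(f"Estimated oplog size for a 72h window: {needed_mb:.0f} MB")
```

With the reported numbers the oplog turns over at roughly 260 MB per hour, so the required size grows by about that much for every extra hour of window. A real workload with bursts would need additional headroom.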

Comment by Ramon Fernandez Marina [ 05/Mar/18 ]

Sorry to hear you're running into this issue kangwanu. This was reported earlier in SERVER-32827 so I'm going to mark this ticket as a duplicate.

Please see the replication lag documentation for possible causes of replication lag. If your replication lag is stable, using a larger oplog should help, but if it grows constantly you may need to use more powerful machines and/or a faster network, depending on the exact cause of the lag. You can post MongoDB-related support questions on the mongodb-user group or Stack Overflow with the mongodb tag, where your questions will reach a larger audience.

Regards,
Ramón.
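
To tell stable lag from growing lag, one option is to sample the lag periodically and compare the samples. The sketch below works on a document shaped like the output of the replSetGetStatus command (a "members" array with "name", "stateStr" and "optimeDate" fields). The host names and timestamps are made up for illustration, and the "growing" test is a deliberately naive assumption, not a rule from the MongoDB documentation.

```python
from datetime import datetime


def replication_lag_seconds(status):
    """Return {secondary name: lag in seconds} from a replSetGetStatus-like dict."""
    members = status["members"]
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


def lag_is_growing(samples):
    """Naive check: True if every lag sample is larger than the one before it."""
    return len(samples) >= 2 and all(b > a for a, b in zip(samples, samples[1:]))


# Hypothetical status document; host names and times are illustrative only.
sample_status = {
    "members": [
        {"name": "primary.example:27018", "stateStr": "PRIMARY",
         "optimeDate": datetime(2018, 3, 1, 7, 49, 14)},
        {"name": "secondary.example:27018", "stateStr": "SECONDARY",
         "optimeDate": datetime(2018, 3, 1, 7, 47, 30)},
    ]
}

print(replication_lag_seconds(sample_status))
print(lag_is_growing([30.0, 60.0, 104.0]))
```

With pymongo, a document of this shape can be obtained with `client.admin.command("replSetGetStatus")`. If the samples keep increasing, a larger oplog only delays the failure; if they hover around a steady value, an oplog window comfortably larger than that value should be enough.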

Comment by kangwanu [ 05/Mar/18 ]

More information:
1. Added a new secondary member.
2. Ran an initial sync from the primary.
3. After the initial sync, the secondary shows replication lag. Sometimes this lag only appears after 2-3 days.
I am using MongoDB version 3.4.2. Please help me, and let me know if you need more information.
Thanks.
