Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 2.0.2
Component/s: Replication
Labels:
- sync
Environment:
Linux 2.6.32-38-server, Ubuntu 10.04, MongoDB 2.0.1, Replicaset with 4 Nodes, NUMA, 2x XEON E5620 , 24 GB RAM

Assigned Teams:

Replication
Operating System:
ALL
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

This is our setup:
2 shards, each a replica set with 4 nodes. One node is dedicated to backups (priority:0, hidden:true).
To start a backup, we send the fsyncLock command to the backup node and then rsync its filesystem.
When the backup has finished, we send the fsyncUnlock command to the backup node.
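
For illustration, the backup procedure described above could be scripted roughly as follows. This is a minimal sketch, not the reporter's actual script: the function name `run_backup`, its parameters, and the rsync options are assumptions, and it assumes the `mongo` shell and `rsync` are available on the machine that runs it. The `run` parameter is a hypothetical hook so the command runner can be swapped out in a dry run.

```python
import subprocess


def run_backup(backup_host, src_dir, dest, run=subprocess.check_call):
    """Lock the backup node, copy its data files, and always unlock afterwards.

    backup_host, src_dir and dest are placeholders (assumptions), not values
    taken from the report.
    """
    # Base command for issuing a shell helper against the backup node.
    mongo = ["mongo", "--host", backup_host, "--quiet", "--eval"]

    # Flush pending writes to disk and block further writes on the backup node.
    run(mongo + ["printjson(db.fsyncLock())"])
    try:
        # Copy the now-quiescent data directory.
        run(["rsync", "-a", src_dir, dest])
    finally:
        # Release the lock even if the copy fails, so the node can catch up.
        run(mongo + ["printjson(db.fsyncUnlock())"])
```

Putting the unlock in a `finally` block is a design choice of this sketch: a failed copy should not leave the backup node locked and falling further behind.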

After a master switch in the replica set (due to an upgrade or a failure), some or all slaves stop syncing the oplog when the backup node starts its backup. They stop at exactly the moment we issue the fsyncLock command: the replication lag of the backup node is the same as that of the slaves that stopped syncing. When the backup finishes, those slaves start syncing again.
db.currentOp() doesn't show the fsyncLock on the slaves, only on the backup node.
To get rid of the problem we have to restart the affected non-backup slave. After the restart the slave runs well and never again stops syncing together with the backup node.
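
The symptom above (slaves whose replication lag matches that of the locked backup node) can be checked mechanically. The sketch below is one possible approach, not something taken from the report: it assumes member documents shaped like the `members` array returned by `rs.status()` (fields `name`, `stateStr`, `optimeDate`), and the function names and the 5-second tolerance are hypothetical.

```python
from datetime import timedelta


def replication_lag(members):
    """Return {member name: lag behind the primary} for every secondary."""
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: primary["optimeDate"] - m["optimeDate"]
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


def stalled_with(lags, backup_name, tolerance=timedelta(seconds=5)):
    """List secondaries whose lag tracks the lag of the backup node.

    The tolerance is an arbitrary assumption; tune it for the deployment.
    """
    backup_lag = lags[backup_name]
    if backup_lag <= tolerance:
        # The backup node is keeping up, so nothing can be stalled with it.
        return []
    return sorted(
        name
        for name, lag in lags.items()
        if name != backup_name and abs(lag - backup_lag) <= tolerance
    )
```

Run periodically during a backup, a non-empty result from `stalled_with` would point at the slaves that need the restart described above.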

This is the second time we've encountered this problem. Since this is our production environment, we don't want to force a master switch unless it is needed.

The master switch in the replica set seems to be the cause of this problem.

Regards,
Steffen

depends on

SERVER-5208 Replica set periodic reevaluation of sync targets

Closed

Assignee: [DO NOT USE] Backlog - Replication Team
Reporter: Steffen
Participants: [DO NOT USE] Backlog - Replication Team, Eliot Horowitz, Eric Milkie, Gregory McKeon, Kristina Chodorow, Scott Hernandez, Steffen
Votes: 0
Watchers: 3

Created: Feb 10 2012 01:52:31 PM UTC
Updated: Dec 06 2022 05:36:33 AM UTC
Resolved: Feb 22 2018 08:45:02 PM UTC
