[SERVER-10780] Clock skew and balancing in MONGOS Created: 16/Sep/13 Updated: 10/Dec/14 Resolved: 18/Mar/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Question | Priority: | Blocker - P1 |
| Reporter: | Somit Srivastava | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Participants: |
| Description |
|
We have around 3 app servers, each running mongos, which connect to 3 config servers. The following error appeared in the log of one of the mongos servers:

1) "[Balancer] caught exception while doing balance: error checking clock skew of cluster CFG1.hma.com:30000, CFG2.hma.com:30000,CFG3.hma.com:30000 :: caused by :: 13650 clock skew of the cluster CFG1.hma.com:30000, CFG2.hma.com:30000, CFG3.hma.com:30000 is too far out of bounds to allow distributed locking."
This is due to a time difference, but we have the ntpd service running. The time difference between a working mongos server and the non-working mongos server is around 10 seconds, which I don't think should cause this issue.

2) Mon Sep 16 08:07:47.049 [Balancer] distributed lock 'balancer/WEB002:27017:1374748868:1804289383' unlocked.
This appears on one of the mongos servers. I wanted to confirm whether balancing works on only one of the mongos servers.

Also, db.locks.find( { _id : "balancer" } ).pretty() gave the following output:

To summarise, all 3 mongos servers (A, B, C) show a different status: A generates no log messages (working fine), B reports the clock skew error (not working correctly), and C logs the distributed lock message. |
| Comments |
| Comment by Stennie Steneker (Inactive) [ 18/Mar/14 ] |
|
Hi Somit, Please be advised I'm closing this issue due to inactivity. Large amounts of clock skew can cause unexpected issues for many programs, particularly if adjustments cause servers to skip back in time. MongoDB has some tolerance for clock skew, but as per the log message you encountered there are sanity checks to keep the skew within reason. If you do encounter a warning on clock skew, the appropriate fix would be to synchronise the server times and ensure ntpd is working correctly. Regards, |
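As a rough illustration of the kind of sanity check described above, the sketch below compares the clock readings of several hosts and reports whether the largest pairwise difference stays under a limit. The function names, the sample readings, and the 5000 ms threshold are all hypothetical and chosen for illustration; this is not MongoDB's implementation, and the threshold is not MongoDB's actual tolerance.

```javascript
// Hypothetical helper, not part of MongoDB: returns the largest pairwise
// difference (in milliseconds) between the clocks reported by a set of hosts.
// `readings` maps a host name to the epoch-millisecond time that host reported.
function maxClockSkewMs(readings) {
  const times = Object.values(readings);
  if (times.length < 2) {
    return 0; // a single clock cannot be skewed against itself
  }
  return Math.max(...times) - Math.min(...times);
}

// Assumed threshold for illustration only; NOT MongoDB's actual limit.
const MAX_SKEW_MS = 5000;

// True when the observed skew is within the allowed bound.
function skewWithinBounds(readings, maxSkewMs = MAX_SKEW_MS) {
  return maxClockSkewMs(readings) <= maxSkewMs;
}

// Made-up readings: one config server is 10 seconds ahead of the others.
const sampleReadings = {
  "CFG1.hma.com:30000": 1000000,
  "CFG2.hma.com:30000": 1000200,
  "CFG3.hma.com:30000": 1010000,
};
```

With these sample readings, `maxClockSkewMs(sampleReadings)` is 10000, so `skewWithinBounds(sampleReadings)` is false under the assumed 5000 ms limit. In practice the readings would have to be collected from each host at close to the same moment, which this sketch does not attempt.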
| Comment by Eliot Horowitz (Inactive) [ 30/Nov/13 ] |
|
Is this still an issue? |