[SERVER-17524] MongoDB sharding problem Created: 10/Mar/15 Updated: 15/May/15 Resolved: 15/May/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Internal Code, Sharding |
| Affects Version/s: | 2.2.7 |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Girish Bhat | Assignee: | Randolph Tan |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Participants: |
| Description |
|
I have a MongoDB cluster of 3 replica sets. I am getting this error: id_ObjectId('54f9ed416aad853b66d0c21f')", configdb: "172.31.12.107:27019,172.31.12.93:27019,172.31.12.43:27019" } result: { ok: 0.0, errmsg: "Error locking distributed lock for split. :: caused by :: 13651 error checking clock skew of cluster 172.31.12.107:27019,172.31.12.93:27019,172.31.12.. NTP time on these servers is all synced. Output of date on all servers: mongo-s3: I restarted the cluster, keeping NTP up to date, but I still get the same error. |
| Comments |
| Comment by Ramon Fernandez Marina [ 15/May/15 ] | ||||||||||
|
Looks like the issue went away so we're resolving this ticket. If the issue reappears please feel free to reopen. | ||||||||||
| Comment by Girish Bhat [ 11/Mar/15 ] | ||||||||||
|
Hi, when I posted the logs it was not working; I had the same clock skew error. | ||||||||||
| Comment by Randolph Tan [ 11/Mar/15 ] | ||||||||||
|
Hi, I don't see the errors from the mongod logs you posted (both logs show successful distributed lock acquisition). Are these the log level 1 logs when the skew exception happened? Thanks! | ||||||||||
| Comment by Girish Bhat [ 11/Mar/15 ] | ||||||||||
|
Hi, I added logs from the primary of the rs0 replica set, for "conn202". FYI, there are 4 shards in the cluster. "rs0" has data and the rest do not. I enabled sharding for the collection in rs0. | ||||||||||
| Comment by Randolph Tan [ 10/Mar/15 ] | ||||||||||
|
Sorry, I meant to ask for the more verbose log on the primaries. And in particular, the primary where the split was sent to. For example, in the case of the paste bin logs, the verbose logs from the primary of rs0 when the error occurred. Thanks! | ||||||||||
| Comment by Girish Bhat [ 10/Mar/15 ] | ||||||||||
|
| ||||||||||
| Comment by Girish Bhat [ 10/Mar/15 ] | ||||||||||
|
Hi Randolph Tan, the time is the same on all config servers. Please look into it. OS: CentOS 7.5 (hosted on Amazon EC2)
Logs pasted below | ||||||||||
| Comment by Randolph Tan [ 10/Mar/15 ] | ||||||||||
|
What platform and OS are you running this on? The clock skew checking code uses the localTime from the serverStatus command from each of the config servers for this check. Can you also try increasing the log level to 1? Thanks! |
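
For readers following the check described above, here is a minimal sketch of the idea in Python. It is not the server's actual implementation: it only compares one timestamp per config server (such as the localTime value each returns from serverStatus) and reports the largest difference. The function names, the sample timestamps, and the 30-second limit are assumptions made for illustration, and the sketch ignores network round-trip time, which a real check would have to account for.

```python
from datetime import datetime, timedelta, timezone


def max_clock_skew(local_times):
    """Return the largest difference between any two reported times.

    local_times maps a host string to the datetime that host reported.
    """
    times = list(local_times.values())
    if len(times) < 2:
        # Nothing to compare against, so there is no measurable skew.
        return timedelta(0)
    return max(times) - min(times)


def skew_exceeds(local_times, limit):
    """True when the spread of reported times is larger than limit."""
    return max_clock_skew(local_times) > limit


# Hypothetical readings: the hosts come from the ticket, the times do not.
base = datetime(2015, 3, 10, 12, 0, 0, tzinfo=timezone.utc)
sample = {
    "172.31.12.107:27019": base,
    "172.31.12.93:27019": base + timedelta(seconds=2),
    "172.31.12.43:27019": base - timedelta(seconds=1),
}

# Assumed limit, chosen only for this example.
ASSUMED_LIMIT = timedelta(seconds=30)

if __name__ == "__main__":
    print("max skew:", max_clock_skew(sample))
    print("exceeds limit:", skew_exceeds(sample, ASSUMED_LIMIT))
```

Running the same comparison against the values each config server actually reports can help show whether the servers disagree with each other even when the system clocks look synced under NTP.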