[SERVER-3830] replSetGetStatus thousands of times per second from mongos Created: 13/Sep/11 Updated: 11/Jul/16 Resolved: 15/Sep/11 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Sharding |
| Affects Version/s: | 1.8.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Otto Bergström | Assignee: | Spencer Brody (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Sharded cluster with three shards; each shard is a replica set consisting of two replicas and one arbiter running Ubuntu Linux. The balancer is turned off. |
||
| Attachments: |
|
| Operating System: | Linux |
| Participants: |
| Description |
|
When the replica originally defined when adding the shard (with the db.runCommand( { addshard : "replicaset/hostname" } ); command) is primary, mongos seems to execute a replSetGetStatus for each insert or query it executes. This is output from mongostat when the primary is the server originally defined when adding the shard in mongos: insert query update delete getmore command This is output from mongostat when the other server is primary: insert query update delete getmore command Notice the vast difference in command/s. |
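For reference, the setup and the measurement described above can be sketched from the shell roughly as follows. The hostnames and ports here are placeholders, not the reporter's actual servers:

```shell
# Add a replica-set-backed shard via a mongos, seeding it with a single
# member (as the reporter did); mongos discovers the rest of the set.
mongo --host mongos-host:27017 --eval \
  'db.getSiblingDB("admin").runCommand({ addshard: "replicaset/hostname:27018" })'

# Watch per-second operation counters on a shard member; the "command"
# column is where excess replSetGetStatus calls would show up.
mongostat --host shard-member:27018 5
```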
| Comments |
| Comment by Spencer Brody (Inactive) [ 15/Sep/11 ] |
|
Looks like you're hitting |
| Comment by Otto Bergström [ 15/Sep/11 ] |
|
Sorry, here is the rs.status output: |
| Comment by Spencer Brody (Inactive) [ 15/Sep/11 ] |
|
I don't see any calls to replSetGetStatus in the mongos logs - where were you seeing this? Can you also attach the mongod logs from the primary and secondary where you see the problem (if you have them)? |
| Comment by Spencer Brody (Inactive) [ 15/Sep/11 ] |
|
both rs.status.txt and printsharding.txt contain the same output from printShardingStatus(). Do you have the rs.status() output? |
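To avoid mixing the two outputs up, each command can be captured to its own file non-interactively, along these lines (hostnames and file paths are illustrative):

```shell
# Replica-set status, run against a member of the affected shard
mongo --host shard-member:27018 --eval 'printjson(rs.status())' > rs.status.txt

# Sharding metadata, run against a mongos
mongo --host mongos-host:27017 --eval 'printShardingStatus()' > printsharding.txt
```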
| Comment by Otto Bergström [ 15/Sep/11 ] |
|
I have now attached the information requested. Sorry for taking so long, but I had to wait until the problem appeared, and I did not want to force it since we are in production. |
| Comment by Otto Bergström [ 15/Sep/11 ] |
|
Attached are the outputs requested. The troublesome node is richcolldb06. |
| Comment by Spencer Brody (Inactive) [ 13/Sep/11 ] |
|
Can you please attach: |