[CSHARP-412] SlaveOk queries directed to RECOVERING instances Created: 19/Mar/12 Updated: 02/Apr/15 Resolved: 20/Mar/12 |
|
| Status: | Closed |
| Project: | C# Driver |
| Component/s: | None |
| Affects Version/s: | 1.3.1 |
| Fix Version/s: | 1.4 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Aristarkh Zagorodnikov | Assignee: | Robert Stam |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
When one of our secondaries is resynced, it appears that the C# driver sends slaveOk queries to it while it's still recovering. |
| Comments |
| Comment by Aristarkh Zagorodnikov [ 20/Mar/12 ] |
|
Good to know that you found the reason, looking forward to the fix =) |
| Comment by Robert Stam [ 20/Mar/12 ] |
|
Thanks so much for repeating the test and providing the additional information. It is very useful. In the reply to isMaster we see: { which we expect, because even though the new member is a secondary, the server has been coded to return { secondary : false } when a secondary is in recovering mode. So normally the C# driver would not have sent queries to this secondary until it was done with the initial sync, at which point the server would start returning { secondary : true }. But what's different in this scenario is that this secondary has also been configured as a passive secondary, which we can see from the { passive : true } in the response to isMaster. All drivers consider passive secondaries eligible to receive slaveOk queries, so my theory is that the C# driver is sending queries to this secondary not because of { secondary : true }, but because of { passive : true }. I've experimented with a local replica set and it looks like isMaster returns { secondary : true } even for passive members (but not until they are out of recovery mode), so it looks like all we need to do is remove the check for IsPassive from the driver when it is choosing which secondary to send a slaveOk query to, since passives will already be eligible due to { secondary : true }. I will try to find out whether earlier versions of the server returned { secondary : false } for passives, but I suspect they did not, and that it was just an invalid assumption on my part that the algorithm that distributes slaveOk queries to secondaries also had to check the IsPassive flag. |
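The eligibility change proposed in this comment can be sketched as follows. This is a minimal Python illustration, not the C# driver's actual code: the function names and the dictionary-shaped isMaster replies are hypothetical, and only the `secondary` and `passive` fields are taken from the discussion in this ticket.

```python
def eligible_before_fix(is_master_reply):
    """Theorised old rule: a member is eligible if it is a secondary OR passive."""
    secondary = bool(is_master_reply.get("secondary", False))
    passive = bool(is_master_reply.get("passive", False))
    return secondary or passive


def eligible_after_fix(is_master_reply):
    """Proposed rule: rely on the secondary flag alone and ignore passive."""
    return bool(is_master_reply.get("secondary", False))


# Hypothetical replies: a passive member during initial sync, and the same
# member once it has left recovery mode.
recovering_passive = {"secondary": False, "passive": True}
ready_passive = {"secondary": True, "passive": True}

for name, reply in [("recovering", recovering_passive), ("ready", ready_passive)]:
    print(name, eligible_before_fix(reply), eligible_after_fix(reply))
```

Under the old rule the recovering passive member is still selected because of `passive`; under the proposed rule it is skipped until the server starts reporting `secondary: true`.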
| Comment by Aristarkh Zagorodnikov [ 20/Mar/12 ] |
|
It appears that it's related to cursors: ). |
| Comment by Aristarkh Zagorodnikov [ 20/Mar/12 ] |
|
Yes, it repeats. |
| Comment by Robert Stam [ 19/Mar/12 ] |
|
Is this a test that you can repeat? I'm curious to know what that replica set member was sending back in response to the isMaster command while it was doing the initial sync. It looks like it must have been sending back { secondary: true } the whole time it was doing the initial sync. If that's the case, this might be considered a server bug. Alternatively, the driver could follow up the isMaster command with a second call to replSetGetStatus (which is the only way to get the actual status of the replica set member). That would be somewhat detrimental because it adds a second network round trip, although it only happens once every 10 seconds. I grepped the other drivers and none of them are calling replSetGetStatus. |
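The alternative mentioned here, following up isMaster with replSetGetStatus, might look roughly like the sketch below. It is an illustration only: `run_command` is a hypothetical callable standing in for whatever the driver uses to send a command to one member, and the code assumes the replSetGetStatus reply carries a top-level `myState` field in which 2 means SECONDARY.

```python
SECONDARY_STATE = 2  # assumed replica set state code for SECONDARY


def can_serve_slave_ok(run_command):
    """Return True only if both isMaster and replSetGetStatus agree the
    member is a usable secondary. `run_command` is a hypothetical helper
    that takes a command name and returns the reply as a dict."""
    is_master = run_command("isMaster")
    if not is_master.get("secondary", False):
        return False  # first check fails, so skip the second round trip
    status = run_command("replSetGetStatus")  # the extra round trip
    return status.get("myState") == SECONDARY_STATE


def make_fake_member(secondary, my_state):
    """Build a stand-in for a replica set member, for demonstration only."""
    replies = {
        "isMaster": {"secondary": secondary},
        "replSetGetStatus": {"myState": my_state},
    }
    return lambda command: replies[command]


print(can_serve_slave_ok(make_fake_member(True, 3)))  # member reports RECOVERING
print(can_serve_slave_ok(make_fake_member(True, 2)))  # member reports SECONDARY
```

The cost noted in the comment is visible here: every poll of a member that claims to be a secondary issues two commands instead of one.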
| Comment by Aristarkh Zagorodnikov [ 19/Mar/12 ] |
|
I'll give you an extract of the log to give you proper impressions of timings. Mon Mar 19 21:17:45 [initandlisten] journal dir=/mnt/disk3/mongodb/journal ... }, $orderby: { t: -1 } } |
| Comment by Robert Stam [ 19/Mar/12 ] |
|
When a secondary is in recovering state it should report { secondary: false } in the response to isMaster (see Is this just a timing issue? The members of the replica set are polled once every 10 seconds, so it might take the driver a few seconds to detect that a secondary has gone into recovering mode. Would that explain what you are seeing? |