[SERVER-53812] replsettest.awaitReplication does not work with keyfile authentication Created: 14/Jan/21 Updated: 29/Oct/23 Resolved: 18/Feb/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Testing Infrastructure |
| Affects Version/s: | None |
| Fix Version/s: | 4.9.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Mark Benvenuto | Assignee: | Xuerui Fa |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Steps To Reproduce: |
| Sprint: | Repl 2021-02-08, Repl 2021-02-22 |
| Participants: |
| Linked BF Score: | 20 |
| Description |
|
replsettest.awaitReplication() does not work when auth is enabled and keyfile authentication is in use. It also does not work with clusterAuthMode=x509, and replsettest.stopSet() fails as well. Example:
|
| Comments |
| Comment by Githook User [ 18/Feb/21 ] |
|
Author: {'name': 'XueruiFa', 'email': 'xuerui.fa@mongodb.com', 'username': 'XueruiFa'} Message: |
| Comment by Xuerui Fa [ 12/Feb/21 ] |
|
For 1 and 2, I believe this issue has existed for a long time, and the failure appears to be consistent. I tried running Mark's repro on a compiled version of master from a few months ago, and it still failed. I believe Mark discovered this bug while he was working on another BF fix; prior to that, we probably didn't have any tests that covered this exact scenario. I think the minimum fix would be adding asCluster() to each command in awaitReplication(). After adding that, however, I found that this error appears if we authenticate first:
As a result, I also modified asCluster to first check whether the given connections are already authenticated. This seems to have resolved the issue. I'm currently running an evergreen patch to verify this. |
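The guard described above can be sketched roughly as follows. This is a minimal illustration and not the actual ReplSetTest code: `FakeConn`, its `authenticated` flag, and the choice to make a second login throw are all hypothetical stand-ins, and the `asCluster` here only mirrors the behaviour described in this comment (log in if needed, run the command, log out only what it logged in).

```javascript
// Hypothetical stand-in for a shell connection; NOT the real Mongo object.
class FakeConn {
    constructor() {
        this.authenticated = false;
        this.log = [];
    }
    login() {
        // Assumption for illustration only: a second login is an error.
        if (this.authenticated) {
            throw new Error("already authenticated");
        }
        this.authenticated = true;
        this.log.push("login");
    }
    logout() {
        this.authenticated = false;
        this.log.push("logout");
    }
    runCommand(cmd) {
        if (!this.authenticated) {
            throw new Error("unauthorized: " + cmd);
        }
        this.log.push("run:" + cmd);
        return {ok: 1};
    }
}

// Sketch of an asCluster-style wrapper that skips connections which are
// already authenticated, and logs out only the ones it logged in itself.
function asCluster(conns, fn) {
    const list = Array.isArray(conns) ? conns : [conns];
    const loggedInHere = [];
    try {
        for (const conn of list) {
            if (!conn.authenticated) {
                conn.login();
                loggedInHere.push(conn);
            }
        }
        return fn();
    } finally {
        for (const conn of loggedInHere) {
            conn.logout();
        }
    }
}
```

With this shape, a test that has already authenticated its connection keeps its session after the wrapped command returns, while an unauthenticated connection is logged in and out around the command.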
| Comment by Steven Vannelli [ 25/Jan/21 ] |
|
Next steps:
|
| Comment by Xuerui Fa [ 22/Jan/21 ] |
|
In ReplSetTest, we only maintain one connection to each node. For auth tests, this connection has to be authenticated so that the node accepts our commands. We currently authenticate on a command-by-command basis through the asCluster function, which signs us in as a user, runs the command, then logs out. This means that for auth tests, we would have to ensure that each command is correctly authenticated. This seems overly complicated, expensive, and difficult to maintain. The idea proposed in SERVER-14017 seems more worthwhile for resolving this problem: we can maintain two connections, one that is authenticated for control operations like awaitReplication(), and another that will handle test operations. Marking this as "Needs Scheduling" for further discussion in Triage. |
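To make the two-connection idea concrete, here is a rough sketch under stated assumptions. `StubConn`, `NodeHandle`, `controlConn`, `testConn`, and `awaitReplicationSketch` are hypothetical names invented for illustration; none of them come from ReplSetTest or from SERVER-14017, and the use of `replSetGetStatus` is only a placeholder for whatever commands the harness actually issues.

```javascript
// Hypothetical stub connection; the real shell connection object differs.
class StubConn {
    constructor(name) {
        this.name = name;
        this.authenticated = false;
    }
    login() {
        this.authenticated = true;
    }
    logout() {
        this.authenticated = false;
    }
    runCommand(cmd) {
        if (!this.authenticated) {
            return {ok: 0, errmsg: "command " + cmd + " requires authentication"};
        }
        return {ok: 1};
    }
}

// Hypothetical per-node holder with two separate connections.
class NodeHandle {
    constructor(host) {
        // Control connection: the harness logs in once and keeps the session.
        this.controlConn = new StubConn(host + "/control");
        this.controlConn.login();
        // Test connection: left entirely to the test, which may log in and out.
        this.testConn = new StubConn(host + "/test");
    }
}

// Control operations use only controlConn, so they do not depend on
// whatever auth state the test leaves on testConn.
function awaitReplicationSketch(nodes) {
    return nodes.every(
        (node) => node.controlConn.runCommand("replSetGetStatus").ok === 1);
}
```

The design point is that the harness no longer has to wrap every control command in a login/logout pair, and the test's own auth state changes cannot break control operations.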
| Comment by Steven Vannelli [ 21/Jan/21 ] |
|
xuerui.fa assigning this to you for BF Friday. Please try to work closely with the Security team on this. |