Cross-posting from BF-16430
After doing some digging with spencer.jackson, we believe BF-16430 stems from a race between kdestroy and kinit, which are called from two different tests at around the same time. One test runs kinit to generate a ticket, and before its credentials cache is read, the other test destroys the cache. We had put a mechanism in place to prevent this, but it is not working as intended.
The idea was to use the job's data dir to hold the credentials cache, but it seems the jobs are all using the same data dir. For example, notice how in the following logs:
- kerberos_tool.js is defined as job 1
- ldap_authz_authn.js is defined as job 3
- KRB5CCNAME=DIR:/data/db/job2/mongorunner/krb5
- KRB5CCNAME=DIR:/data/db/job2/mongorunner/krb5
Both of these tests place their credentials cache in the job2 data dir, even though neither of them is job 2. We believe we have isolated the reason for this. On this line in jstest.py, we make a shallow copy of the shell options dictionary. Later on, we modify the nested dictionary that holds some of the values that will be passed to the shell to eval. Because the copy is only shallow, that nested dictionary is not thread-local: every job's copy refers to the same object in memory. We believe this issue can be fixed by using a deepcopy instead of a shallow copy.
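To illustrate the suspected mechanism, here is a minimal standalone sketch. It is not the actual jstest.py code: the option keys (`global_vars`, `TestData`, `dataDir`) and the helper names are assumptions chosen for illustration, and the jobs run sequentially here rather than in threads. It only shows that a nested dictionary stays shared after `copy.copy` and becomes independent after `copy.deepcopy`.

```python
import copy


def make_shared_options():
    # Hypothetical stand-in for the shell options dictionary; the key
    # names are illustrative, not necessarily the ones resmoke uses.
    return {"global_vars": {"TestData": {"dataDir": "/data/db"}}}


def configure_job(options, job_num, copier):
    # Each job takes its "own" copy and then edits the nested dictionary.
    local = copier(options)
    local["global_vars"]["TestData"]["dataDir"] = "/data/db/job%d" % job_num
    return local


def data_dirs(copier, num_jobs=3):
    shared = make_shared_options()
    jobs = [configure_job(shared, n, copier) for n in range(num_jobs)]
    return [job["global_vars"]["TestData"]["dataDir"] for job in jobs]


# Shallow copy: the nested dict is shared, so the last writer wins for every job.
shallow_dirs = data_dirs(copy.copy)

# Deep copy: each job gets its own nested dict and keeps its own data dir.
deep_dirs = data_dirs(copy.deepcopy)

print(shallow_dirs)
print(deep_dirs)
```

With the shallow copy every job ends up reporting whichever data dir was written last; with the deep copy each job keeps its own.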
As a proof of concept, I have uploaded this patch
- related to SERVER-46837 Add tracing around keytab check in mongokerberos test (Closed)