[SERVER-25640] Have ReplSetTest run checkDBHashes() in stopSet() Created: 16/Aug/16 Updated: 10/Jun/19 Resolved: 21/Feb/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Testing Infrastructure |
| Affects Version/s: | None |
| Fix Version/s: | 3.6.5, 3.7.3 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | DO NOT USE - Backlog - Test Infrastructure Group (TIG) | Assignee: | Kevin Albertson |
| Resolution: | Done | Votes: | 2 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||
| Backwards Compatibility: | Minor Change | ||||||||||||||||||||||||||||||||||||||||
| Backport Requested: |
v3.6
|
||||||||||||||||||||||||||||||||||||||||
| Sprint: | TIG 2016-09-19, TIG 2016-10-10, TIG 2016-10-31, TIG 2016-11-21, TIG 2018-1-15, TIG 2018-1-1, TIG 2018-1-29, TIG 2018-02-12 | ||||||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||||||
| Linked BF Score: | 50 | ||||||||||||||||||||||||||||||||||||||||
| Description |
|
The checkDBHashes() function is run on replica sets created by resmoke.py fixtures but not on the replica sets started by the ReplSetTest fixture. The checkDBHashes() hook should be run any time a replica set is stopped. |
| Comments |
| Comment by Githook User [ 26/Apr/18 ] | |||||||||||||||||||||||||||
|
Author: {'email': 'kyle.suarez@mongodb.com', 'username': 'ksuarz', 'name': 'Kyle Suarez'}Message: | |||||||||||||||||||||||||||
| Comment by Githook User [ 26/Apr/18 ] | |||||||||||||||||||||||||||
|
Author: {'email': 'kyle.suarez@mongodb.com', 'username': 'ksuarz', 'name': 'Kyle Suarez'}Message: | |||||||||||||||||||||||||||
| Comment by Githook User [ 25/Apr/18 ] | |||||||||||||||||||||||||||
|
Author: {'email': 'kevin.albertson@10gen.com', 'username': 'kevinAlbs', 'name': 'Kevin Albertson'}Message: (cherry picked from commit 8821cdb5508f8a26c025b3e0124903e9ae64c479) | |||||||||||||||||||||||||||
| Comment by Githook User [ 25/Apr/18 ] | |||||||||||||||||||||||||||
|
Author: {'email': 'max.hirschhorn@mongodb.com', 'username': 'visemet', 'name': 'Max Hirschhorn'}Message: | |||||||||||||||||||||||||||
| Comment by Githook User [ 25/Apr/18 ] | |||||||||||||||||||||||||||
|
Author: {'email': 'kevin.albertson@10gen.com', 'username': 'kevinAlbs', 'name': 'Kevin Albertson'}Message: (cherry picked from commit 7f5f161abc0355b71823be7b768075547f30a0cb) | |||||||||||||||||||||||||||
| Comment by Githook User [ 25/Apr/18 ] | |||||||||||||||||||||||||||
|
Author: {'email': 'kevin.albertson@10gen.com', 'username': 'kevinAlbs', 'name': 'Kevin Albertson'}Message: (cherry picked from commit c556e377094792e7253a95cb5fedcd703a99bf2c) | |||||||||||||||||||||||||||
| Comment by Githook User [ 25/Apr/18 ] | |||||||||||||||||||||||||||
|
Author: {'email': 'kevin.albertson@10gen.com', 'username': 'kevinAlbs', 'name': 'Kevin Albertson'}Message: (cherry picked from commit bfc6cfe5fa474a48e35c6430858ae7a26a802e7b) | |||||||||||||||||||||||||||
| Comment by Githook User [ 25/Apr/18 ] | |||||||||||||||||||||||||||
|
Author: {'email': 'kevin.albertson@10gen.com', 'username': 'kevinAlbs', 'name': 'Kevin Albertson'}Message: (cherry picked from commit 608689a72c3a60853590b32f14326acf2d5e28f7) | |||||||||||||||||||||||||||
| Comment by Githook User [ 25/Apr/18 ] | |||||||||||||||||||||||||||
|
Author: {'email': 'kevin.albertson@10gen.com', 'username': 'kevinAlbs', 'name': 'Kevin Albertson'}Message: (cherry picked from commit 46ac85cef193ddcc5e741b27fb4481f7ab7f6b20) | |||||||||||||||||||||||||||
| Comment by Esha Maharishi (Inactive) [ 21/Feb/18 ] | |||||||||||||||||||||||||||
|
Done - filed | |||||||||||||||||||||||||||
| Comment by Max Hirschhorn [ 21/Feb/18 ] | |||||||||||||||||||||||||||
As an outcome of
esha.maharishi, I think your confusion is understandable. The goal of the message was to make it more obvious to the user what the remediation ought to be. Since that message isn't being surfaced clearly enough, we should change the logic in the mongo shell so that it is. I don't see a reason that the mongo shell must use cout for logging the "exiting with a failure due to unterminated processes" message, so we could replace it with a call to severe() instead (and prefix the log message with 'F'). Do you think that would be sufficient for your purposes? Would you mind filing a new SERVER ticket for this improvement request?
Those messages are logged by two different processes (the mongo shell with the former and resmoke.py with the latter) so that isn't really something we'd consider. A related feature in resmoke.py would be to have special handling around certain exit codes from known processes. This case in the mongo shell would be one, but a memory leak detected by ASan/LSan would be another. | |||||||||||||||||||||||||||
| Comment by Esha Maharishi (Inactive) [ 21/Feb/18 ] | |||||||||||||||||||||||||||
|
For example, even just moving the "a call to MongoRunner.stopMongod(), ReplSetTest#stopSet(), or ShardingTest#stop() may be missing from the test" just before/after the "Summary: 1 test(s) ran in 35.86 seconds (0 succeeded, 0 were skipped, 1 failed, 0 errored)" could help. | |||||||||||||||||||||||||||
| Comment by Esha Maharishi (Inactive) [ 21/Feb/18 ] | |||||||||||||||||||||||||||
|
max.hirschhorn kevin.albertson, I noticed that failing to shut down a ShardingTest/ReplSetTest doesn't cause the test to log a "failed to load" line or a javascript stack trace (which makes sense, since which line would you error on?). The line that is logged ("a call to MongoRunner.stopMongod(), ReplSetTest#stopSet(), or ShardingTest#stop() may be missing from the test") also isn't/can't be logged at LogSeverity::Error, since it's not logged by a server process (and which makes the log line contain " E ", which is another thing I typically look for when a test fails without "failed to load"). It took some confusion and additional scrolling through the logs for me to realize why my new test was reporting failure when it seemed like the test ran to completion successfully. Just a thought, in case there's something that can be done to make this failure easier to detect. Snippet from my logs:
| |||||||||||||||||||||||||||
| Comment by Githook User [ 20/Feb/18 ] | |||||||||||||||||||||||||||
|
Author: {'email': 'kevin.albertson@10gen.com', 'name': 'Kevin Albertson', 'username': 'kevinAlbs'}Message: | |||||||||||||||||||||||||||
| Comment by Githook User [ 20/Feb/18 ] | |||||||||||||||||||||||||||
|
Author: {'email': 'kevin.albertson@10gen.com', 'name': 'Kevin Albertson', 'username': 'kevinAlbs'}Message: | |||||||||||||||||||||||||||
| Comment by Githook User [ 08/Feb/18 ] | |||||||||||||||||||||||||||
|
Author: {'email': 'kevin.albertson@10gen.com', 'name': 'Kevin Albertson', 'username': 'kevinAlbs'}Message: | |||||||||||||||||||||||||||
| Comment by Githook User [ 06/Feb/18 ] | |||||||||||||||||||||||||||
|
Author: {'email': 'kevin.albertson@10gen.com', 'name': 'Kevin Albertson', 'username': 'kevinAlbs'}Message: | |||||||||||||||||||||||||||
| Comment by Githook User [ 22/Jan/18 ] | |||||||||||||||||||||||||||
|
Author: {'name': 'Kevin Albertson', 'email': 'kevin.albertson@10gen.com', 'username': 'kevinAlbs'}Message: | |||||||||||||||||||||||||||
| Comment by Githook User [ 10/Jan/18 ] | |||||||||||||||||||||||||||
|
Author: {'email': 'kevin.albertson@10gen.com', 'name': 'Kevin Albertson', 'username': 'kevinAlbs'}Message: | |||||||||||||||||||||||||||
| Comment by Judah Schvimer [ 12/Dec/17 ] | |||||||||||||||||||||||||||
|
ShardingTest calls stopSet so it should just happen automatically. In addition to adding a check to stopSet we should also add calls to stopSet in many of the tests that do not explicitly call it. | |||||||||||||||||||||||||||
| Comment by Kevin Albertson [ 12/Dec/17 ] | |||||||||||||||||||||||||||
|
After | |||||||||||||||||||||||||||
| Comment by Scott Hernandez (Inactive) [ 16/Aug/16 ] | |||||||||||||||||||||||||||
|
Also, this no longer a hook, but simply part of the replSetTest class. | |||||||||||||||||||||||||||
| Comment by Eric Milkie [ 16/Aug/16 ] | |||||||||||||||||||||||||||
|
There's probably a bunch of places where we don't want to check the hashes at shutdown time, because inconsistencies are expected. This could happen, for example, with a test with a partitioned set using mongobridge that never gets healed before shutdown. Therefore, there should be a way to disable the hook if necessary. |