- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
None
We are hitting the default 90-minute no-output timer. The test is configured to run for 2 hours after the load phase and doesn't output anything during that time. Here is a local run that I did, with the timing call:
[info ] [genny.curator ] Moved existing metrics (presumably from a prior run). cwd=/mnt/data0/sulabh/work/genny existing=build/CedarMetrics moved_to=build/CedarMetrics-2021-04-19T030342Z-0f94120c timestamp=2021-04-19T03:03:42Z
[info ] [genny.curator ] Starting poplar grpc in the background. command=['/mnt/data0/sulabh/work/genny/build/curator/curator', 'poplar', 'grpc'] cwd=/mnt/data0/sulabh/work/genny timestamp=2021-04-19T03:03:42Z
[curator] 2021/04/19 13:03:42 [p=info]: starting poplar gRPC service at 'localhost:2288'
[2021-04-19 13:03:43.177893] [0x00007f8ea4988e80] [info] Constructing pool with MongoURI 'mongodb://localhost:27017/?appName=Genny&maxPoolSize=5000'
[2021-04-19 13:06:56.739428] [0x00007f8bb8c10700] [info] Done with load phase. All documents loaded
[2021-04-19 13:06:58.860785] [0x00007f8bb640b700] [info] Done with load phase. All documents loaded
[2021-04-19 13:07:00.019287] [0x00007f8bb840f700] [info] Done with load phase. All documents loaded
[2021-04-19 13:07:00.082431] [0x00007f8bb740d700] [info] Done with load phase. All documents loaded
[2021-04-19 13:07:00.086791] [0x00007f8bb5c0a700] [info] Done with load phase. All documents loaded
[2021-04-19 13:07:01.033338] [0x00007f8bb6c0c700] [info] Done with load phase. All documents loaded
[2021-04-19 13:07:01.034543] [0x00007f8bba413700] [info] Done with load phase. All documents loaded
[2021-04-19 13:07:01.035044] [0x00007f8bb9411700] [info] Done with load phase. All documents loaded
[2021-04-19 13:07:01.558331] [0x00007f8bb7c0e700] [info] Done with load phase. All documents loaded
[2021-04-19 13:07:07.537337] [0x00007f8bb9c12700] [info] Done with load phase. All documents loaded
[curator] 2021/04/19 15:30:59 [p=info]: poplar rpc service terminated

real 147m17.612s
user 384m15.664s
sys 134m51.575s
The load phase ran from 13:03:42 to 13:07:07, i.e. about 3.5 mins (205 secs). For the remaining 147 - 3.5 = 143.5 mins or so, the test ran without producing any output. That run time looks correct to me: the configured 2 hours, plus some extra time for the last scanner to finish.
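The arithmetic above can be re-checked from the log timestamps. This is only a quick sketch: the timestamps are copied from the log and truncated to whole seconds, and the names `minutes_between` and `NO_OUTPUT_LIMIT_MIN` are made up for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"
NO_OUTPUT_LIMIT_MIN = 90  # the default no-output timer mentioned in this ticket


def minutes_between(start: str, end: str) -> float:
    """Return the elapsed time between two log timestamps, in minutes."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60


# Timestamps taken from the log above, truncated to whole seconds.
load_start = "2021-04-19 13:03:42"  # poplar gRPC service started
load_end = "2021-04-19 13:07:07"    # last "Done with load phase" line
run_end = "2021-04-19 15:30:59"     # poplar rpc service terminated

load_min = minutes_between(load_start, load_end)
silent_min = minutes_between(load_end, run_end)

print(f"load phase:   {load_min:.1f} min")
print(f"silent phase: {silent_min:.1f} min")
print(f"exceeds no-output limit: {silent_min > NO_OUTPUT_LIMIT_MIN}")
```

The silent stretch measured this way is a little under 144 mins, in line with the rough 143.5 mins estimate, and well past the 90-minute limit.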
Either we should make the test verbose so that it outputs some messages during the run, or raise the no-output timer to, say, 3 hours to be safe.
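For the first option, the general idea is a periodic "still running" line emitted while the long phase executes. The sketch below is a generic Python illustration of that pattern, not Genny code: `run_with_heartbeat`, its parameters, and the message text are all hypothetical, and how this would actually be wired into the workload is left open.

```python
import threading
import time


def run_with_heartbeat(task, interval_secs, emit=print):
    """Run task() while emitting a progress line every interval_secs seconds."""
    done = threading.Event()
    started = time.monotonic()

    def beat():
        # Event.wait returns False on timeout and True once the event is set,
        # so the loop ends as soon as the task has finished.
        while not done.wait(interval_secs):
            elapsed = time.monotonic() - started
            emit(f"still running, {elapsed:.0f}s elapsed")

    heartbeat = threading.Thread(target=beat, daemon=True)
    heartbeat.start()
    try:
        return task()
    finally:
        done.set()
        heartbeat.join()


if __name__ == "__main__":
    # Tiny demo: a 0.3 s "test" with a heartbeat every 0.1 s.
    run_with_heartbeat(lambda: time.sleep(0.3), interval_secs=0.1)
```

With an interval comfortably below the no-output limit (for example every few minutes), a 2-hour phase would never go silent long enough to trip the timer.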
cc. alex.cameron