[SERVER-72978] Random high query response and very high CPU on running rs combination with 4.2.20 and 4.4.18 versions Created: 18/Jan/23 Updated: 17/Mar/23 |
|
| Status: | Investigating |
| Project: | Core Server |
| Component/s: | Performance, Querying, WiredTiger |
| Affects Version/s: | 4.2.20, 4.4.18 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | KAPIL GUPTA | Assignee: | Chris Kelly |
| Resolution: | Unresolved | Votes: | 25 |
| Labels: | Bug | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
java client driver on client side - 3.12.9 |
||
| Issue Links: |
|
||||||||
| Operating System: | ALL | ||||||||
| Steps To Reproduce: | 1. Run a replica set with 2 data members on 4.2.20 and 2 data members on 4.4.18. 2. Apply medium to high load (reads/writes) to the replica set and observe. |
||||||||
| Participants: | |||||||||
| Description |
|
Scenario: We ran tests with the same load on 2 MongoDB versions (4.2.20, 4.4.20).
Note: Please provide a cloud link for uploading the respective logs. |
| Comments |
| Comment by KAPIL GUPTA [ 17/Mar/23 ] |
|
Gentle Reminder 2!
Regards, Kapil |
| Comment by KAPIL GUPTA [ 03/Mar/23 ] |
|
Hi Chris, Gentle Reminder! Did you get anything on this?
Regards, Kapil |
| Comment by KAPIL GUPTA [ 23/Feb/23 ] |
|
Hi Chris, I have uploaded the logs. Details of the log files are given below.
Parent file name: Logs.zip
Child files:
cps@vpas-B-persistence-db-9:~$ tar -tf PrimaryLogs.tar.gz
NOTE: High CPU is seen multiple times (almost every 2 minutes) from 22.02.23 12:45:00 to 22.02.23 13:45:00 (covered in mongo-27040.log and mongo-27040.log.1). I also enabled log verbosity from 2023-02-22T13:09:21.295+0000 (covered in mongo-27040.log_verbose) to get more logs.
cps@vpas-B-persistence-db-10:~$ tar -tf Secondary4.2Logs.tar.gz
cps@vpas-A-persistence-db-10:~$ sudo tar -tf Secondary4.4Logs.tar.gz
cps@vpas-A-persistence-db-9:~$ sudo tar -tf Secondary_2_4.4Logs.tar.gz
Thanks, Kapil |
| Comment by Chris Kelly [ 22/Feb/23 ] |
|
Hi Kapil, I've created a new secure upload portal link for you. Christopher |
| Comment by KAPIL GUPTA [ 22/Feb/23 ] |
|
Hi Chris, Sorry for the delay; we lost our previous logs. We have now recreated the issue, but unfortunately the upload link has expired. As per your suggestion, we also checked that scenario (having lower-version secondaries), but here the setup is [1 primary and 1 secondary on 4.2 and 2 secondaries on 4.4]. We also tested the same with ( Each time we got the high CPU issue on the 4.2 primary member only. Kindly provide a new upload link so we can upload the logs.
Thanks,
|
| Comment by Chris Kelly [ 08/Feb/23 ] |
|
We still need additional information to diagnose the problem. If this is still an issue for you, would you please supply the requested information if possible? Also, just as an FYI, it seems like you are pointing out that you are observing latency on a higher version primary node, when you have lower version secondaries. Per the upgrade steps, you should be upgrading these secondaries first to the higher version, and then finishing your upgrade by stepping down the primary (which should be the last one to be upgraded). Effectively, this sounds like it should mitigate your problem. If it does not, please provide further detail regarding this process (and the requested information above). Thanks! |
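The ordering described in the comment above (secondaries first, the primary last after a step-down) can be sketched as a small planning helper. This is only an illustration, not an official procedure: it assumes the input is a list of member documents shaped like the `members` array of `rs.status()`, using its `name` and `stateStr` fields, and the host names are hypothetical.

```python
from typing import Dict, List


def rolling_upgrade_order(members: List[Dict[str, str]]) -> List[str]:
    """Return member names in rolling-upgrade order: secondaries, then the primary.

    Members in any other state (for example arbiters) are ignored by this sketch.
    """
    secondaries = [m["name"] for m in members if m["stateStr"] == "SECONDARY"]
    primaries = [m["name"] for m in members if m["stateStr"] == "PRIMARY"]
    if len(primaries) != 1:
        raise ValueError("expected exactly one PRIMARY, found %d" % len(primaries))
    # The primary goes last: it is stepped down and upgraded only after
    # every secondary is already running the newer version.
    return secondaries + primaries


# Hypothetical four-member replica set (host names are placeholders).
example_members = [
    {"name": "db-a:27040", "stateStr": "PRIMARY"},
    {"name": "db-b:27040", "stateStr": "SECONDARY"},
    {"name": "db-c:27040", "stateStr": "SECONDARY"},
    {"name": "db-d:27040", "stateStr": "SECONDARY"},
]
print(rolling_upgrade_order(example_members))
```

The helper only computes the order; stepping down the primary and upgrading binaries remain manual steps per the upgrade documentation.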
| Comment by Chris Kelly [ 19/Jan/23 ] |
|
Hi Kapil, I've created a secure upload portal for you. Files uploaded to this portal are hosted on Box, are visible only to MongoDB employees, and are routinely deleted after some time. For each node in the replica set spanning a time period that includes the incident, would you please archive (tar or zip) and upload to that link:
Additionally, if you have a test driver that reproduces the workload you're having issues with, that would be significantly helpful in pinning down what may be occurring here. Christopher |
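For the per-node archiving requested in the comment above, a minimal sketch using Python's standard `tarfile` module might look like the following. The function name `archive_node_logs` and any file names passed to it are hypothetical; which files to include is whatever the request specifies for each node.

```python
import os
import tarfile
from typing import Iterable, List


def archive_node_logs(paths: Iterable[str], archive_path: str) -> List[str]:
    """Pack the given files into a gzip-compressed tar archive.

    Files are stored under their base names, and the stored names are returned.
    """
    stored = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in paths:
            name = os.path.basename(path)
            # arcname drops the directory part so the archive stays flat.
            tar.add(path, arcname=name)
            stored.append(name)
    return stored
```

A call such as `archive_node_logs(["mongo-27040.log", "mongo-27040.log.1"], "PrimaryLogs.tar.gz")` would produce one archive per node, which can then be listed with `tar -tf` before uploading.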