[SERVER-29811] extremely slow reads from secondaries after drop of unused indexes Created: 23/Jun/17  Updated: 23/Jun/17  Resolved: 23/Jun/17

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.2.12
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Kay Agahd Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File diagnostic.data.tgz     File mongo-077.log.tgz     JPEG File responseTimesAfterIndexDrop.jpeg    
Issue Links:
Duplicate
duplicates WT-3207 Drops with checkpoint_wait=false shou... Closed
duplicates WT-3362 Cursor opens should never block for t... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

After dropping some unused indexes through mongos, reads (using an index) from the secondaries were more than 100 times slower than before (over 3000 ms). Reads from the primaries were still as fast as usual (1-5 ms). Since the issue persisted for over 24 hours, we restarted all secondaries, which resolved it immediately.
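For reference, the drops were issued through mongos roughly as in the sketch below; the database, collection and index names are placeholders, not the ones we actually used.

// Minimal sketch of the drop, run in the mongo shell against mongos.
// "shop", "offers" and "unusedField_1" are placeholder names.
use shop
db.offers.getIndexes()                 // review the existing indexes first
db.offers.dropIndex("unusedField_1")   // drop one unused index by name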
Please see the attached screenshot of our monitored response times. We dropped the unused indexes at 09:30 and restarted the secondaries at 12:00 the next day. You can see that the response times went up dramatically during this period.

I am also attaching the log file from one of our secondaries, called mongo-077, where you can see that queries on the field "offerId" became very slow after we dropped the unused indexes. Our slowMs value is set to 3000 ms (instead of the default 100 ms), which is why you won't find queries on "offerId" in the log from before the drop: they were always faster than 3000 ms.
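For reference, a minimal sketch of how that threshold is set; the value 3000 matches our setting, everything else is illustrative and may differ from our actual configuration.

// Run directly on a mongod: keep profiling off (level 0) but raise the
// threshold (in milliseconds) above which operations are written to the log.
db.setProfilingLevel(0, 3000)
// Equivalent mongod.conf setting:
//   operationProfiling:
//     slowOpThresholdMs: 3000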

MongoDB version:

2017-06-22T06:38:11.431+0200 I CONTROL  [signalProcessingThread] pid=38407 port=27017 64-bit host=mongo-007
2017-06-22T06:38:11.431+0200 I CONTROL  [signalProcessingThread] db version v3.2.12
2017-06-22T06:38:11.431+0200 I CONTROL  [signalProcessingThread] git version: ef3e1bc78e997f0d9f22f45aeb1d8e3b6ac14a14
2017-06-22T06:38:11.431+0200 I CONTROL  [signalProcessingThread] OpenSSL version: OpenSSL 1.0.1t  3 May 2016
2017-06-22T06:38:11.431+0200 I CONTROL  [signalProcessingThread] allocator: tcmalloc
2017-06-22T06:38:11.431+0200 I CONTROL  [signalProcessingThread] modules: none
2017-06-22T06:38:11.431+0200 I CONTROL  [signalProcessingThread] build environment:
2017-06-22T06:38:11.432+0200 I CONTROL  [signalProcessingThread]     distmod: debian71
2017-06-22T06:38:11.432+0200 I CONTROL  [signalProcessingThread]     distarch: x86_64
2017-06-22T06:38:11.432+0200 I CONTROL  [signalProcessingThread]     target_arch: x86_64



 Comments   
Comment by Kay Agahd [ 23/Jun/17 ]

Thanks Bruce for sorting this out.

Comment by Bruce Lucas (Inactive) [ 23/Jun/17 ]

Thanks for uploading the data, and thanks for the bug report.

Based on the uploaded data, it appears you are encountering WT-3207 and/or WT-3362. When there is a drop operation, those issues can cause operations to block until the end of a checkpoint, and comparing the timing of the slow queries in the log with the timing of the checkpoints in the diagnostic data shows exactly that.
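To illustrate, checkpoint activity on an affected secondary can be checked from the shell; a minimal sketch, assuming these wiredTiger statistics are exposed by your build (the exact field names can vary between versions):

// Run on the affected secondary's mongod, not through mongos.
var wt = db.serverStatus().wiredTiger.transaction
wt["transaction checkpoint currently running"]        // 1 while a checkpoint is in progress
wt["transaction checkpoint most recent time (msecs)"] // duration of the last checkpoint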

WT-3207 is fixed in 3.4.4, and WT-3362 is scheduled for 3.4.6. I'll close this ticket as a duplicate; if you encounter this issue again after upgrading to 3.4.6, please open a new ticket.

Thanks,
Bruce

Comment by Kay Agahd [ 23/Jun/17 ]

I just uploaded it here. It worked.

Comment by Kay Agahd [ 23/Jun/17 ]

Sure, but even compressed it is 97 MB. Where can I upload it? Thanks! Btw, I misspelled the server name: it's mongo-007, not mongo-077, but it shouldn't matter for the bug hunt.

Comment by Bruce Lucas (Inactive) [ 23/Jun/17 ]

In order for us to continue diagnosing this issue, can you please archive the $dbpath/diagnostic.data directory from the mongo-077 node and attach it to this ticket?

Thanks,
Bruce
