[SERVER-40433] Mongodump extremely slow on 4.0.8 Created: 02/Apr/19  Updated: 20/Aug/19  Resolved: 20/Aug/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.0.8
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Paul Henderson Assignee: Danny Hatcher (Inactive)
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File 40433.png     PNG File SERVER-40433-3.6.6-vs-4.0.9-eviction.png     PNG File Screen Shot 2019-05-21 at 11.44.02 AM.png     PNG File Screen Shot 2019-05-21 at 11.44.10 AM.png     PNG File comparison.png     PNG File image-2019-06-20-17-14-07-249.png     PNG File server-40433-3.6.png     PNG File server-40433-4.09-metrics.png     PNG File server-40433-lock-issue.png    
Issue Links:
Problem/Incident
is caused by WT-4194 Improve fairness of eviction with mul... Closed
is caused by WT-4869 Stop adding cache pressure when evict... Closed
Related
related to WT-4758 Create a workload that bottlenecks on... Closed
related to WT-4878 Disable random dhandle selection and ... Closed
Operating System: ALL
Sprint: Storage Engines 2019-07-01
Participants:
Story Points: 3

 Description   

Since upgrading to mongo 4.0.8 from 3.4.6, our mongodumps have become extremely slow.

mongodump --archive=${DIR}.gz \
-h ${MONGO_REPL_SET_NAME}/${MONGO_SET} \
-u ${MONGO_USER} \
-p ${MONGO_PWD} \
--gzip \
--readPreference secondaryPreferred \
--ssl \
--sslCAFile mongodb-ca.crt \
--authenticationDatabase admin \
--${DB_OR_OPLOG} \
--forceTableScan 

is the command we run for backups. Since the upgrade, our backups are roughly 10x slower.
We tried moving to a new machine with more memory and saw no speed improvement. One CPU core sits at 100% while the rest of the cores have little to no activity, and iostat doesn't show the disk to be anywhere near saturated. We have also tried running mongodump with both the 3.6 and 4.0 CLI tools without any improvement.



 Comments   
Comment by Danny Hatcher (Inactive) [ 20/Aug/19 ]

Ah I apologize for the confusion! Thanks for letting us know and for your patience while we fixed the problem.

Comment by Devon Yang [ 19/Aug/19 ]

Sorry if it wasn't clear, I can confirm the issue is resolved for us in 4.0.12 on behalf of Paul.

Comment by Danny Hatcher (Inactive) [ 19/Aug/19 ]

paulblend, as Devon mentioned, we implemented the fix we were waiting for in 4.0.12. Could you please try upgrading a Secondary and see if performance is improved?

devony, I can't speak to SERVER-40773 as I haven't reviewed that case in depth. However, we do see value in removing idle cursors by default, especially since they currently have to be idle for roughly 28 hours before we attempt to clean them up. It is possible that a different setting would be better for your specific use case, but we would have to investigate that, which is outside the scope of this particular ticket.

Comment by Devon Yang [ 19/Aug/19 ]

Just wanted to let you know this is resolved in 4.0.12.

 

I have a question related to eviction that maybe can be answered quickly here. Is there a reason that the WT file_manager close_idle_time is still set to 100000, despite https://jira.mongodb.org/browse/WT-1930 having merged support for a 0 value (https://jira.mongodb.org/browse/SERVER-18286)? In https://jira.mongodb.org/browse/SERVER-40773 we were seeing large dhandle sweeps at this timeout, which we think might be mitigated if the 0 setting is safe to use.
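For context, the kind of override we have been considering looks roughly like this; it is illustrative only, since whether close_idle_time=0 is actually safe is exactly the open question, and the value below is arbitrary (the host/user values are placeholders, and any --ssl options would need to match the deployment):

# Illustrative only: override WiredTiger's file_manager close_idle_time at runtime
# via the wiredTigerEngineRuntimeConfig server parameter. The value is arbitrary,
# not a recommendation, and the change does not persist across restarts.
mongo --host <host> -u <user> -p <password> --authenticationDatabase admin \
  --eval 'db.adminCommand({
    setParameter: 1,
    wiredTigerEngineRuntimeConfig: "file_manager=(close_idle_time=600)"
  })'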

Comment by Danny Hatcher (Inactive) [ 26/Jun/19 ]

paulblend, I apologize for the confusion surrounding the status of this ticket. The workload that you've shown is concerning and I'd like to ensure that we've taken concrete steps to address it before closing this ticket. We've recently made some changes to the WiredTiger storage engine that may help alleviate the problem. I'm following up internally to see about getting those changes backported to a 4.0 build so that you can see if they help your issue.

Comment by Haribabu Kommi [ 24/Jun/19 ]

The problem is happening both before and after the mongodump. Due to the read-heavy load, the cache eviction server is not able to clean the cache, which leads to stalling.

Comment by Paul Henderson [ 20/Jun/19 ]

The mongo database has 271 dbs in it, all of which are hurt by the issue. I took db.stats() from 2 larger ones and 2 smaller ones.

Comment by Dmitry Agranat [ 20/Jun/19 ]

Hi devony,

Could you please provide the output of dbStats() for the relevant database? This will give us more information about how much data the mongodump is trying to access.
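If it helps, something along these lines will print db.stats() for every database on the node; this is only a rough sketch, and the connection options should mirror the ones from your mongodump command:

# Rough sketch: print db.stats() for every database on the node.
# <host>, <user> and <password> are placeholders; add the same --ssl options
# you pass to mongodump if the node requires TLS.
mongo --host <host> -u <user> -p <password> --authenticationDatabase admin --quiet \
  --eval 'db.adminCommand({listDatabases: 1}).databases.forEach(function(d) {
    printjson(db.getSiblingDB(d.name).stats());
  })'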

Thank you,
Dima

Comment by Dmitry Agranat [ 30/May/19 ]

Hi devony, thank you for uploading new data, we will analyze it and report back with next steps.

In the meantime, regarding your latest question:

I also want to double-check that there are no known side effects to running the primary and one secondary on 4.0.9 while another secondary is still on 3.6.6. Obviously we will not set compatibility mode to 4.0 until this is resolved.

As long as this is not a sharded cluster and compatibility mode has not been set to 4.0, there should not be any side effects from having one node on the 3.6 version until we sort this out.

Thank you,
Dima

Comment by Devon Yang [ 30/May/19 ]

Hi @Dima,

I've uploaded the relevant diagnostics files through the secure portal. The data from the affected timeframe should be in metrics.2019-05-29T18-55-39Z-00000

I also want to double-check that there are no known side effects to running the primary and one secondary on 4.0.9 while another secondary is still on 3.6.6. Obviously we will not set compatibility mode to 4.0 until this is resolved.

Comment by Dmitry Agranat [ 30/May/19 ]

Hi paulblend,

Thank you for the update. Could you please upload the diagnostic data covering the latest test you did after upgrading your node to 3.6.10? This information would help us to determine the next steps.

Leaving one node on 3.6 until we sort this out should not be an issue.

Thank you,
Dima

Comment by Paul Henderson [ 29/May/19 ]

Would there be any problems upgrading 2 nodes to 4.0.x but leaving one as 3.6.x while the backup issue is sorted out?

 

As for the above, I went with something like option 1: I upgraded our 3.6.6 machine to 3.6.10, and the backup problem came back. I downgraded the server back to 3.6.6 and the problem went away again.

Comment by Dmitry Agranat [ 28/May/19 ]

Hi paulblend,

We are still investigating this issue. Currently, the main suspects are SERVER-40925 and/or the change that was made in SERVER-38779. To confirm this is indeed the case, or to rule it out, there are two options we can try:

  • Option 1: Downgrade from 3.6.12 to 3.6.10 (since SERVER-38779 was introduced in 3.6.11) or
  • Option 2: Set the wiredTigerSessionCloseIdleTimeSecs parameter to 0 on your current version, 3.6.12 (see the example command after this list). The default for this parameter is 300 seconds (5 minutes).
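
A minimal sketch of what Option 2 could look like, assuming the parameter is settable at runtime on your build (the host/user values are placeholders; otherwise it can be set at mongod startup with --setParameter wiredTigerSessionCloseIdleTimeSecs=0):

# Illustrative only: set wiredTigerSessionCloseIdleTimeSecs to 0 at runtime.
# <host>, <user> and <password> are placeholders.
mongo --host <host> -u <user> -p <password> --authenticationDatabase admin \
  --eval 'db.adminCommand({setParameter: 1, wiredTigerSessionCloseIdleTimeSecs: 0})'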

Please choose one of the above options, whichever is more convenient for you.

Once you have repeated the same workload after the proposed change, please upload the new set of requested data into the secure upload portal.

Thank you,
Dima

Comment by Paul Henderson [ 16/May/19 ]

sent

Comment by Danny Hatcher (Inactive) [ 16/May/19 ]

Just 4.0.9 should be sufficient.

Comment by Paul Henderson [ 16/May/19 ]

Haven't had a chance; I will get to it today. Do you need the 3.6.12 versions again, or just the 4.0.9 ones?

Comment by Danny Hatcher (Inactive) [ 16/May/19 ]

Hello Paul,

Have you had a chance to send clean logs? Having the comparison would be extremely helpful.

Comment by Paul Henderson [ 09/May/19 ]

Actually disregard and delete those files if possible. I'll resend new logs in a bit. 

Comment by Paul Henderson [ 09/May/19 ]

Timestamps for when the backups were run against 4.0.9 were 10:45am-noon PST 5/9

Comment by Paul Henderson [ 09/May/19 ]

I uploaded backup logs and diagnostic data from 4.0.9 and re-uploaded the 3.6.6 and 3.6.12 files.

Comment by Paul Henderson [ 08/May/19 ]

I'll get around to that sometime in the next day or so and upload the files. 

Comment by Danny Hatcher (Inactive) [ 08/May/19 ]

Paul,

That would be a big help!

Danny

Comment by Paul Henderson [ 08/May/19 ]

Do you need me to reupgrade a node to 4.0 to test the backups there and get the logs/diagnostics data again?

Comment by Sulabh Mahajan [ 07/May/19 ]

I had a quick look at the FTDC data from 3.6.12-bad. Here are my observations:

  • From B - C, everything seems to be completely stalled (all the query metrics are 0, R/W global locks are blocked, the eviction server is not finding any pages to evict, and WiredTiger sees no activity at all).
  • Activity starts again, coinciding with obtaining the table lock and schema lock inside WiredTiger.
  • Possibly the eviction server is blocked on these locks and could not walk dhandles to find eviction candidates.

Further investigation is needed to figure out what operation caused this 10-second stall. I could not locate FTDC from 4.0; it would be worth looking at that to make sure 4.0 suffered from the same issue.

Comment by Danny Hatcher (Inactive) [ 30/Apr/19 ]

Hello Paul,

It does appear that this is an issue with eviction but I'm not sure yet why one node is worse. The info you've provided should be sufficient for me to dig into for now.

Thanks,

Danny

Comment by Paul Henderson [ 25/Apr/19 ]

Having issues getting exact logs. The bad runs happened between 12:30 and 13:30 PST yesterday, and the good runs happened after 13:30 PST. I'm having problems finding better timestamps. If you need them, I can resend files from today, this time with a mongo log, since I'm failing to provide better timestamps for yesterday's runs.

Comment by Danny Hatcher (Inactive) [ 25/Apr/19 ]

Hello Paul,

Thanks for the logs. Do you have timestamps for your runs that I can use to match with the logs and diagnostics? The output you provided in the server40433* files was very helpful before and would be great to have the equivalent here.

Danny

Comment by Paul Henderson [ 23/Apr/19 ]

I'll rerun later today or tomorrow; right now one is on 3.6.12 and the other is on 3.6.6.

 

Comment by Danny Hatcher (Inactive) [ 23/Apr/19 ]

Hello Paul,

Actually, if both of your secondaries are the same size, that is an ideal comparison. What versions are they on at this point? If one is pre-3.6.12 and one is after, could you run the tests again?

Danny

Comment by Paul Henderson [ 23/Apr/19 ]

Thanks for looking. A couple of wrinkles have come up.

The only reason we have a secondary that is so much larger is that the issue was occurring when all the machines were the same size. Increasing the secondary's size was kind of a Hail Mary, which I guess might have made things worse. We have since moved the large machine to be the primary, with two same-sized small machines as secondaries. I can decrease the wiredTigerCacheSize on the primary to test, but more likely we will take a few days to upgrade the small machines to the same size as the larger one so all caches will be set to the same large value.

It turns out we also have this same issue when upgrading from 3.6.6 to 3.6.12: our backups run fine on 3.6.6 but have the same slowness on 3.6.12 that we were seeing on 4.0.8.

Comment by Danny Hatcher (Inactive) [ 22/Apr/19 ]

Hello Paul,

Apologies for the delay. Looking into the situation for your 4.0 test, I can see that within 3 minutes of the mongodump starting, your WiredTiger cache reached 95% capacity. When the cache takes up that much space, WiredTiger starts using application threads, which would otherwise be servicing queries, to help evict pages. That dramatically slows down your server, which is why the 4.0 mongodump takes so long.

The question is why your cache rose from 80% (normal) to 95% (very high) so quickly on 4.0 but not on 3.6. I see that your 4.0 server is configured with twice the amount of RAM of your 3.6 server, and thus twice the WiredTiger cache size. We have seen eviction issues in the past when the available cache for a node is very large.

Would it be possible for you to set the wiredTigerCacheSize parameter for the larger secondary down to match the default value (61GB) of the other secondary? This may negatively impact the performance of your cluster if you are running a large number of secondary reads, so you should do this during a maintenance window or a period of low load.
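If you would like to watch where the cache sits while a dump is running, a quick check along these lines is usually enough; it is only a rough sketch, run against the secondary that mongodump is reading from, and the field names come from the WiredTiger cache section of serverStatus:

# Rough sketch: report WiredTiger cache fill while the dump runs.
# <secondary-host>, <user> and <password> are placeholders.
mongo --host <secondary-host> -u <user> -p <password> --authenticationDatabase admin --quiet \
  --eval 'var c = db.serverStatus().wiredTiger.cache;
    print("cache fill %: " + (100 * c["bytes currently in the cache"] /
                                    c["maximum bytes configured"]).toFixed(1));
    print("dirty fill %: " + (100 * c["tracked dirty bytes in the cache"] /
                                    c["maximum bytes configured"]).toFixed(1));'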

Thanks,

Danny

Comment by Paul Henderson [ 17/Apr/19 ]

Any updates?

Comment by Paul Henderson [ 05/Apr/19 ]

Let me know if there is anything else I can provide that would be helpful in debugging. 

Comment by Paul Henderson [ 02/Apr/19 ]

Thanks for looking. I uploaded the metrics files from diagnostic.data and logs for mongodb3 and mongodb4. I prefixed the files with server40433.

Comment by Danny Hatcher (Inactive) [ 02/Apr/19 ]

Hello Paul,

Thanks for your report. In order for us to diagnose this, please run the mongodump against both the 4.0 and the 3.6 instances and provide the resulting mongod logs plus the diagnostic.data folders under each node's dbpath. You can upload the files to our Secure Upload Portal. Please rest assured that only MongoDB engineers will be able to see the files, and they will be deleted automatically after 90 days.
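For example, something along these lines will package everything up for the portal; the paths below are illustrative, so substitute your actual dbpath and log locations:

# Illustrative only: substitute your actual dbpath and log locations.
tar czf server-40433-$(hostname)-diagnostics.tar.gz \
  /var/lib/mongodb/diagnostic.data \
  /var/log/mongodb/mongod.log*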

Thanks,

Danny

Comment by Paul Henderson [ 02/Apr/19 ]

Another note: only one of our mongo instances is on 4.0, whereas the others are on 3.6. When we point the backups at the instance on 4.0 they run slowly; repointing them at a secondary on 3.6 speeds them back up.
