[SERVER-32358] Rollback can use unbounded memory Created: 15/Dec/17  Updated: 06/Dec/22  Resolved: 15/Dec/17

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.4.10
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Bruce Lucas (Inactive) Assignee: Backlog - Replication Team
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-23392 Increase Replication Rollback (Data) ... Closed
Assigned Teams:
Replication
Operating System: ALL
Participants:

 Description   

During rollback, as we scan backwards through our oplog finding entries that must be rolled back, we save in-memory copies of the entire oplog entry here and here, if I'm reading the code correctly. If there is a lot of oplog to roll back, this could be a lot of memory, resulting in an OOM.
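The pattern described above can be sketched as follows. `OplogEntry`, `FixUpInfo`, and the scan loop are hypothetical stand-ins for the 3.4 rollback code, not the actual implementation; they only illustrate how memory grows linearly with the amount of oplog to roll back.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for a serialized BSON oplog entry; real entries
// can be large (up to the max document size).
struct OplogEntry {
    std::string raw;  // full serialized document
};

// Hypothetical analogue of the rollback bookkeeping structure.
struct FixUpInfo {
    std::vector<OplogEntry> toRollBack;  // grows with every divergent entry

    size_t bytes() const {
        size_t n = 0;
        for (const auto& e : toRollBack)
            n += e.raw.size();
        return n;
    }
};

// Scanning the local oplog backwards and keeping a full copy of each
// divergent entry: nothing bounds the size of fixUp.toRollBack, so memory
// use is proportional to the amount of oplog being rolled back.
FixUpInfo scanDivergentOplog(const std::deque<OplogEntry>& divergent) {
    FixUpInfo fixUp;
    for (auto it = divergent.rbegin(); it != divergent.rend(); ++it) {
        fixUp.toRollBack.push_back(*it);  // full in-memory copy, unbounded
    }
    return fixUp;
}
```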

This code appears to have changed completely in 3.6, and I think it may not have the same problem. If so, we can mark this issue fixed in 3.6 and/or dup it to the ticket that fixed the issue, using this ticket to track the issue and the fix.



 Comments   
Comment by Bruce Lucas (Inactive) [ 31/Dec/17 ]

The issue reported in this ticket is not related to memory usage by WT, but rather is memory usage by the rollback algorithm outside the WT cache. Cache pressure related to lagging majority commit point could cause performance issues, but should not cause excessive memory usage and OOM, which is the issue of concern here.

Comment by Andy Schwerin [ 31/Dec/17 ]

The WT resource usage increases as the node's local commit point gets far ahead of the majority commit point in 3.6 as well as 3.8.

Recover to timestamp is indeed a 3.8 feature.

Comment by Spencer Brody (Inactive) [ 15/Dec/17 ]

In 3.8, the further ahead of the replication majority commit point a node gets, the more WiredTiger cache pressure there will be, so before you get a large rollback you'll likely already be having problems. Outside of that, however, there shouldn't be any part of the new rollback that scales memory usage relative to the size of the rollback.

Comment by Bruce Lucas (Inactive) [ 15/Dec/17 ]

spencer, how much better, given Judah's comment above? Is there still a possibility of an OOM? If so, shouldn't we bound the amount of memory used and abort rather than OOM, in order to provide a useful error message?

Comment by Spencer Brody (Inactive) [ 15/Dec/17 ]

I don't think we're going to do anything here for our existing rollback algorithms. The situation should be better in the new Recover to Timestamp based rollback algorithm in 3.6.

Comment by Judah Schvimer [ 15/Dec/17 ]

We do constrain the amount of data here. I guess that constraint does not include the documents getting rolled back, just the refetched documents.

Comment by Bruce Lucas (Inactive) [ 15/Dec/17 ]

Since we generally place a couple of restrictions on the "amount" of rollback we do, it might be OK to track how much memory we are using and abort the rollback when we use "too much" (this could be a hard-coded limit, or computed from allocated vs. physical memory). This would be better than an OOM because it would make the cause more easily diagnosable and could point to the workaround of doing an initial sync.
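One way the suggestion above could look, sketched under the assumption of a hard-coded byte limit (the class name, limit value, and error message are illustrative, not from the server code):

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Illustrative hard-coded cap; as noted above, it could instead be
// computed from allocated vs. physical memory.
constexpr size_t kRollbackMemoryLimitBytes = 300u * 1024u * 1024u;

class RollbackMemoryBudget {
public:
    explicit RollbackMemoryBudget(size_t limit = kRollbackMemoryLimitBytes)
        : _limit(limit) {}

    // Charge the budget for one saved document. If the budget is exceeded,
    // abort the rollback with a diagnosable error (pointing at the
    // initial-sync workaround) instead of letting the process OOM.
    void charge(size_t bytes) {
        _used += bytes;
        if (_used > _limit) {
            throw std::runtime_error(
                "rollback exceeded memory limit (" + std::to_string(_used) +
                " > " + std::to_string(_limit) +
                " bytes); consider re-syncing this node via initial sync");
        }
    }

    size_t used() const { return _used; }

private:
    size_t _used = 0;
    size_t _limit;
};
```

The accounting is just a running byte count charged once per saved document, so it adds negligible overhead to the rollback scan itself.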

Comment by Judah Schvimer [ 15/Dec/17 ]

I believe this is still a problem here. It would be fairly straightforward to persist these documents to disk. We could also probably save only the _id and the sync source's copy of the document once we refetch it. These were all considered out of scope for the "Safe Rollback For Non-WT Storage Engine" project. I don't think this ticket has a duplicate, so I'll leave it open for now.

Generated at Thu Feb 08 04:30:00 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.