[SERVER-47408] oplog documents from transactions can breach the maximum BSON size and break mongorestore Created: 08/Apr/20 Updated: 22/Sep/23 Resolved: 21/Apr/20
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Tools |
| Affects Version/s: | 4.2.5 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | George Field | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | backup, mongod, oplog, replication, transactions | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | OS: CentOS Linux release 7.7.1908 (Core). Also affects the mongo:4.2.5 Docker image. |
| Attachments: | |
| Issue Links: | |
| Assigned Teams: | Tools |
| Operating System: | ALL |
| Steps To Reproduce: | Reproducing the large oplog entries is straightforward: simply make a number of very large writes within a single transaction. The impact is easiest to demonstrate with mongodump; steps to reproduce with mongodump follow.
Capturing the error with mongodump relies on timing, so depending on hardware the test case might not hit the issue, although it works reliably on my machine. As per the issue description, this affects MongoDB 4.2.5 installed on CentOS 7 via the official mongodb-org-4.2 RPM repo, but I've also been able to reproduce it using the mongo:4.2.5 Docker image.
To reproduce, a large transaction must be committed while mongodump is dumping the same database. To arrange this, I've written a small Go command that writes a large amount of dummy data to a collection (in order to slow down the mongodump operation), and another Go command that performs a short transaction writing a few large documents. Steps:
Code to help reproduce, tested on Go 1.14: https://github.com/ks07/mongodb-oplog-bug A full log showing the reproduction and the error output: https://gist.github.com/ks07/869229ea0fd26c6b0058f81427691070
Note that the example writes a small number of large documents very close to the 16 MiB limit, but in production usage we have seen this behaviour with transactions writing a large number of smaller documents. |
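The reproduction above boils down to simple size arithmetic: several documents, each just under the 16 MiB document cap, committed in one transaction, can end up grouped into a single applyOps oplog entry whose combined size breaches that cap. A minimal sketch of the arithmetic (the per-op overhead figure and the 4 × 15 MiB payload are illustrative assumptions matching the linked repro, not values taken from the real oplog framing):

```go
package main

import "fmt"

// maxBSONSize is the 16 MiB limit enforced by mongo-tools-common.
const maxBSONSize = 16 * 1024 * 1024

// combinedOplogSize estimates the payload of a single applyOps oplog entry
// that packs every statement of a transaction together. overheadPerOp is a
// rough per-statement allowance for wrapper fields (a hypothetical figure
// for illustration; the real framing differs).
func combinedOplogSize(docSizes []int, overheadPerOp int) int {
	total := 0
	for _, s := range docSizes {
		total += s + overheadPerOp
	}
	return total
}

func main() {
	// Four inserts of ~15 MiB each, as in the linked repro repository.
	docs := []int{15 << 20, 15 << 20, 15 << 20, 15 << 20}
	total := combinedOplogSize(docs, 128)
	fmt.Printf("combined payload: %d bytes, breaches 16 MiB limit: %v\n",
		total, total > maxBSONSize) // breaches: true
}
```

Each document fits under the limit on its own; only the combined applyOps entry is oversized, which is why individual writes succeed while the dumped oplog entry cannot be restored.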
| Participants: | |
| Description |
I've run into an issue where point-in-time snapshots of a MongoDB server produced using mongodump --oplog can be unusable if they happen to coincide with a large transactional write. In these cases, the oplog.bson produced in the dump will contain a document that exceeds the 16 MiB size limit enforced by mongo-tools-common, so restoring with mongorestore --oplogReplay fails.
# mongorestore --oplogReplay
From looking at the underlying local.oplog.rs collection, I think this may be a problem with how mongod splits transactions into documents when writing them to the oplog. By querying local.oplog.rs it's possible to see the offending BSON documents. Interestingly, despite this issue, replication still appears to work correctly, although I haven't tested enough to say so confidently.
I have attached one of the offending oplog.bson dumps to this issue. |
| Comments |
| Comment by Jessica Sigafoos [ 21/Apr/20 ] |
Please note that this work is now being tracked in Thank you! |
| Comment by Carl Champain (Inactive) [ 09/Apr/20 ] |
Thank you for taking the time to submit this report! Kind regards, |