[SERVER-29282] BSON Document Size can be exceeded when grouping inserts on SECONDARY nodes Created: 18/May/17 Updated: 30/Oct/23 Resolved: 13/Jul/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 3.4.7, 3.5.11 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | William Schultz (Inactive) | Assignee: | William Schultz (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Attachments: |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Backport Requested: | v3.4 |
| Sprint: | Repl 2017-06-19, Repl 2017-07-31 |
| Participants: |
| Case: | (copied to CRM) |
| Description |
|
During steady-state replication, because of the way that SECONDARY nodes group 'insert' oplog entries, the 16MB BSON maximum document size may be exceeded if we group together a batch that includes documents at or near that limit. The predicate that delimits a group cuts the group off at the first op that pushes the measured size past the group size limit, but the resulting group can still exceed the maximum document size when we construct a single BSON object from it.

This is due to the incorrect way that we measure the size of a batch. Given a list of ops, we start iterating from the second op and initialize batchSize to zero. Thus, the size of the first op is never included in the measured batch size, which is how a group can exceed the document size limit when its ops are very close to 16MB.

The logic that handles insert grouping is somewhat hard to follow as is. This bug can be addressed by measuring batch sizes correctly, including the sizes of all ops, and delimiting the group based on that measurement. Ideally, the grouping logic can be cleaned up at the same time to make it clearer that it is correct.

A JS test reproducing this issue is attached. It reproduced the issue reliably on MongoDB 3.5.7. |
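The measurement error described above can be illustrated with a minimal sketch. This is not the server's code: the function names, the plain list of op sizes, and the unit-free size limit are all assumptions made for illustration. The first function starts its running total at zero and iterates from the second op, as the description says; the second counts the first op as well.

```python
from typing import List


def group_end_buggy(op_sizes: List[int], start: int, limit: int) -> int:
    """Return one past the last op grouped with op_sizes[start].

    Hypothetical model of the faulty measurement: the running total
    starts at zero and iteration begins at the second op, so the size
    of the first op is never counted.
    """
    batch_size = 0  # bug: op_sizes[start] is left out of the total
    end = start + 1
    while end < len(op_sizes):
        batch_size += op_sizes[end]
        if batch_size > limit:
            break  # cut the group off at the first op that exceeds the limit
        end += 1
    return end


def group_end_fixed(op_sizes: List[int], start: int, limit: int) -> int:
    """Same grouping, but the total includes every op in the group."""
    batch_size = op_sizes[start]  # count the first op too
    end = start + 1
    while end < len(op_sizes) and batch_size + op_sizes[end] <= limit:
        batch_size += op_sizes[end]
        end += 1
    return end


# Illustrative sizes in arbitrary units, with an assumed limit of 16.
sizes = [15, 2, 2]
limit = 16

buggy_end = group_end_buggy(sizes, 0, limit)
fixed_end = group_end_fixed(sizes, 0, limit)

print(sum(sizes[:buggy_end]))  # total size of the group under the buggy measurement
print(sum(sizes[:fixed_end]))  # total size of the group under the corrected measurement
```

With these sizes, the buggy measurement groups all three ops even though their combined size is above the limit, while the corrected measurement leaves the large first op in a group of its own.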
| Comments |
| Comment by Githook User [ 31/Jul/17 ] |
|
Author: William Schultz (will62794, william.schultz@mongodb.com) Message: (cherry picked from commit 1da62c11258aaa91dcff3f0133775aae615e29d4) |
| Comment by Githook User [ 13/Jul/17 ] |
|
Author: William Schultz (will62794, william.schultz@mongodb.com) Message: |
| Comment by Spencer Brody (Inactive) [ 22/May/17 ] |
|
If this is a regression in 3.4 then we probably want to try to fix this soon and get it backported. |
| Comment by William Schultz (Inactive) [ 19/May/17 ] |
|
Yes, I was able to reproduce this on 3.4.4. I could not reproduce it on 3.2 or 3.0, though. This looks like the commit that introduced the issue: https://github.com/mongodb/mongo/commit/2e153f35f45e284d066210792b7b231b033baaa8 |
| Comment by Andy Schwerin [ 19/May/17 ] |
|
Does this affect any of the stable branches? |