[SERVER-55298] Reproduce and Investigate BSONObjectTooLarge error

| Created: | 18/Mar/21 | Updated: | 29/Oct/23 | Resolved: | 31/Mar/21 |
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.4.4 |
| Fix Version/s: | 4.4.5, 5.0.0-rc0 |
| Type: | Task | Priority: | Critical - P2 |
| Reporter: | Lamont Nelson | Assignee: | Lamont Nelson |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | bkp, sharding-wfbf-day |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Backport Requested: | v5.0, v4.4 |
| Sprint: | Sharding 2021-03-22, Sharding 2021-04-05 |
| Participants: | |
| Description |
In version 4.4+, a cluster can hit a BSONObjectTooLarge error when attempting to migrate chunks. The _migrateClone command that the recipient invokes has logic here and here to ensure that the vector of documents returned in the command response does not grow too large. The particular error is below:

The goal of this ticket is to reproduce the error and determine the best way to mitigate the situation. The parameters internalQueryExecYieldIterations/internalQueryExecYieldPeriodMS have been suggested as a possible workaround, but the runtime effects of changing those parameters should be evaluated.
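For orientation, the guard the description refers to has roughly the following shape. This is a hedged sketch reconstructed from the expression quoted in the comments below; the names exec, obj, and arrBuilder are assumptions, not the verbatim server code:

```cpp
// Illustrative sketch only: the clone loop's batch-size guard, assuming exec
// is the clone executor and arrBuilder accumulates the response array.
BSONObj obj;
while (PlanExecutor::ADVANCED == exec->getNext(&obj, nullptr)) {
    // Stop before the response array would exceed the 16 MB user limit,
    // keeping ~1 KB of headroom for the rest of the reply.
    if ((arrBuilder->len() + obj.objsize() + 1024) > BSONObjMaxUserSize) {
        break;
    }
    arrBuilder->append(obj);
}
```

The guard only works if obj.objsize() reflects the size of the object that is actually appended, which is exactly what the investigation below calls into question.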
| Comments |
| Comment by Githook User [ 31/Mar/21 ] |

Author: LaMont Nelson <lamont.nelson@mongodb.com> (lamontnelson)
Message: (cherry picked from commit ab6654feeb39f96065e89d90bba9c2dd5acb0731)
| Comment by Githook User [ 31/Mar/21 ] |

Author: LaMont Nelson <lamont.nelson@mongodb.com> (lamontnelson)
Message:
| Comment by Lamont Nelson [ 29/Mar/21 ] |

Code review: https://mongodbcr.appspot.com/779940002/
| Comment by Lamont Nelson [ 23/Mar/21 ] |

Got it. That makes sense.
| Comment by Max Hirschhorn [ 23/Mar/21 ] |

Thanks for confirming obj is an index key! Empty strings are used in the BSONObj representation of index keys because the field names are redundant with the {a: 1, b: 1, ...} key pattern specification. The ordering of the key-value pairs is instead used to define the correspondence between the index key object and the key pattern specification. There are functions like rehydrateIndexKey() and IndexKeyEntry::rehydrateKey() to transform the index key into something more human-readable. (Those functions aren't as helpful here because chunk migration wants the full document anyway.)
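As a purely illustrative sketch of that positional correspondence (not the server's rehydrateIndexKey() implementation), using a made-up key pattern {a: 1, b: 1} and key {"": "x", "": 42}:

```cpp
// Standalone illustration: an index key stores only values with empty field
// names; the key pattern supplies the field names by position.
#include <cstdio>
#include <string>
#include <vector>

int main() {
    std::vector<std::string> keyPatternFields = {"a", "b"};    // from {a: 1, b: 1}
    std::vector<std::string> indexKeyValues = {"\"x\"", "42"};  // from {"": "x", "": 42}

    // "Rehydration" zips the pattern's field names onto the key's values.
    for (size_t i = 0; i < keyPatternFields.size(); ++i) {
        std::printf("%s: %s\n", keyPatternFields[i].c_str(), indexKeyValues[i].c_str());
    }
    // Prints:
    //   a: "x"
    //   b: 42
    return 0;
}
```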
| Comment by Lamont Nelson [ 23/Mar/21 ] |

max.hirschhorn The object I'm seeing for obj is

How does one interpret this object? I thought field names would be required.
| Comment by Lamont Nelson [ 23/Mar/21 ] |

Yes, I had thought this as well, and ended up correcting this statement in the help ticket. I have a patch that fixes the issue by finding the document earlier, but I'll change it to use IXSCAN_FETCH since that's more efficient.
| Comment by Max Hirschhorn [ 23/Mar/21 ] |

I couldn't imagine BSONObj::objsize() being incorrect given how widely used that method is. lamont.nelson, is it possible the 15 bytes for obj is correct, and the issue is that we're using the size of the index key to estimate the size of the full document, which is instead fetched with Collection::findDoc() later in the loop? It looks like _jumboChunkCloneState->clonerExec is an IndexScan using the default option of IXSCAN_DEFAULT and therefore won't have access to the full document via PlanExecutor::getNext(). Changing it to use IXSCAN_FETCH and removing the Collection::findDoc() call might be a solution here.
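A hedged sketch of that suggestion; the exact InternalPlanner::indexScan() argument list varies across server versions, so treat the parameters (opCtx, collection, shardKeyIdx, min, max, and so on) as assumptions rather than the real call site:

```cpp
// Sketch of the proposed fix, not verbatim server code. Passing IXSCAN_FETCH
// makes the executor return the full document instead of just the index key.
auto exec = InternalPlanner::indexScan(opCtx,
                                       collection,
                                       shardKeyIdx,
                                       min,
                                       max,
                                       BoundInclusion::kIncludeStartKeyOnly,
                                       PlanExecutor::YIELD_AUTO,
                                       InternalPlanner::FORWARD,
                                       InternalPlanner::IXSCAN_FETCH);  // was IXSCAN_DEFAULT

// With IXSCAN_FETCH, obj from getNext() is the full document, so objsize()
// measures the real document size and the separate Collection::findDoc()
// lookup in the loop body can be removed.
```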
| Comment by Lamont Nelson [ 22/Mar/21 ] |

I made a workload that inserts documents near the 16 MB limit (16 MB - 5 KB, specifically). It then calls removeShard and waits for that operation to complete. I added the following log statement here:

This results in the following output:

We can see that objsize is consistently reported as 15, so the expression "(arrBuilder->len() + obj.objsize() + 1024) > BSONObjMaxUserSize" will not have the intended effect. This allows two of the documents to be admitted into the array. I'm opening a ticket to add tests for BSONObj::objsize().
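To make the arithmetic concrete, here is a self-contained sketch of the failure mode, assuming each full document is ~16 MB - 5 KB while objsize() sees only the 15-byte index key (15 bytes is consistent with a one-field index key such as {"": <8-byte number>}):

```cpp
// Standalone illustration of why the guard admits two near-16 MB documents
// when objsize() reflects the 15-byte index key instead of the full document.
#include <cstdio>

int main() {
    const long long kMaxUserSize = 16LL * 1024 * 1024;   // BSONObjMaxUserSize
    const long long kDocSize = kMaxUserSize - 5 * 1024;  // ~16 MB - 5 KB documents
    const long long kKeySize = 15;                       // what objsize() returned

    long long arrLen = 0;  // bytes accumulated in the response array
    for (int doc = 1; doc <= 2; ++doc) {
        // The buggy check budgets for a 15-byte object, not a ~16 MB one.
        bool admitted = (arrLen + kKeySize + 1024) <= kMaxUserSize;
        std::printf("doc %d admitted=%d (array holds %lld bytes)\n",
                    doc, admitted, arrLen);
        if (admitted) {
            arrLen += kDocSize;  // but the *full* document is what gets appended
        }
    }
    // Both documents pass the check, so the array ends up at ~32 MB.
    std::printf("final array size %lld exceeds the %lld limit\n",
                arrLen, kMaxUserSize);
    return 0;
}
```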
| Comment by Lamont Nelson [ 22/Mar/21 ] |

I'm seeing that the objsize() method of BSONObj in the two links in the ticket description is underreporting the size of the next document, which allows the array to grow beyond the defined limits, as kaloian.manassiev had speculated. I'll post a more detailed message in a bit.