Type: Improvement
Resolution: Unresolved
Priority: Unknown
Component/s: CRUD
Summary
The CRUD specifications do not clearly define how drivers should batch documents for insert, update, and delete operations when a document in a non-initial batch exceeds the maximum document size of ~16MB, that is, when the oversized document is only encountered after the driver has advanced past the first partition of the batch.
For example, when batching an insert, the Go Driver partitions the documents into ~16MB parts p1, p2, p3, ... (the ~16MB partition size is itself a bug in the Go Driver; it should be 48MB). If a part pi (where i > 1) fails with ErrDocumentTooLarge, the operation terminates. Partitions 1 through i - 1 have already been inserted successfully, but the user receives an error for partition i. Should the driver still return the set of inserted IDs in this case? Is this behavior generally acceptable?
Here is a repro using the Go Driver: https://gist.github.com/prestonvasquez/e28544554a872f4115389f1909f18ccd
Output:
2024/10/30 10:41:51 err: an inserted document is too large found: 100
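The failure mode can also be sketched without a server. The following is a minimal, self-contained simulation, not the Go Driver's implementation: `fakeCollection`, `insertMany`, `errDocumentTooLarge`, and the scaled-down `maxDocSize` are all hypothetical names chosen for illustration. Each batch is validated only when it is reached, so earlier batches are already stored by the time the oversized document is found. Returning the inserted indexes alongside the error is one possible answer to the question above, not a description of current driver behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// maxDocSize is a scaled-down stand-in for the ~16MB document limit.
const maxDocSize = 16

// errDocumentTooLarge is a hypothetical analogue of ErrDocumentTooLarge.
var errDocumentTooLarge = errors.New("an inserted document is too large")

// fakeCollection records the documents a pretend server has accepted.
type fakeCollection struct {
	stored [][]byte
}

// insertMany splits docs into batches of batchLen documents and validates
// each batch only when it is reached. It returns the indexes of the
// documents that were stored before any failure.
func (c *fakeCollection) insertMany(docs [][]byte, batchLen int) ([]int, error) {
	var inserted []int
	for start := 0; start < len(docs); start += batchLen {
		end := start + batchLen
		if end > len(docs) {
			end = len(docs)
		}
		// Client-side validation of the current batch only.
		for i := start; i < end; i++ {
			if len(docs[i]) > maxDocSize {
				return inserted, fmt.Errorf("%w: index %d", errDocumentTooLarge, i)
			}
		}
		// The batch passed validation, so it is "sent" and stored.
		for i := start; i < end; i++ {
			c.stored = append(c.stored, docs[i])
			inserted = append(inserted, i)
		}
	}
	return inserted, nil
}

func main() {
	docs := [][]byte{
		make([]byte, 4), make([]byte, 4), // batch 1: both documents fit
		make([]byte, 4), make([]byte, 32), // batch 2: second document is too large
	}
	coll := &fakeCollection{}
	inserted, err := coll.insertMany(docs, 2)
	fmt.Println("inserted indexes:", inserted)
	fmt.Println("stored documents:", len(coll.stored))
	fmt.Println("err:", err)
}
```

In this sketch the first batch is stored even though the call as a whole fails, which is the partial write the ticket describes.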
Motivation
Who is the affected end user?
Users of any driver that validates CRUD batches as it advances through them.
How does this affect the end user?
In the described scenario, users could insert a subset of the intended data before receiving an error noting that they are trying to insert a document that is too large.
How likely is it that this problem or use case will occur?
Unknown, but definitely possible.
If the problem does occur, what are the consequences and how severe are they?
If a driver's analogue of ErrDocumentTooLarge is not associated with a list of inserted IDs, as is the case in the Go Driver, it can be quite difficult to avoid duplicating data when re-attempting the write.
Is this issue urgent?
NA
Is this ticket required by a downstream team?
No
Is this ticket only for tests?
No
Acceptance Criteria
- State in the CRUD specification that drivers should not attempt to validate document or message size when batching for insert/update/delete.
- Add a prose test that validates this behavior.
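One way a driver might behave under the first criterion is sketched below, under stated assumptions. The driver batches by document count only and performs no size validation, a pretend server (`fakeServerInsert`) rejects the oversized document, and the resulting error carries the indexes that were already written. All names here, including `partialWriteError` and the error message, are hypothetical and are not taken from the specification or from any driver.

```go
package main

import (
	"errors"
	"fmt"
)

// serverMaxDocSize is a scaled-down stand-in for the server's document limit.
const serverMaxDocSize = 16

// partialWriteError is a hypothetical error type that reports which
// documents were written before the failure.
type partialWriteError struct {
	InsertedIndexes []int
	FailedIndex     int
	Cause           error
}

func (e *partialWriteError) Error() string {
	return fmt.Sprintf("write failed at index %d after %d inserts: %v",
		e.FailedIndex, len(e.InsertedIndexes), e.Cause)
}

func (e *partialWriteError) Unwrap() error { return e.Cause }

// fakeServerInsert plays the server: it accepts documents in order and stops
// at the first oversized one. It returns how many documents it accepted.
func fakeServerInsert(batch [][]byte) (int, error) {
	for i, doc := range batch {
		if len(doc) > serverMaxDocSize {
			// Hypothetical message, not a real server error string.
			return i, errors.New("document exceeds server size limit")
		}
	}
	return len(batch), nil
}

// insertMany batches by count only and leaves size checks to the server.
func insertMany(docs [][]byte, batchLen int) ([]int, error) {
	var inserted []int
	for start := 0; start < len(docs); start += batchLen {
		end := start + batchLen
		if end > len(docs) {
			end = len(docs)
		}
		n, err := fakeServerInsert(docs[start:end])
		for i := 0; i < n; i++ {
			inserted = append(inserted, start+i)
		}
		if err != nil {
			return inserted, &partialWriteError{
				InsertedIndexes: inserted,
				FailedIndex:     start + n,
				Cause:           err,
			}
		}
	}
	return inserted, nil
}

func main() {
	docs := [][]byte{
		make([]byte, 4), make([]byte, 4),
		make([]byte, 4), make([]byte, 32),
		make([]byte, 4),
	}
	_, err := insertMany(docs, 2)
	var pwe *partialWriteError
	if errors.As(err, &pwe) {
		// A caller can retry only the documents that were not written.
		fmt.Println("already inserted:", pwe.InsertedIndexes)
		fmt.Println("failed at index:", pwe.FailedIndex)
	}
}
```

Because the error reports what was written, a caller re-attempting the write can skip those documents instead of risking duplicates.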
related to
- GODRIVER-3378 Partial write failure using InsertMany results in missing inserted ids (Backlog)
- DRIVERS-3036 Remove the BSON document size validation requirement for the client bulk write operation (Implementing)