Summary
Writing BSON documents with duplicate field names to MongoDB causes undefined behavior whenever the server has to interpret those documents (e.g. indexing, aggregation pipelines). It can also cause issues when different drivers write and query the same data, especially when some natively support representing duplicate field names and others do not (e.g. writing with Go, reading with Compass). In effect, writing BSON documents with duplicate field names causes data corruption.
Previous tickets have attempted to address this issue (see SERVER-6439, DRIVERS-612), but the problem persists according to stakeholders on the server teams (henrik.edin@mongodb.com et al). We should update the CRUD spec to require that write methods (insert, bulkWrite) return an error if given a document with duplicate field names. Drivers whose native document type already prevents duplicate field names (e.g. Node.js, Python) do not need to add checks.
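To illustrate how two consumers of the same document can disagree, here is a minimal sketch in Go. The `Elem` and `Doc` types and the `FirstMatch` and `ToMap` helpers are hypothetical stand-ins invented for this example; they are not part of any driver API, and the sketch does not claim to reproduce the behavior of any particular driver, tool, or server component.

```go
package main

import "fmt"

// Elem and Doc are hypothetical stand-ins for an ordered document type
// that can physically hold the same field name more than once.
type Elem struct {
	Key   string
	Value any
}

type Doc []Elem

// FirstMatch models a reader that scans in order and stops at the
// first element with the requested name.
func FirstMatch(d Doc, key string) (any, bool) {
	for _, e := range d {
		if e.Key == key {
			return e.Value, true
		}
	}
	return nil, false
}

// ToMap models a reader that decodes into a map, where a later
// duplicate silently overwrites an earlier one.
func ToMap(d Doc) map[string]any {
	m := make(map[string]any, len(d))
	for _, e := range d {
		m[e.Key] = e.Value
	}
	return m
}

func main() {
	// The same field name appears twice at the same level.
	doc := Doc{{"status", "active"}, {"status", "archived"}}

	first, _ := FirstMatch(doc, "status")
	last := ToMap(doc)["status"]

	// The two readers report different values for the same field.
	fmt.Println(first, last)
}
```

Under these assumptions, neither reader is wrong on its own terms; the document itself is ambiguous, which is why rejecting it at write time is the safer option.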
Motivation
Who is the affected end user?
Users who build applications in languages that support writing documents with duplicate field names (e.g. Go, C).
How does this affect the end user?
They are confused about why their queries sometimes return incorrect results. They may initially suspect data corruption.
How likely is it that this problem or use case will occur?
The problem is usually the result of a programming error. There is no good way to detect the problem when it happens, so the programming error can go unnoticed indefinitely, until someone notices incorrect query results. The problem usually occurs only in apps written in Go or C.
If the problem does occur, what are the consequences and how severe are they?
Aggregation and indexed queries may return inconsistent or intermittently different results. The correctness of results cannot be trusted.
Is this issue urgent?
No.
Is this ticket required by a downstream team?
No.
Is this ticket only for tests?
No.
Acceptance Criteria
- Update the CRUD spec to require that write methods (insert, bulkWrite) return an error if given a document with duplicate field names.
- Add a prose test to check that attempting to write a document with duplicate field names returns an error.
- Drivers whose native document type already prevents duplicate field names (e.g. Node.js, Python) do not need to add checks or implement the prose test.
is related to
- DRIVERS-612 Add documentation warning against the use of duplicate key names (Implementing)
- SERVER-6439 Duplicate fields at the same level should not be allowed (Backlog)
- CDRIVER-6155 Improve API to help customers avoiding duplicate key names (Backlog)
- GODRIVER-3703 Improve API to help customers avoiding duplicate field names in documents (Investigating)