[SERVER-72028] E11000 duplicate key error collection: <col name> index: _id_ dup key: { _id: "xxxxxx 2022-12-10" } Created: 11/Dec/22  Updated: 19/Dec/22  Resolved: 19/Dec/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 5.0.14
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Witold Kupś Assignee: Yuan Fang
Resolution: Done Votes: 0
Labels: Bug
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-14322 Retry on predicate unique index viola... Closed
Operating System: ALL
Participants:

 Description   

Hello,
I have a Spring Boot application (`2.7.3`) that uses the reactive MongoDB driver. The database is `5.0.11-focal` (Docker image).
The problem is that when I execute a query built as shown below (Kotlin),

 

import com.mongodb.BasicDBObject
import com.mongodb.client.model.UpdateOneModel
import com.mongodb.client.model.UpdateOptions
import org.bson.BasicBSONObject
import org.bson.Document
import reactor.kotlin.core.publisher.toFlux

fun addRequests(requests: List<RequestCountReport>) =
    template.getCollection(REQUEST_COUNT_COLLECTION)
        .flatMapMany { c ->
            val updates = requests.map { r ->
                // date(): project extension, presumably truncating the Temporal to the day
                val time = r.time.date()
                UpdateOneModel<Document>(
                    // query predicate: _id plus the two fields it is derived from
                    BasicDBObject(
                        mapOf(
                            "_id" to r.scope + " " + time,
                            "scope" to r.scope,
                            "time" to time,
                        )
                    ),
                    // increment one counter per request key
                    BasicDBObject(
                        mapOf(
                            "\$inc" to BasicBSONObject(
                                r.requests.mapKeys { "requests.${it.key}" }
                            )
                        )
                    ),
                    UpdateOptions().upsert(true)
                )
            }
            c.bulkWrite(updates).toFlux()
        }
        .then()

 

(`RequestCountReport` has the following structure)

data class RequestCountReport(
    val scope: String,
    val time: Temporal,
    val requests: Map<String, Int>,
) 

 

...which is translated into the following command sent to MongoDB:

 

{
  "update": "requestCount",
  "ordered": true,
  "txnNumber": 3,
  "$db": "route",
  "$clusterTime": {
    "clusterTime": {
      "$timestamp": {
        "t": 1670768591,
        "i": 1
      }
    },
    "signature": {
      "hash": {
        "$binary": {
          "base64": "AAAAAAAAAAAAAAAAAAAAAAAAAAA=",
          "subType": "00"
        }
      },
      "keyId": 0
    }
  },
  "lsid": {
    "id": {
      "$binary": {
        "base64": "OdYdXMkcQ+CxD2BbLWRsog==",
        "subType": "04"
      }
    }
  },
  "updates": [
    {
      "q": {
        "_id": "admin 2022-12-11",
        "scope": "admin",
        "time": {
          "$date": "2022-12-11T00:00:00Z"
        }
      },
      "u": {
        "$inc": {
          "requests.here,maptile,road,truck,fleet": 187
        }
      },
      "upsert": true
    }
  ]
}

 

it sometimes fails with an error like this:

Write errors: [BulkWriteError{index=0, code=11000, message='E11000 duplicate key error collection: route.requestCount index: _id_ dup key: { _id: "xxxxxx 2022-12-10" }', details={}}]. 
    at com.mongodb.internal.connection.BulkWriteBatchCombiner.getError(BulkWriteBatchCombiner.java:167) ~[mongodb-driver-core-4.6.1.jar:na] 

I initially issued a single-op write for each entry and the error also occurred, but then I could at least retry the failed entry. Now that it is a bulk write, I am not even sure how to retry: some of the operations may already have been applied, and the bulk is not in a transaction, I assume. Nevertheless, it is a bug IMO.



 Comments   
Comment by Yuan Fang [ 19/Dec/22 ]

Hi witkups@gmail.com,

Thank you for reporting this issue. My investigation leads me to believe that this happens when two updates with upsert: true arrive concurrently, neither finds a matching document, and both attempt to insert a new document; one of the inserts then violates the unique index referenced by the query predicate.
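
For illustration, here is a minimal sketch of that race, using the synchronous driver for brevity; the connection string is a hypothetical local deployment, and the database/collection names are taken from your command. Two threads issue the same upsert concurrently, neither finds a match, and the loser fails with E11000 on _id (being a race, it will not reproduce on every run):

import com.mongodb.client.MongoClients
import com.mongodb.client.model.UpdateOptions
import org.bson.Document
import kotlin.concurrent.thread

fun main() {
    val coll = MongoClients.create("mongodb://localhost:27017")
        .getDatabase("route")
        .getCollection("requestCount")

    // Same shape of predicate as in the ticket: _id plus the fields it is derived from.
    val filter = Document(mapOf(
        "_id" to "admin 2022-12-11",
        "scope" to "admin",
        "time" to "2022-12-11", // simplified to a string for this sketch
    ))
    val update = Document("\$inc", Document("requests.total", 1))

    // Both upserts can take the insert path before either insert commits;
    // the loser then hits the duplicate key error instead of matching.
    List(2) {
        thread { coll.updateOne(filter, update, UpdateOptions().upsert(true)) }
    }.forEach { it.join() }
}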

I initially had a single op write (one for each entry) and it also occurred, but then I could at least retry the given entry write. Now, when it is a bulk, I am not sure even how to do it

While the server can automatically retry upserts on a DuplicateKey error in some circumstances, it cannot do so in all of them; SERVER-14322 includes examples where the server cannot retry automatically. The second-to-last row of the table there shows that the server will not retry on a DuplicateKey error if the query predicate includes more fields than just the key of the index that throws the error (here, _id).
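
One possible mitigation, sketched here as an adaptation of your snippet (not a verified fix): filter on _id alone, so the predicate consists of exactly the unique index key, and move scope and time into $setOnInsert. Per the table in SERVER-14322 this may make the upsert eligible for the server's automatic retry, and in any case a manual retry then simply matches the document inserted by the winning upsert:

val updates = requests.map { r ->
    val time = r.time.date()
    UpdateOneModel<Document>(
        // predicate is the unique index key only
        BasicDBObject("_id", r.scope + " " + time),
        BasicDBObject(
            mapOf(
                // set the derived fields only when the upsert inserts
                "\$setOnInsert" to BasicDBObject(
                    mapOf("scope" to r.scope, "time" to time)
                ),
                "\$inc" to BasicBSONObject(
                    r.requests.mapKeys { "requests.${it.key}" }
                ),
            )
        ),
        UpdateOptions().upsert(true)
    )
}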

Even though this is expected behavior, I understand, from a user perspective, the need to handle partial failures in bulk updates; it is worth considering retries on the application side, as sketched below. For a solution that suits your use case, we encourage you to discuss it on the MongoDB Developer Community Forums. If the discussion there leads you to suspect a bug in the MongoDB server, please revisit this ticket with more information, and we will reopen it and investigate.
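
A minimal application-side retry sketch, under the following assumptions: the failure surfaces as com.mongodb.MongoBulkWriteException (as the stack trace above suggests), reactor-kotlin-extensions provides toMono(), and bulkWriteWithRetry with its retry budget of 3 is an illustrative helper, not driver API:

import com.mongodb.MongoBulkWriteException
import com.mongodb.client.model.WriteModel
import com.mongodb.reactivestreams.client.MongoCollection
import org.bson.Document
import reactor.core.publisher.Mono
import reactor.kotlin.core.publisher.toMono

// Retries an ordered bulk write that failed on duplicate keys. Because the
// bulk is ordered, operations after the first error were never attempted,
// so the retry resubmits everything from the first failed index onward;
// the previously failed upsert now matches the winner's document.
fun bulkWriteWithRetry(
    c: MongoCollection<Document>,
    updates: List<WriteModel<Document>>,
    attemptsLeft: Int = 3,
): Mono<Void> =
    c.bulkWrite(updates).toMono()
        .then()
        .onErrorResume(MongoBulkWriteException::class.java) { e ->
            val onlyDupKeys = e.writeErrors.all { it.code == 11000 }
            val firstFailed = e.writeErrors.minOfOrNull { it.index }
            if (!onlyDupKeys || firstFailed == null || attemptsLeft == 0) Mono.error(e)
            else bulkWriteWithRetry(c, updates.drop(firstFailed), attemptsLeft - 1)
        }

With an unordered bulk (ordered(false)), every operation is attempted, so one could instead retry exactly the operations listed in writeErrors.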

Regards,
Yuan
