  1. Core Server
  2. SERVER-81246

FLE WriteConcernError behavior unclear

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 7.3.0-rc0, 7.0.6
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • Assigned Teams: Server Security
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v7.0
    • Sprint: Security 2023-11-13, Security 2023-11-27, Security 2023-12-11, Security 2023-12-25, Security 2024-01-08

      This seems to happen on both mongos and mongod.

      Here is an insert (non-FLE) that encounters a WriteConcernError (WCE). Note that n: 1 because the write went through, and the WCE is reported in the writeConcernError field.

      {
      	"n" : 1,
      	"writeConcernError" : {
      		"code" : 100,
      		"codeName" : "UnsatisfiableWriteConcern",
      		"errmsg" : "UnsatisfiableWriteConcern: Not enough data-bearing nodes; Error details: { writeConcern: { w: 3, wtimeout: 0, provenance: \"clientSupplied\" } } at shard-rs0",
      		"errInfo" : {
      			
      		}
      	},
      	"ok" : 1,
      	"$clusterTime" : {
      		"clusterTime" : Timestamp(1695160343, 1),
      		"signature" : {
      			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
      			"keyId" : NumberLong(0)
      		}
      	},
      	"operationTime" : Timestamp(1695160343, 1)
      }
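
      To make the two error channels concrete, here is a minimal sketch of how a client might read a reply like the one above. The helper name summarizeWriteReply and the shape of its return value are hypothetical, and the reply is a trimmed copy of the output above; this is not how any particular driver is implemented.

```javascript
// Hypothetical helper: splits a write command reply into its two error channels.
// Per-document failures come from writeErrors; the write concern failure comes
// from the separate writeConcernError field.
function summarizeWriteReply(reply) {
  const writeErrors = reply.writeErrors || [];
  const wce = reply.writeConcernError || null;
  return {
    applied: reply.n, // number of documents the server reports as written
    failedIndexes: writeErrors.map((e) => e.index),
    writeConcernFailed: wce !== null,
    writeConcernCode: wce ? wce.code : null,
  };
}

// Trimmed copy of the non-FLE reply shown above.
const nonFleReply = {
  n: 1,
  writeConcernError: { code: 100, codeName: "UnsatisfiableWriteConcern" },
  ok: 1,
};

const summary = summarizeWriteReply(nonFleReply);
console.log(summary);
```

      With this reading, the non-FLE reply says "one document written, no per-document failures, write concern not satisfied", which is the distinction the FLE path loses.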
      

      When using FLE, the WCE is instead placed in the writeErrors field, and I'm not sure how drivers would interpret that. Note that the write doesn't go through either (n: 0).

      {
       	"n" : 0,
       	"opTime" : Timestamp(1695160105, 3),
       	"writeErrors" : [
       		{
       			"index" : 0,
       			"code" : 64,
       			"errmsg" : "Write concern error committing internal transaction :: caused by :: waiting for replication timed out; Error details: { wtimeout: true, writeConcern: { w: 2, wtimeout: 2000, provenance: \"clientSupplied\" } }"
       		}
       	],
       	"ok" : 1,
       	"$clusterTime" : {
       		"clusterTime" : Timestamp(1695160105, 7),
       		"signature" : {
       			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
       			"keyId" : NumberLong(0)
       		}
       	},
       	"operationTime" : Timestamp(1695160105, 3)
      }
      

      This led me to wonder what happens when an actual write error, like a DuplicateKeyError, shows up along with a WCE. The result is that the WCE is hidden (this is basically the bug from SERVER-78311):

      {
       	"n" : 0,
       	"opTime" : Timestamp(1695221129, 11),
       	"writeErrors" : [
       		{
       			"index" : 0,
       			"code" : 11000,
       			"errmsg" : "E11000 duplicate key error collection: bulk_fle.basic index: _id_ dup key: { _id: 1.0 } found value: RecordId(1)",
       			"keyPattern" : {
       				"_id" : 1
       			},
       			"keyValue" : {
       				"_id" : 1
       			},
       			"foundValue" : NumberLong(1),
       			"duplicateRid" : NumberLong(1)
       		}
       	],
       	"ok" : 1,
       	"$clusterTime" : {
       		"clusterTime" : Timestamp(1695221129, 13),
       		"signature" : {
       			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
       			"keyId" : NumberLong(0)
       		}
       	},
       	"operationTime" : Timestamp(1695221129, 11)
      }
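
      As a rough illustration of the difference between the two FLE replies above, the sketch below scans writeErrors for codes that belong to write concern failures. The helper name findMisplacedWriteConcernErrors is hypothetical, the code set is an assumption limited to the two codes that appear in this ticket (64 and 100), and the replies are trimmed copies of the output above.

```javascript
// Assumption: the codes treated as write concern failures here are only the
// two that appear in this ticket (64 and 100), not an exhaustive list.
const WRITE_CONCERN_CODES = new Set([64, 100]);

// Hypothetical helper: returns writeErrors entries that look like WCEs.
function findMisplacedWriteConcernErrors(reply) {
  return (reply.writeErrors || []).filter((e) => WRITE_CONCERN_CODES.has(e.code));
}

// Trimmed copy of the FLE reply where the WCE landed in writeErrors.
const fleReply = {
  n: 0,
  writeErrors: [
    { index: 0, code: 64, errmsg: "Write concern error committing internal transaction" },
  ],
  ok: 1,
};

// Trimmed copy of the FLE reply where a duplicate key error hid the WCE.
const dupKeyReply = {
  n: 0,
  writeErrors: [{ index: 0, code: 11000, errmsg: "E11000 duplicate key error" }],
  ok: 1,
};

const misplaced = findMisplacedWriteConcernErrors(fleReply);
const misplacedDup = findMisplacedWriteConcernErrors(dupKeyReply);
console.log(misplaced.length, misplacedDup.length);
```

      In the first reply a client could at least recognize the WCE by its code. In the duplicate key reply there is nothing to find: the WCE is absent from the reply entirely, so no client-side check can recover it.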
      

      I found this while looking into the existing behavior, in preparation for implementing FLE + bulkWrite + WCE handling on mongos.

            Assignee: Erwin Pe (erwin.pe@mongodb.com)
            Reporter: Vishnu Kaushik (vishnu.kaushik@mongodb.com)
            Votes: 0
            Watchers: 7

              Created:
              Updated:
              Resolved: