Per the following logs in the failing test:
The logs show that the first _configsvrReshardCollection command fails, as it is supposed to, with code 5356800 during oplog application. However, a retry of the same command issued after the first one finishes fails with DuplicateKey (11000) instead: the global id constraint violation is detected in the cloning phase of the second reshard collection command rather than in the oplog application phase. The following logs show that it is the _shardsvrReshardCollection command that is being retried, unlike in BF-26040:
The NotWritablePrimary error is propagated to mongos via ShardingDDLCoordinator_NORESILIENT after the _configsvrReshardCollection command retries are exhausted, which in turn causes mongos to retry _shardsvrReshardCollection. When the command is retried or re-sent to the primary shard from mongos, it waits to acquire the ScopedDistLock before creating an instance of ReshardCollectionCoordinator. The lock is acquired only after the previous operation has fully completed, so the retry cannot join the existing operation.
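The serialization described above can be illustrated with a small stand-alone model. This is only a sketch, not server code: a plain threading.Lock stands in for ScopedDistLock, and the names run_reshard and simulate are hypothetical. It shows that a retry arriving while the first operation is running blocks on the lock and then starts a fresh operation instead of joining the one in flight.

```python
import threading
import time

# Stand-in for ScopedDistLock (assumption: modeled as a plain mutex).
dist_lock = threading.Lock()
events = []  # ordered record of (attempt, phase)


def run_reshard(attempt, holding=None, work_seconds=0.0):
    """Hypothetical model of one _shardsvrReshardCollection attempt."""
    with dist_lock:
        # A new coordinator instance is only created once the lock is held.
        events.append((attempt, "start"))
        if holding is not None:
            holding.set()  # signal that this attempt now owns the lock
        time.sleep(work_seconds)  # simulated resharding work
        events.append((attempt, "done"))


def simulate():
    """Run a first attempt and a retry that arrives while the first is running."""
    events.clear()
    first_holds_lock = threading.Event()
    first = threading.Thread(
        target=run_reshard, args=("first", first_holds_lock, 0.05)
    )
    first.start()
    first_holds_lock.wait()  # the retry arrives mid-operation
    retry = threading.Thread(target=run_reshard, args=("retry",))
    retry.start()
    first.join()
    retry.join()
    return list(events)
```

In this model the retry's "start" event is always recorded after the first attempt's "done" event, which matches the behavior described: the retry runs as a second, independent operation.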
A fix here would be to recommit SERVER-61607 and accept DuplicateKey as a possible error code.
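As a rough illustration of what accepting either code could look like in a test-side check, here is a minimal sketch. It is not the SERVER-61607 change itself: the helper name is hypothetical, and the response is assumed to be a dictionary with "ok" and "code" fields, as in a typical command reply.

```python
# Error codes taken from the analysis above.
OPLOG_APPLICATION_FAILURE_CODE = 5356800  # expected failure during oplog application
DUPLICATE_KEY_CODE = 11000  # DuplicateKey, seen when the retry fails during cloning

ACCEPTED_FAILURE_CODES = frozenset(
    {OPLOG_APPLICATION_FAILURE_CODE, DUPLICATE_KEY_CODE}
)


def is_expected_reshard_failure(response):
    """Hypothetical helper: True if a failed reply carries an accepted code.

    Assumes `response` is a dict shaped like a command reply,
    e.g. {"ok": 0, "code": 11000}.
    """
    return response.get("ok") == 0 and response.get("code") in ACCEPTED_FAILURE_CODES
```

With this check, a test would treat both the original failure and the retried command's DuplicateKey as expected outcomes, while any other error code would still fail the test.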