  1. Core Server
  2. SERVER-90672

Investigate potential bug in time-series insert path that allows multiple buckets to have the same OID across stripes

    • Storage Execution
    • Fully Compatible
    • v8.0, v7.3, v7.0
    • Execution Team 2024-06-10, Execution Team 2024-06-24, Execution Team 2024-07-08

      Based on my understanding, bucket_catalog_internal::allocateBucket, which we call to allocate a new bucket in memory, checks for OID collisions (the case where we are trying to allocate a new bucket with an OID that already exists) both within a stripe and across stripes. The cross-stripe check happens within bucket_state_registry::initializeBucketState.

      Within this function, we check whether another bucket with the same OID as the one we generated already exists. If there is none, we return Status::OK. If there is one and it currently has direct writes on it, we return a WriteConflict error code; if there is one and it is frozen, we return a TimeseriesBucketFrozen error code. We then also check that there is no preparedBatch on this bucket.

      However, it appears to me that if a bucket with the same OID already exists and doesn't fall into any of these cases, then we will fall through and end up with two buckets that have the same OID. This might be caught when we try adding the new bucket to our collection, since it would mean having two buckets with the same _id, but it seems to me that we can catch this case earlier here and throw a WriteConflict exception.
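
      To make the suspected fall-through concrete, here is a minimal, self-contained model of the check as I have described it, next to a stricter variant. This is not the server code: Code, State, Oid, Registry, initializeAsDescribed, initializeProposed and runExample are all hypothetical stand-ins, the preparedBatch check is left out, and the stricter variant returns a code where the real change would presumably throw a WriteConflict exception.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for illustration only; not the real server types.
enum class Code { OK, WriteConflict, TimeseriesBucketFrozen };
enum class State { Normal, DirectWrite, Frozen };

using Oid = std::string;                          // simplified OID
using Registry = std::unordered_map<Oid, State>;  // simplified state registry

// Models the behavior as described in this ticket (an assumption, not
// verified against the actual implementation).
Code initializeAsDescribed(Registry& registry, const Oid& oid) {
    auto it = registry.find(oid);
    if (it == registry.end()) {
        registry.emplace(oid, State::Normal);  // no collision: register it
        return Code::OK;
    }
    if (it->second == State::DirectWrite) {
        return Code::WriteConflict;
    }
    if (it->second == State::Frozen) {
        return Code::TimeseriesBucketFrozen;
    }
    // Suspected fall-through: an existing bucket in any other state still
    // yields OK, so the caller would go on to create a second bucket with
    // the same OID.
    return Code::OK;
}

// Stricter variant: any pre-existing entry for the OID is rejected.
Code initializeProposed(Registry& registry, const Oid& oid) {
    auto it = registry.find(oid);
    if (it == registry.end()) {
        registry.emplace(oid, State::Normal);
        return Code::OK;
    }
    if (it->second == State::Frozen) {
        return Code::TimeseriesBucketFrozen;
    }
    // Direct writes and every other existing state are treated as a conflict.
    return Code::WriteConflict;
}

// Exercises both variants against a registry that already holds the OID.
void runExample() {
    Registry registry;
    const Oid oid = "64f0c0ffee";  // arbitrary example value

    // First allocation succeeds in both variants.
    assert(initializeAsDescribed(registry, oid) == Code::OK);

    // Second allocation with the same OID: the described logic accepts it,
    // while the stricter variant reports a conflict.
    assert(initializeAsDescribed(registry, oid) == Code::OK);
    assert(initializeProposed(registry, oid) == Code::WriteConflict);
}
```

      If the real function behaves like initializeAsDescribed, the second call in runExample is the case this ticket is about; whether another layer already rules it out is the open question.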

      This ticket would investigate whether there is a reason why the current behavior is safe and, if there is not, make sure we handle the case where we are attempting to allocate a new bucket with a non-unique OID.

            dan.larkin-york@mongodb.com Dan Larkin-York
            damian.wasilewicz@mongodb.com Damian Wasilewicz