  1. Core Server
  2. SERVER-85442

Collection not found in CollectionCatalog during oplog entry application on a server replica in secondary mode

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • None
    • Affects Version/s: 7.3.0-rc0
    • Component/s: None

      Context: 

      1. A patch fails consistently on Evergreen (the failure is in neither a released version of the server nor the main branch).
      2. The server is tested in multi-tenant mode.

      The patch content:

      1. The server is changed to acquire a stronger tenant-level lock when the change stream pre-images collection is created or dropped.
      2. The test verifies that the lock is actually taken and that an insert operation on the change stream pre-images collection blocks.
      3. There are no changes elsewhere except some instrumentation to help the investigation.
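
The blocking behavior that the test checks can be sketched in isolation. This is only a minimal model, assuming the "stronger" lock behaves like an exclusive lock and the insert needs the same lock in a shared mode; `TenantLock`, `BlockingCheckResult` and `checkInsertBlocksDuringCreateOrDrop` are hypothetical names and do not correspond to the server's lock manager or to the actual test.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <shared_mutex>
#include <thread>

// Hypothetical stand-in for a tenant-level lock. This is NOT the server's
// lock manager: exclusive mode models create/drop, shared mode models insert.
struct TenantLock {
    std::shared_mutex mutex;
};

struct BlockingCheckResult {
    bool insertFinishedWhileDdlHeldLock;
    bool insertFinishedAfterDdlReleasedLock;
};

BlockingCheckResult checkInsertBlocksDuringCreateOrDrop() {
    TenantLock lock;
    std::atomic<bool> insertDone{false};

    // The "create/drop" side takes the lock exclusively.
    std::unique_lock<std::shared_mutex> ddl(lock.mutex);

    // The "insert" side needs the lock in shared mode, so it has to wait
    // until the exclusive holder releases it.
    std::thread inserter([&] {
        std::shared_lock<std::shared_mutex> dml(lock.mutex);
        insertDone.store(true);
    });

    // Give the inserter time to run; it cannot finish while ddl is held.
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    const bool finishedEarly = insertDone.load();

    // Release the exclusive lock and let the insert complete.
    ddl.unlock();
    inserter.join();
    return {finishedEarly, insertDone.load()};
}
```

In this model the insert can only complete after the exclusive holder releases the lock, which is the property the test in the patch is described as verifying.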

      Defect symptoms:

      1. The test jstests/serverless/change_stream_pre_images_collection_concurrency.js in the patch triggers a failure on the secondary node (https://parsley.mongodb.com/resmoke/bb0cdbc06d7a6dd9c7e43d3585b6338f/test/17ab7bfbfdf347d24d5f30a1a41ed5fa?bookmarks=0%2C321%2C12276%2C15034&filters=100d25791&selectedLineRange=L12253-L12276&shareLine=12253) at the point where an update to a test collection is applied, which entails writing to the change stream pre-images collection.
      2. The defect is not reliably reproducible on a workstation: I reproduced it on one instance of my workstation but not on a new one, which is why I am passing the investigation to the owners of CollectionCatalog. On Evergreen, however, it seems to fail consistently, which suggests a race condition.
      3. The CollectionCatalog state seems to change between https://github.com/10gen/mongo/blob/e627a7d75870a18ed4dea1f6b7d874597d45d5ed/src/mongo/db/change_stream_pre_images_collection_manager.cpp#L229-L235 and https://github.com/10gen/mongo/blob/e627a7d75870a18ed4dea1f6b7d874597d45d5ed/src/mongo/db/change_stream_pre_images_collection_manager.cpp#L242-L244: the change stream pre-images collection does not exist at https://github.com/10gen/mongo/blob/e627a7d75870a18ed4dea1f6b7d874597d45d5ed/src/mongo/db/change_stream_pre_images_collection_manager.cpp#L229-L235, but then appears at https://github.com/10gen/mongo/blob/e627a7d75870a18ed4dea1f6b7d874597d45d5ed/src/mongo/db/change_stream_pre_images_collection_manager.cpp#L237-L239. However, the change stream pre-images collection should be present in the CollectionCatalog at all times, since it has been replicated previously. This points to two problems: the creation of the change stream pre-images collection is not visible, and the CollectionCatalog state seems to change when it should not. Note that change_stream_serverless_helpers::isChangeStreamEnabled() queries the CollectionCatalog to check whether the change stream pre-images collection exists.
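
To make the third symptom concrete, here is a minimal sketch of how two consecutive existence checks can disagree when each one reads the latest catalog state, and how they agree when both use one captured snapshot. It does not claim to be the cause of this failure. `ToyCatalog`, `Observation`, `kPreImagesNs` and both functions are hypothetical and are not the server's CollectionCatalog API; the concurrent writer is simulated by a direct call between the two checks.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <set>
#include <string>
#include <utility>

// Hypothetical, simplified catalog: every write publishes a new immutable
// snapshot. It only illustrates the pattern; it is not CollectionCatalog.
class ToyCatalog {
public:
    using Snapshot = std::shared_ptr<const std::set<std::string>>;

    // Returns the most recently published snapshot.
    Snapshot latest() const {
        std::lock_guard<std::mutex> guard(_mutex);
        return _current;
    }

    // Copies the current snapshot, adds the namespace, and publishes the copy.
    void createCollection(const std::string& ns) {
        std::lock_guard<std::mutex> guard(_mutex);
        auto next = std::make_shared<std::set<std::string>>(*_current);
        next->insert(ns);
        _current = std::move(next);
    }

private:
    mutable std::mutex _mutex;
    Snapshot _current = std::make_shared<std::set<std::string>>();
};

// Illustrative namespace name, used here only as a placeholder.
const std::string kPreImagesNs = "config.system.preimages";

struct Observation {
    bool firstCheck;
    bool secondCheck;
};

// Each check reads the latest state, so a write in between is observed.
Observation checkTwiceAgainstLatest(ToyCatalog& catalog) {
    const bool first = catalog.latest()->count(kPreImagesNs) > 0;
    catalog.createCollection(kPreImagesNs);  // stands in for a concurrent writer
    const bool second = catalog.latest()->count(kPreImagesNs) > 0;
    return {first, second};
}

// Both checks use one captured snapshot, so they always agree.
Observation checkTwiceAgainstOneSnapshot(ToyCatalog& catalog) {
    const ToyCatalog::Snapshot snapshot = catalog.latest();
    const bool first = snapshot->count(kPreImagesNs) > 0;
    catalog.createCollection(kPreImagesNs);  // stands in for a concurrent writer
    const bool second = snapshot->count(kPreImagesNs) > 0;
    return {first, second};
}
```

In this toy model the first function reports "absent, then present", which matches the shape of the symptom, while the second reports "absent" both times because it never looks past the snapshot it captured.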

      Hypotheses rejected:

      1. Application of oplog entries seems to be correct: the creation of the change stream pre-images collection happens in its own batch before the update operation is applied.

            Assignee:
            jordi.olivares-provencio@mongodb.com Jordi Olivares Provencio
            Reporter:
            mindaugas.malinauskas@mongodb.com Mindaugas Malinauskas
            Votes:
            0
            Watchers:
            6
