  Core Server / SERVER-21275

Document not found due to WT commit visibility issue

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: 3.2.0-rc2
    • Fix Version/s: 3.0.8, 3.2.0-rc4
    • Component/s: Querying
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Completed:
    • Sprint:
      Repl C (11/20/15), QuInt D (12/14/15)

      Description

      Issue Status as of Dec 10, 2015

      ISSUE SUMMARY
      When using the WiredTiger storage engine, a race condition may prevent locally committed documents from being immediately visible to subsequent read operations. This bug may have an impact on both server and application operations. Unless exposed by a replication problem, it is not possible to determine if a system has been impacted by this bug without significant downtime.

      USER IMPACT
      Normally, after a write is committed by the storage engine, it is immediately visible to subsequent operations. A race condition in WiredTiger may prevent a write from becoming immediately visible to subsequent operations, which may result in various problems, primarily impacting replication:

      • User writes may not be immediately visible to subsequent read operations
      • Replica set members may diverge and contain different data
      • Replication thread(s) shut down the server with the error message “Fatal Assertion 16360” due to duplicate _id values (a unique index violation)
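
      Illustrative only: a minimal mongo shell sketch (not part of the original report; the database and collection names are hypothetical) of the write-then-read pattern that this race condition can, in rare cases, break on an affected WiredTiger node:

         // Hypothetical names; this only illustrates the pattern described above.
         var coll = db.getSiblingDB("visibilityTest").docs;
         coll.drop();

         // The write is acknowledged by the server...
         assert.writeOK(coll.insert({_id: 1, x: "committed write"}));

         // ...but with this bug present, an immediate subsequent read could,
         // very rarely, fail to see the acknowledged document.
         var doc = coll.findOne({_id: 1});
         assert.neq(null, doc, "acknowledged write not visible to the next read");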

      Deployments where a WiredTiger node is or was used as a source of data may be affected. This includes:

      • replica sets where the primary node is or was running WiredTiger
      • replica sets using chained replication where any node may sync from a WiredTiger node

      MMAPv1-only deployments are not affected by this issue. Mixed storage engine deployments are not affected when WiredTiger nodes never become primary, or when WiredTiger secondaries are not used as a source for chained replication.

      WORKAROUNDS
      There are no workarounds for this issue. All MongoDB 3.0 users running the WiredTiger storage engine should upgrade to MongoDB 3.0.8. A 3.0.8-rc0 release candidate containing the fix for this issue is available for download.

      Users experiencing the "Fatal Assertion 16360" error may restart the affected node to fix the issue, but this condition may recur so upgrading to 3.0.8 is strongly recommended.

      AFFECTED VERSIONS
      MongoDB 3.0.0 through 3.0.7 using the WiredTiger storage engine. MongoDB 3.2.0 is not affected by this issue.

      FIX VERSION
      The fix is included in the 3.0.8 production release.

      Original description

      A new test is being introduced into the FSM tests to check the dbHash of the DB (and collections) on all replica set nodes, during these phases of the workload (SERVER-21115):

      • Workload completed, before invoking teardown
      • Workload completed, after invoking teardown

      Before the dbHash is computed, cluster.awaitReplication() is invoked to ensure that all nodes in the replica set have caught up.

      During development of this test, it was noticed that infrequent failures occurred for the remove_and_bulk_insert workload when using the wiredTiger storage engine.
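
      For reference, a hedged sketch of the kind of per-node dbHash comparison described above (this is not the attached dbhash-remove_and_bulk_insert.js; the database name is a placeholder), runnable with mongo --nodb:

         // Sketch only; assumes a mongo shell with ReplSetTest available (mongo --nodb).
         var rst = new ReplSetTest({nodes: 2});
         rst.startSet();
         rst.initiate();

         var dbName = "test";  // placeholder database name
         assert.writeOK(rst.getPrimary().getDB(dbName).coll.insert({_id: 1}));

         // Let all nodes catch up before comparing, as the FSM hook does via
         // cluster.awaitReplication().
         rst.awaitReplication();

         // Each member computes its own dbHash; the md5 values should agree.
         var hashes = rst.nodes.map(function(conn) {
             conn.setSlaveOk();  // allow reads from secondaries
             return conn.getDB(dbName).runCommand({dbHash: 1}).md5;
         });
         hashes.forEach(function(h) {
             assert.eq(hashes[0], h, "replica set members disagree on dbHash");
         });

         rst.stopSet();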

      Attachments

      1. data_db_dbhash_0212513.tar.gz
        71.11 MB
        Benety Goh
      2. data_db_dbhash_20151118161433.tar.gz
        49.40 MB
        Benety Goh
      3. dbhash-remove_and_bulk_insert.js
        5 kB
        Jonathan Abrahams
      4. dbhash-remove_and_bulk_insert.js
        4 kB
        Jonathan Abrahams
      5. dbhash-remove_and_bulk_insert.js
        4 kB
        Jonathan Abrahams
      6. dbhash-remove_and_bulk_insert.js
        4 kB
        Jonathan Abrahams
      7. dbhash-remove_and_bulk_insert-wiredTiger-20151118101103.log.gz
        12 kB
        Benety Goh
      8. dbhash-remove_and_bulk_insert-wiredTiger-20151118161433.log.gz
        466 kB
        Benety Goh
      9. rbi-96.log
        55 kB
        Jonathan Abrahams
      10. run_dbhash.sh
        3 kB
        Jonathan Abrahams
      11. run_dbhash.sh
        3 kB
        Jonathan Abrahams
      12. run_dbhash.sh
        1 kB
        Jonathan Abrahams

        Issue Links

          Activity

           Jonathan Abrahams added a comment (edited)

           Attached is a script which can reproduce the issue. Caution: the failure happens infrequently, so the script has to be run multiple times. Also attached is the log from the failed run.

          It was run in this manner:

           #!/bin/bash
           # cd to root of mongo repo
           pushd ~/mongo
           f=/tmp/dbhash-remove_and_bulk_insert.log
           for i in $(seq 1 100)
           do
              # Run the attached repro script, teeing its output to a log file.
              mongo --nodb ~/dbhash-remove_and_bulk_insert.js | tee "$f"
              # Check the exit status of mongo, not tee.
              status=${PIPESTATUS[0]}
              if [ "$status" -ne 0 ]; then
                 echo "Failure detected!"
                 break
              else
                 # Successful iteration: discard its log and try again.
                 rm "$f"
              fi
           done
           popd
          

           Benety Goh added a comment

           Attached: /data/db contents from a failed run_dbhash.sh run under Linux (git hash 0212513).

           Benety Goh added a comment

           Logs associated with data_db_dbhash_0212513.tar.gz.

           Dan Pasette added a comment

          Resolved as duplicate of WT-2237

           ITWEBTF SAXOBANK added a comment

           Today I saw a very similar issue on mmapv1 on MongoDB 3.2, a local non-clustered installation. I have tests that all write to the database, then assert on the result. Just once, I saw a read return the value from before the last write operation.

          I cannot repro, and I cannot prove that this is not a problem in our code, but the code and the tests have run unchanged for more than a year on a 2.4.9 database.

          I suppose that whatever tests you run on WiredTiger will be run on mmapv1 as well, and then you will eventually see this issue.

           Dan Pasette added a comment

          Hi ITWEBTF SAXOBANK, we do run the same tests on MMAPv1 that we do on WT. This particular issue is limited to the WiredTiger storage engine. I would be very interested in understanding your test case, but it would be a separate issue. If you can describe your cluster details and workload setup, please do so. It would be best to get the details in a new SERVER issue though.

           ITWEBTF SAXOBANK added a comment

          Hi Dan,

           The tests we run will:

          • Ensure (Create) indexes on all collections once.
          • Run back-end code from the TradingFloor.com web site + two back-end consoles that do long-running tasks and process queue items.
          • Each test will delete all documents in all collections, then run code that will write/read to the database, then read and assert on the outcome.
          • Drop empty collections after running all tests.

           There are around 500 tests that run on local developer machines (Intel i7 CPU, 16 GB RAM, and at least a 250 GB SSD) against a non-clustered installation.

           The particular test I saw failing:

           • Inserted a document (around 800 lines of JSON) with one sub-document in an array field.
          • Added another sub-document to the array field.

          The failure I saw was that the added sub-document was not there.
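
           For illustration only, a minimal mongo shell sketch of that pattern (collection and field names are hypothetical placeholders, not the actual application code):

              // Hypothetical names; mirrors the insert-then-$push-then-read pattern above.
              var coll = db.getSiblingDB("appTest").items;
              coll.drop();

              assert.writeOK(coll.insert({_id: 1, entries: [{seq: 1}]}));           // one sub-document
              assert.writeOK(coll.update({_id: 1}, {$push: {entries: {seq: 2}}}));  // add a second

              // The reported failure: the freshly pushed sub-document was missing here.
              var doc = coll.findOne({_id: 1});
              assert.eq(2, doc.entries.length, "second sub-document not visible after update");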

          I hope you can use this info. Feel free to copy it to another issue if that provides better tracking. If you do so, I can add the actual documents.

          Best regards,
          Brian


            People

            • Votes: 0
            • Watchers: 43
