Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-2037

Encryption API with bad key on restart tries to encrypt

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: WT2.7.0
    • Labels:
      None
    • # Replies:
      9
    • Last comment by Customer:
      true
    • Epic Link:

      Description

      Andreas has a test where if he restarts with an invalid/different system key it fails to decrypt and the callback decrypt returns EINVAL. However, WT also makes an encrypt call before that error is returned and something is written, encrypted to disk, with the wrong key.

      I have asked Andreas to get a stack trace from the encrypt call to see where it is coming from and why it can happen before the EINVAL error gets back out.

        Issue Links

          Activity

          Hide
          sue.loverso Sue LoVerso added a comment -

          [~andreas.nilsson@10gen.com] please post a stack. Donald Anderson - for your information, if you have any ideas.

          Show
          sue.loverso Sue LoVerso added a comment - [~andreas.nilsson@10gen.com] please post a stack. Donald Anderson - for your information, if you have any ideas.
          Hide
          andreas.nilsson Andreas Nilsson (Inactive) added a comment -

          Stack trace for the failing decrypt call:

           mongod(_ZN5mongo12_GLOBAL__N_17decryptEP14__wt_encryptorP12__wt_sessionPhmS5_mPm+0x3E9) [0x104d58f69]
           mongod(__wt_decrypt+0xC3) [0x1059dbbe3]
           mongod(__wt_log_scan+0x88D) [0x10599907d]
           mongod(__wt_txn_recover+0x197) [0x1059e6cf7]
           mongod(__wt_connection_workers+0x3C) [0x1059766dc]
           mongod(wiredtiger_open+0x1885) [0x10596ec65]
           mongod(_ZN5mongo20EncryptionKeyManager18_initLocalKeyStoreEv+0x227) [0x104d49567]
          

          Stack trace for the "succeeding" encrypt call:

          mongod(_ZN5mongo12_GLOBAL__N_17encryptEP14__wt_encryptorP12__wt_sessionPhmS5_mPm+0x40F) [0x1048709ff]
          mongod(__wt_encrypt+0x4A) [0x1054f3c7a]
          mongod(__wt_log_write+0xE4) [0x1054b17f4]
          mongod(__wt_txn_checkpoint_log+0x596) [0x1054fd1d6]
          mongod(__wt_connection_close+0x208) [0x10548e208]
          mongod(wiredtiger_open+0x4FE) [0x1054858ee]
          mongod(_ZN5mongo20EncryptionKeyManager18_initLocalKeyStoreEv+0x227) [0x104861587]
          

          Show
          andreas.nilsson Andreas Nilsson (Inactive) added a comment - Stack trace for the failing decrypt call: mongod(_ZN5mongo12_GLOBAL__N_17decryptEP14__wt_encryptorP12__wt_sessionPhmS5_mPm+0x3E9) [0x104d58f69] mongod(__wt_decrypt+0xC3) [0x1059dbbe3] mongod(__wt_log_scan+0x88D) [0x10599907d] mongod(__wt_txn_recover+0x197) [0x1059e6cf7] mongod(__wt_connection_workers+0x3C) [0x1059766dc] mongod(wiredtiger_open+0x1885) [0x10596ec65] mongod(_ZN5mongo20EncryptionKeyManager18_initLocalKeyStoreEv+0x227) [0x104d49567] Stack trace for the "succeeding" encrypt call: mongod(_ZN5mongo12_GLOBAL__N_17encryptEP14__wt_encryptorP12__wt_sessionPhmS5_mPm+0x40F) [0x1048709ff] mongod(__wt_encrypt+0x4A) [0x1054f3c7a] mongod(__wt_log_write+0xE4) [0x1054b17f4] mongod(__wt_txn_checkpoint_log+0x596) [0x1054fd1d6] mongod(__wt_connection_close+0x208) [0x10548e208] mongod(wiredtiger_open+0x4FE) [0x1054858ee] mongod(_ZN5mongo20EncryptionKeyManager18_initLocalKeyStoreEv+0x227) [0x104861587]
          Hide
          andreas.nilsson Andreas Nilsson (Inactive) added a comment - - edited

          Did some more investigations, the issue only occurs under the following conditions:

          • include transaction_sync the config

            transaction_sync=(enabled=true,method=fsync)
            

          • close the initial session right after calling session->create without checkpointing

            session->create(...)
            session->close(..)
            

          If I remove the transaction_sync from the configuration OR checkpoint before closing the first session, encrypt is never called. Moreover the initial decrypt size is 32 in the "correct" case and 144 in the failing one so there is some substantial behavioral change when checkpointing vs transaction syncing.

          Show
          andreas.nilsson Andreas Nilsson (Inactive) added a comment - - edited Did some more investigations, the issue only occurs under the following conditions: include transaction_sync the config transaction_sync=(enabled=true,method=fsync) close the initial session right after calling session->create without checkpointing session->create(...) session->close(..) If I remove the transaction_sync from the configuration OR checkpoint before closing the first session, encrypt is never called. Moreover the initial decrypt size is 32 in the "correct" case and 144 in the failing one so there is some substantial behavioral change when checkpointing vs transaction syncing.
          Hide
          sue.loverso Sue LoVerso added a comment -

          I see how the error path can cause that to happen. I'll create a branch with a fix for you to try.

          Show
          sue.loverso Sue LoVerso added a comment - I see how the error path can cause that to happen. I'll create a branch with a fix for you to try.
          Hide
          sue.loverso Sue LoVerso added a comment -

          [~andreas.nilsson@10gen.com] can you try the recovery-error-path branch (it is based of our current develop branch) and see if it helps. If recovery gets an error, it should avoid trying to write that checkpoint record.

          Show
          sue.loverso Sue LoVerso added a comment - [~andreas.nilsson@10gen.com] can you try the recovery-error-path branch (it is based of our current develop branch) and see if it helps. If recovery gets an error, it should avoid trying to write that checkpoint record.
          Hide
          andreas.nilsson Andreas Nilsson (Inactive) added a comment -

          Verified that this branch fixed the problem and can be merged into the main tree. Thx for the quick turnaround!

          Show
          andreas.nilsson Andreas Nilsson (Inactive) added a comment - Verified that this branch fixed the problem and can be merged into the main tree. Thx for the quick turnaround!
          Hide
          xgen-internal-githook Githook User added a comment -

          Author:

          {u'username': u'sueloverso', u'name': u'Susan LoVerso', u'email': u'sue@wiredtiger.com'}

          Message: WT-2037 Only write a checkpoint to the log on close if it wasn't
          recovery with an error.
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/99e207dab37eef5884b5801310b813c6bab1fd96

          Show
          xgen-internal-githook Githook User added a comment - Author: {u'username': u'sueloverso', u'name': u'Susan LoVerso', u'email': u'sue@wiredtiger.com'} Message: WT-2037 Only write a checkpoint to the log on close if it wasn't recovery with an error. Branch: develop https://github.com/wiredtiger/wiredtiger/commit/99e207dab37eef5884b5801310b813c6bab1fd96
          Hide
          xgen-internal-githook Githook User added a comment -

          Author:

          {u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'}

          Message: Merge pull request #2114 from wiredtiger/recovery-error-path

          WT-2037 Only write a checkpoint to the log on close if it wasn't
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/c948fbb41852f5d2f82ad6f814b283d506b8a7c0

          Show
          xgen-internal-githook Githook User added a comment - Author: {u'username': u'michaelcahill', u'name': u'Michael Cahill', u'email': u'michael.cahill@mongodb.com'} Message: Merge pull request #2114 from wiredtiger/recovery-error-path WT-2037 Only write a checkpoint to the log on close if it wasn't Branch: develop https://github.com/wiredtiger/wiredtiger/commit/c948fbb41852f5d2f82ad6f814b283d506b8a7c0
          Hide
          andreas.nilsson Andreas Nilsson (Inactive) added a comment -

          I'm assuming I can watch SERVER-19532 to see when this change is dropped into the mongo tree?

          Show
          andreas.nilsson Andreas Nilsson (Inactive) added a comment - I'm assuming I can watch SERVER-19532 to see when this change is dropped into the mongo tree?

            People

            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:
                Days since reply:
                1 year, 41 weeks, 6 days ago
                Date of 1st Reply: