Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-2012

Better Encryption API Error Message

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: WT2.7.0
    • Labels:
      None
    • # Replies:
      26
    • Last comment by Customer:
      true

      Description

      From Robert Guo:

      Andreas and I wanted to get your opinion on a potentially confusing error message in the encryptor code that's likely going to be very common for users. It's this line here: https://github.com/mongodb/mongo/blob/master/src/third_party/wiredtiger/src/conn/conn_api.c#L391

      The error says "WiredTiger (22) [1437079556:927507][74761:0x7fff73142000], connection: unknown encryptor 'aes': Invalid argument" and is triggered when the starts mongod with encryption first, then restart it without.

      This might be confusing for users who most likely wouldn't be aware of encryption at all. I wanted to discuss with you on how we'd like to handle this situation. One solution might be to add a check to see if an encryptor is provided at all. If not, we can give a different error message.

      [~andreas.nilsson@10gen.com] says:

      Additionally we're having some problems with detecting if a database have been opened without encryption before. I think we need to abort starting with encryption enabled since it puts the data files in some undefined state.

      It seems like WT is happy just starting up with encryption callbacks even if it wasn't before right?

        Issue Links

          Activity

          Hide
          donald.anderson Donald Anderson added a comment -

          Hi Robert Guo,

          Thanks for the instructions. When I start with encryption (after an unencrypted run), I see the message:

          Unable to initialize encryption. The system has previously been started without encryption enabled.

          This is not from WT, so apparently mongod has detected the problem before calling WT. I put a breakpoint a breakpoint in __wt_bt_read, which should be called when any block is read in WT (this is where the WT message is generated), and that breakpoint is not hit. Some more time in gdb revealed that wiredtiger_open is being called, and the encryptors are being configured when the error occurs. I'm guessing the message is being generated when the customize function is called. That's the best I can tell at this point as the ./mondod executable does not have line number info. The customize function for the encryptor is called before the first block is read, which is why we don't see the new WT message. The message might still appear in those times when the customize function succeeds, and there are one or more unencrypted files that appear the db directory during an encrypted run. That might happen with a sysadmin error - perhaps a partial restore of a directory from a previous run.

          Some output from runs and gdb below:

          [dda@ip-10-9-176-48 bin]$ ./mongod -vvvv
          ...
          [dda@ip-10-9-176-48 bin]$ ./mongod -vvvv --enableEncryption --encryptionKeyFile someKey
          2015-08-11T18:03:40.888+0000 D - [main] tcmallocPoolSize: 1073741824
          2015-08-11T18:03:40.896+0000 D NETWORK [initandlisten] fd limit hard:64000 soft:64000 max conn: 51200
          2015-08-11T18:03:40.917+0000 I - [initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
          2015-08-11T18:03:40.917+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=8G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),extensions=[local=(entry=mongo_addWiredTigerEncryptors)],encryption=(name=aes,keyid=system),
          2015-08-11T18:03:40.936+0000 E STORAGE [initandlisten] Unable to initialize encryption. The system has previously been started without encryption enabled.
          2015-08-11T18:03:40.936+0000 I - [initandlisten] Fatal Assertion 28561
          2015-08-11T18:03:40.936+0000 I - [initandlisten]

          ***aborting after fassert() failure

          [dda@ip-10-9-176-48 bin]$ gdb --args ./mongod --enableEncryption --encryptionKeyFile someKey
          ....
          (no debugging symbols found)...done.
          (gdb) b __wt_bt_read
          Breakpoint 2 at 0x18771d4
          (gdb) r
          Starting program: /home/dda/mongodb-linux-x86_64-enterprise-rhel70-3.1.7-pre-/bin/./mongod --enableEncryption --encryptionKeyFile someKey
          [Thread debugging using libthread_db enabled]
          Using host libthread_db library "/lib64/libthread_db.so.1".
          [New Thread 0x7ffff003a700 (LWP 2295)]
          2015-08-11T18:09:20.203+0000 I - [initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
          2015-08-11T18:09:20.203+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=8G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),extensions=[local=(entry=mongo_addWiredTigerEncryptors)],encryption=(name=aes,keyid=system),
          2015-08-11T18:09:20.221+0000 E STORAGE [initandlisten] Unable to initialize encryption. The system has previously been started without encryption enabled.
          2015-08-11T18:09:20.222+0000 I - [initandlisten] Fatal Assertion 28561

          Program received signal SIGTRAP, Trace/breakpoint trap.
          0x00007ffff520fffb in raise () from /lib64/libpthread.so.0
          Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-12.el7.x86_64 cyrus-sasl-gssapi-2.1.26-17.el7.x86_64 cyrus-sasl-lib-2.1.26-17.el7.x86_64 cyrus-sasl-md5-2.1.26-17.el7.x86_64 cyrus-sasl-plain-2.1.26-17.el7.x86_64 elfutils-libelf-0.160-1.el7.x86_64 glibc-2.17-78.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.12.2-14.el7.x86_64 libacl-2.2.51-12.el7.x86_64 libattr-2.4.46-12.el7.x86_64 libcap-2.22-8.el7.x86_64 libcom_err-1.42.9-7.el7.x86_64 libdb-5.3.21-17.el7_0.1.x86_64 libgcc-4.8.3-9.el7.x86_64 libselinux-2.2.2-6.el7.x86_64 libstdc++-4.8.3-9.el7.x86_64 lm_sensors-libs-3.3.4-11.el7.x86_64 lua-5.1.4-14.el7.x86_64 net-snmp-agent-libs-5.7.2-20.el7.x86_64 net-snmp-libs-5.7.2-20.el7.x86_64 nspr-4.10.2-4.el7.x86_64 nss-3.15.4-6.el7.x86_64 nss-softokn-freebl-3.16.2.3-9.el7.x86_64 nss-util-3.15.4-2.el7.x86_64 openssl-libs-1.0.1e-42.el7_1.9.x86_64 pcre-8.32-14.el7.x86_64 perl-libs-5.16.3-283.el7.x86_64 popt-1.13-16.el7.x86_64 rpm-libs-4.11.1-25.el7.x86_64 tcp_wrappers-libs-7.6-77.el7.x86_64 xz-libs-5.1.2-8alpha.el7.x86_64 zlib-1.2.7-13.el7.x86_64
          (gdb) quit

          Show
          donald.anderson Donald Anderson added a comment - Hi Robert Guo , Thanks for the instructions. When I start with encryption (after an unencrypted run), I see the message: Unable to initialize encryption. The system has previously been started without encryption enabled. This is not from WT, so apparently mongod has detected the problem before calling WT. I put a breakpoint a breakpoint in __wt_bt_read, which should be called when any block is read in WT (this is where the WT message is generated), and that breakpoint is not hit. Some more time in gdb revealed that wiredtiger_open is being called, and the encryptors are being configured when the error occurs. I'm guessing the message is being generated when the customize function is called. That's the best I can tell at this point as the ./mondod executable does not have line number info. The customize function for the encryptor is called before the first block is read, which is why we don't see the new WT message. The message might still appear in those times when the customize function succeeds, and there are one or more unencrypted files that appear the db directory during an encrypted run. That might happen with a sysadmin error - perhaps a partial restore of a directory from a previous run. Some output from runs and gdb below: [dda@ip-10-9-176-48 bin] $ ./mongod -vvvv ... [dda@ip-10-9-176-48 bin] $ ./mongod -vvvv --enableEncryption --encryptionKeyFile someKey 2015-08-11T18:03:40.888+0000 D - [main] tcmallocPoolSize: 1073741824 2015-08-11T18:03:40.896+0000 D NETWORK [initandlisten] fd limit hard:64000 soft:64000 max conn: 51200 2015-08-11T18:03:40.917+0000 I - [initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'. 2015-08-11T18:03:40.917+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=8G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),extensions= [local=(entry=mongo_addWiredTigerEncryptors)] ,encryption=(name=aes,keyid=system), 2015-08-11T18:03:40.936+0000 E STORAGE [initandlisten] Unable to initialize encryption. The system has previously been started without encryption enabled. 2015-08-11T18:03:40.936+0000 I - [initandlisten] Fatal Assertion 28561 2015-08-11T18:03:40.936+0000 I - [initandlisten] ***aborting after fassert() failure [dda@ip-10-9-176-48 bin] $ gdb --args ./mongod --enableEncryption --encryptionKeyFile someKey .... (no debugging symbols found)...done. (gdb) b __wt_bt_read Breakpoint 2 at 0x18771d4 (gdb) r Starting program: /home/dda/mongodb-linux-x86_64-enterprise-rhel70-3.1.7-pre-/bin/./mongod --enableEncryption --encryptionKeyFile someKey [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". [New Thread 0x7ffff003a700 (LWP 2295)] 2015-08-11T18:09:20.203+0000 I - [initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'. 2015-08-11T18:09:20.203+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=8G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),extensions= [local=(entry=mongo_addWiredTigerEncryptors)] ,encryption=(name=aes,keyid=system), 2015-08-11T18:09:20.221+0000 E STORAGE [initandlisten] Unable to initialize encryption. The system has previously been started without encryption enabled. 2015-08-11T18:09:20.222+0000 I - [initandlisten] Fatal Assertion 28561 Program received signal SIGTRAP, Trace/breakpoint trap. 0x00007ffff520fffb in raise () from /lib64/libpthread.so.0 Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-12.el7.x86_64 cyrus-sasl-gssapi-2.1.26-17.el7.x86_64 cyrus-sasl-lib-2.1.26-17.el7.x86_64 cyrus-sasl-md5-2.1.26-17.el7.x86_64 cyrus-sasl-plain-2.1.26-17.el7.x86_64 elfutils-libelf-0.160-1.el7.x86_64 glibc-2.17-78.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.12.2-14.el7.x86_64 libacl-2.2.51-12.el7.x86_64 libattr-2.4.46-12.el7.x86_64 libcap-2.22-8.el7.x86_64 libcom_err-1.42.9-7.el7.x86_64 libdb-5.3.21-17.el7_0.1.x86_64 libgcc-4.8.3-9.el7.x86_64 libselinux-2.2.2-6.el7.x86_64 libstdc++-4.8.3-9.el7.x86_64 lm_sensors-libs-3.3.4-11.el7.x86_64 lua-5.1.4-14.el7.x86_64 net-snmp-agent-libs-5.7.2-20.el7.x86_64 net-snmp-libs-5.7.2-20.el7.x86_64 nspr-4.10.2-4.el7.x86_64 nss-3.15.4-6.el7.x86_64 nss-softokn-freebl-3.16.2.3-9.el7.x86_64 nss-util-3.15.4-2.el7.x86_64 openssl-libs-1.0.1e-42.el7_1.9.x86_64 pcre-8.32-14.el7.x86_64 perl-libs-5.16.3-283.el7.x86_64 popt-1.13-16.el7.x86_64 rpm-libs-4.11.1-25.el7.x86_64 tcp_wrappers-libs-7.6-77.el7.x86_64 xz-libs-5.1.2-8alpha.el7.x86_64 zlib-1.2.7-13.el7.x86_64 (gdb) quit
          Hide
          robert.guo Robert Guo added a comment -

          Hi Donald Anderson,

          Thank you for the analysis and the breakdown. I think there's a bit of a confusion here. The case that you mentioned of starting with encryption after an unencrypted run is fine. What is problematic is the other way around.

          In the description of this ticket I mentioned "starting with encryption first, then without", which I wanted to convey starting with the encryption options on unencrypted data files, in which case the error message is confusing. I identified the problem and proposed a fix in my previous comment, which is copied below. Could you see if it's possible to use this as a starting point for investigating the issue?

          A fairly straightforward solution on the WiredTiger side might be to check the FOREACH above the existing error message here: https://github.com/wiredtiger/wiredtiger/blob/master/src/conn/conn_api.c#L391
          to see if conn->encryptqh is empty, and print something like "detected encrypted data files but the database is not started with encryption" if it is.

          Show
          robert.guo Robert Guo added a comment - Hi Donald Anderson , Thank you for the analysis and the breakdown. I think there's a bit of a confusion here. The case that you mentioned of starting with encryption after an unencrypted run is fine. What is problematic is the other way around. In the description of this ticket I mentioned "starting with encryption first, then without", which I wanted to convey starting with the encryption options on unencrypted data files, in which case the error message is confusing. I identified the problem and proposed a fix in my previous comment, which is copied below. Could you see if it's possible to use this as a starting point for investigating the issue? A fairly straightforward solution on the WiredTiger side might be to check the FOREACH above the existing error message here: https://github.com/wiredtiger/wiredtiger/blob/master/src/conn/conn_api.c#L391 to see if conn->encryptqh is empty, and print something like "detected encrypted data files but the database is not started with encryption" if it is.
          Hide
          donald.anderson Donald Anderson added a comment -

          Robert Guo, Sorry, I missed that we were back on issue #1 ! For the suggestion of making a different error message: I'm trying to think of a message that works in this case, and would also make sense for all use cases within WT. An empty conn->encryptqh may come up when the WT database is started with encryption, and the caller has not set up extensions. In the current case, the situation is being complicated by the WiredTiger.basecfg (discussed in SERVER-19100) that is leftover from the previous run.

          I think there's a more direct solution to the problem: I see (from WiredTiger.basecfg) that the function mongo_addWiredTigerEncryptors should be called when mongod is starting up (even in the non-encrypted case). This happens before the list of encryptors is checked. How about in mongo_addWiredTigerEncryptors, if encryption is not enabled (that is, if no add_encryptor calls were made), then that function returns an error and displays any message that makes sense? That's truly detecting the case when a previous run left behind encryption configuration - including an encryption callback, which is being called now, when it's not appropriate.

          If for some reason that can't work, maybe a WT message something like: "unknown encryptor 'aes'; possible encryption mismatch with previous run". I think we have to keep the "unknown encryptor 'aes'" part – that's really all this function knows that went wrong.

          Show
          donald.anderson Donald Anderson added a comment - Robert Guo , Sorry, I missed that we were back on issue #1 ! For the suggestion of making a different error message: I'm trying to think of a message that works in this case, and would also make sense for all use cases within WT. An empty conn->encryptqh may come up when the WT database is started with encryption, and the caller has not set up extensions. In the current case, the situation is being complicated by the WiredTiger.basecfg (discussed in SERVER-19100 ) that is leftover from the previous run. I think there's a more direct solution to the problem: I see (from WiredTiger.basecfg ) that the function mongo_addWiredTigerEncryptors should be called when mongod is starting up (even in the non-encrypted case). This happens before the list of encryptors is checked. How about in mongo_addWiredTigerEncryptors , if encryption is not enabled (that is, if no add_encryptor calls were made), then that function returns an error and displays any message that makes sense? That's truly detecting the case when a previous run left behind encryption configuration - including an encryption callback, which is being called now, when it's not appropriate. If for some reason that can't work, maybe a WT message something like: "unknown encryptor 'aes'; possible encryption mismatch with previous run" . I think we have to keep the "unknown encryptor 'aes'" part – that's really all this function knows that went wrong.
          Hide
          andreas.nilsson Andreas Nilsson (Inactive) added a comment -

          We'll print an error message in the encryptor initialization callback if enableEncryption is not set. That should cover starting unencrypted with encrypted datafiles.

          Show
          andreas.nilsson Andreas Nilsson (Inactive) added a comment - We'll print an error message in the encryptor initialization callback if enableEncryption is not set. That should cover starting unencrypted with encrypted datafiles.
          Hide
          donald.anderson Donald Anderson added a comment -

          I think both issue #1 and #2 are resolved at this point.

          Show
          donald.anderson Donald Anderson added a comment - I think both issue #1 and #2 are resolved at this point.

            People

            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:
                Days since reply:
                1 year, 44 weeks, 6 days ago
                Date of 1st Reply: