Uploaded image for project: 'MongoDB ETL Tools'
  1. MongoDB ETL Tools
  2. TOOLS-1676

Mongodump fails with OpenSSL 0.9.8

    XMLWordPrintable

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.4.6, 3.6.0-rc0
    • Component/s: mongodump
    • Labels:
      None

      Description

      With mongodump using OpenSSL 0.9.8 and --numParallelCollections >1 The server occasionally rejects mongodump's connections with:

      2017-06-01T13:42:45.105-0700 E NETWORK  [conn8] SSL: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01
      

      This causes mongodump to fail with "Closed explicitly" error message. I think this issue is the concurrent use of a single SSL context struct in the SSLDBConnector.

      Upgrading to OpenSSL 1.0.1, or setting --numParallelCollections=1, or placing a mutex around the call to openssl.Dial (which uses a shared SSL context) all seem to prevent the SSL errors.

      diff --git a/vendor/github.com/mongodb/mongo-tools/common/db/openssl/openssl.go b/vendor/github.com/mongodb/mongo-tools/common/db/openssl/openssl.go
      index 3168c77..d9fe06b 100644
      --- a/vendor/github.com/mongodb/mongo-tools/common/db/openssl/openssl.go
      +++ b/vendor/github.com/mongodb/mongo-tools/common/db/openssl/openssl.go
      @@ -5,6 +5,7 @@ import (
              "fmt"
              "net"
              "time"
      +       "sync"
       
              "github.com/mongodb/mongo-tools/common/db/kerberos"
              "github.com/mongodb/mongo-tools/common/log"
      @@ -39,11 +40,14 @@ func (self *SSLDBConnector) Configure(opts options.ToolOptions) error {
                      flags = openssl.InsecureSkipHostVerification
              }
              // create the dialer func that will be used to connect
      +       dialerMutex := sync.Mutex{}
              dialer := func(addr *mgo.ServerAddr) (net.Conn, error) {
      +               dialerMutex.Lock()
      +               defer dialerMutex.Unlock()
                      conn, err := openssl.Dial("tcp", addr.String(), self.ctx, flags)
                      if err != nil {
                              // mgo discards
      

      Here's an example of mongodump failing with 0.9.8:

      $ otool -L ~/Downloads/mongodb-osx-x86_64-3.4.2/bin/mongodump
      /Users/shane/Downloads/mongodb-osx-x86_64-3.4.2/bin/mongodump:
      	/usr/lib/libssl.0.9.8.dylib (compatibility version 0.9.8, current version 0.9.8)
      	/usr/lib/libcrypto.0.9.8.dylib (compatibility version 0.9.8, current version 0.9.8)
      	/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 1151.16.0)
      	/System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 1151.16.0)
      	/System/Library/Frameworks/Security.framework/Versions/A/Security (compatibility version 1.0.0, current version 57031.1.35)
      	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1213.0.0)
      $ ~/Downloads/mongodb-osx-x86_64-3.4.2/bin/mongodump --version
      mongodump version: r3.4.2
      git version: 3f76e40c105fc223b3e5aac3e20dcd026b83b38b
      Go version: go1.7
         os: darwin
         arch: amd64
         compiler: gc
      OpenSSL version: OpenSSL 0.9.8zc 19 Mar 2015
      $ ~/Downloads/mongodb-osx-x86_64-3.4.2/bin/mongodump --sslPEMKeyFile=/Users/shane/git/mongo-python-driver/test/certificates/client.pem --ssl --sslCAFile=/Users/shane/git/mongo-python-driver/test/certificates/ca.pem --numParallelCollections=8
      2017-06-01T13:42:45.057-0700	writing admin.system.version to
      2017-06-01T13:42:45.058-0700	done dumping admin.system.version (1 document)
      2017-06-01T13:42:45.058-0700	writing test.foo to
      2017-06-01T13:42:45.059-0700	writing test.foo3 to
      2017-06-01T13:42:45.059-0700	writing test.foo2 to
      2017-06-01T13:42:45.059-0700	writing test.foo4 to
      2017-06-01T13:42:45.059-0700	writing test.testing to
      2017-06-01T13:42:45.059-0700	writing test.bits to
      2017-06-01T13:42:45.059-0700	writing command-monitoring-tests.test to
      2017-06-01T13:42:45.059-0700	writing test.sess to
      2017-06-01T13:42:45.072-0700	done dumping test.sess (2 documents)
      2017-06-01T13:42:45.073-0700	writing test.test to
      2017-06-01T13:42:45.074-0700	done dumping test.test (2 documents)
      2017-06-01T13:42:45.074-0700	writing test.test.sdfsystem.indexes to
      2017-06-01T13:42:45.075-0700	done dumping test.test.sdfsystem.indexes (1 document)
      2017-06-01T13:42:45.075-0700	writing test.dotField to
      2017-06-01T13:42:45.076-0700	done dumping test.dotField (1 document)
      2017-06-01T13:42:45.076-0700	writing test.test.systm.indexes to
      2017-06-01T13:42:45.076-0700	done dumping test.test.systm.indexes (1 document)
      2017-06-01T13:42:45.076-0700	writing admin.test to
      2017-06-01T13:42:45.077-0700	done dumping admin.test (1 document)
      2017-06-01T13:42:45.077-0700	writing test.ambiguous to
      2017-06-01T13:42:45.078-0700	done dumping test.ambiguous (1 document)
      2017-06-01T13:42:45.078-0700	writing foo.test to
      2017-06-01T13:42:45.079-0700	done dumping foo.test (1 document)
      2017-06-01T13:42:45.079-0700	writing admin.test34 to
      2017-06-01T13:42:45.079-0700	done dumping admin.test34 (1 document)
      2017-06-01T13:42:45.079-0700	writing test.sessio123 to
      2017-06-01T13:42:45.080-0700	done dumping test.sessio123 (0 documents)
      2017-06-01T13:42:45.080-0700	writing test.test1234 to
      2017-06-01T13:42:45.081-0700	done dumping test.test1234 (0 documents)
      2017-06-01T13:42:45.081-0700	writing test.DisableIdIndex to
      2017-06-01T13:42:45.082-0700	done dumping test.testing (200 documents)
      2017-06-01T13:42:45.082-0700	writing test.applyOpsCreated to
      2017-06-01T13:42:45.082-0700	done dumping test.DisableIdIndex (0 documents)
      2017-06-01T13:42:45.082-0700	writing test.sessions to
      2017-06-01T13:42:45.083-0700	done dumping test.applyOpsCreated (0 documents)
      2017-06-01T13:42:45.083-0700	writing test.newColl to
      2017-06-01T13:42:45.083-0700	done dumping test.sessions (0 documents)
      2017-06-01T13:42:45.083-0700	writing test.noidindex to
      2017-06-01T13:42:45.084-0700	done dumping test.newColl (0 documents)
      2017-06-01T13:42:45.084-0700	writing test_lazy_connect_w0.test to
      2017-06-01T13:42:45.085-0700	done dumping test.noidindex (0 documents)
      2017-06-01T13:42:45.085-0700	done dumping test_lazy_connect_w0.test (0 documents)
      2017-06-01T13:42:45.100-0700	done dumping command-monitoring-tests.test (4 documents)
      2017-06-01T13:42:45.101-0700	done dumping test.bits (5 documents)
      2017-06-01T13:42:45.105-0700	Failed: error writing data for collection `test.foo2` to disk: error reading collection: Closed explicitly
      

      And the server logs:

      $ ~/Downloads/mongodb-osx-x86_64-3.4.2/bin/mongod --dbpath ~/data2 --port 27017 --replSet source --setParameter "writePeriodicNoops=false" --sslOnNormalPorts --sslPEMKeyFile=/Users/shane/git/mongo-python-driver/test/certificates/server.pem --sslCAFile=/Users/shane/git/mongo-python-driver/test/certificates/ca.pem
      2017-06-01T13:42:31.134-0700 I CONTROL  [initandlisten] MongoDB starting : pid=66854 port=27017 dbpath=/Users/shane/data2 64-bit host=Shanes-MacBook-Pro-2.local
      2017-06-01T13:42:31.134-0700 I CONTROL  [initandlisten] db version v3.4.2
      2017-06-01T13:42:31.134-0700 I CONTROL  [initandlisten] git version: 3f76e40c105fc223b3e5aac3e20dcd026b83b38b
      2017-06-01T13:42:31.134-0700 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 0.9.8zh 14 Jan 2016
      2017-06-01T13:42:31.134-0700 I CONTROL  [initandlisten] allocator: system
      2017-06-01T13:42:31.134-0700 I CONTROL  [initandlisten] modules: none
      2017-06-01T13:42:31.134-0700 I CONTROL  [initandlisten] build environment:
      2017-06-01T13:42:31.134-0700 I CONTROL  [initandlisten]     distarch: x86_64
      2017-06-01T13:42:31.134-0700 I CONTROL  [initandlisten]     target_arch: x86_64
      2017-06-01T13:42:31.134-0700 I CONTROL  [initandlisten] options: { net: { port: 27017, ssl: { CAFile: "/Users/shane/git/mongo-python-driver/test/certificates/ca.pem", PEMKeyFile: "/Users/shane/git/mongo-python-driver/test/certificates/server.pem", mode: "requireSSL" } }, replication: { replSet: "source" }, setParameter: { writePeriodicNoops: "false" }, storage: { dbPath: "/Users/shane/data2" } }
      2017-06-01T13:42:31.135-0700 I -        [initandlisten] Detected data files in /Users/shane/data2 created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
      2017-06-01T13:42:31.135-0700 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=7680M,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
      2017-06-01T13:42:31.747-0700 I STORAGE  [initandlisten] Starting WiredTigerRecordStoreThread local.oplog.rs
      2017-06-01T13:42:31.748-0700 I STORAGE  [initandlisten] The size storer reports that the oplog contains 1194142 records totaling to 205544829 bytes
      2017-06-01T13:42:31.750-0700 I STORAGE  [initandlisten] Sampling from the oplog between May 10 12:45:44:1b2dd and Jun  1 13:33:01:1 to determine where to place markers for truncation
      2017-06-01T13:42:31.751-0700 I STORAGE  [initandlisten] Taking 112 samples and assuming that each section of oplog contains approximately 106331 records totaling to 18302502 bytes
      2017-06-01T13:42:31.873-0700 I STORAGE  [initandlisten] Placing a marker at optime May 10 12:45:45:125a8
      2017-06-01T13:42:31.873-0700 I STORAGE  [initandlisten] Placing a marker at optime May 10 12:45:46:10218
      2017-06-01T13:42:31.873-0700 I STORAGE  [initandlisten] Placing a marker at optime May 10 13:00:31:14db0
      2017-06-01T13:42:31.873-0700 I STORAGE  [initandlisten] Placing a marker at optime May 10 13:00:32:1a6d3
      2017-06-01T13:42:31.873-0700 I STORAGE  [initandlisten] Placing a marker at optime May 10 13:00:33:12d69
      2017-06-01T13:42:31.873-0700 I STORAGE  [initandlisten] Placing a marker at optime May 10 13:00:34:f3a4
      2017-06-01T13:42:31.873-0700 I STORAGE  [initandlisten] Placing a marker at optime May 10 13:00:35:e1b5
      2017-06-01T13:42:31.873-0700 I STORAGE  [initandlisten] Placing a marker at optime May 10 13:00:36:f61c
      2017-06-01T13:42:31.873-0700 I STORAGE  [initandlisten] Placing a marker at optime May 10 13:00:37:1062b
      2017-06-01T13:42:31.873-0700 I STORAGE  [initandlisten] Placing a marker at optime May 10 13:00:38:d3c3
      2017-06-01T13:42:31.873-0700 I STORAGE  [initandlisten] Placing a marker at optime May 10 13:00:39:ef5b
      2017-06-01T13:42:31.933-0700 I CONTROL  [initandlisten]
      2017-06-01T13:42:31.933-0700 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
      2017-06-01T13:42:31.933-0700 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
      2017-06-01T13:42:31.933-0700 I CONTROL  [initandlisten]
      2017-06-01T13:42:31.946-0700 I STORAGE  [initandlisten] WARNING: the collection 'test.DisableIdIndex' lacks a unique index on _id. This index is needed for replication to function properly
      2017-06-01T13:42:31.946-0700 I STORAGE  [initandlisten] 	 To fix this, you need to create a unique index on _id. See http://dochub.mongodb.org/core/build-replica-set-indexes
      2017-06-01T13:42:31.947-0700 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/Users/shane/data2/diagnostic.data'
      2017-06-01T13:42:31.950-0700 I NETWORK  [thread1] waiting for connections on port 27017 ssl
      2017-06-01T13:42:31.950-0700 I REPL     [replExecDBWorker-0] New replica set config in use: { _id: "source", version: 1, protocolVersion: 1, members: [ { _id: 0, host: "localhost:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, catchUpTimeoutMillis: 2000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('58e3d69214ae9e80da19e183') } }
      2017-06-01T13:42:31.950-0700 I REPL     [replExecDBWorker-0] This node is localhost:27017 in the config
      2017-06-01T13:42:31.950-0700 I REPL     [replExecDBWorker-0] transition to STARTUP2
      2017-06-01T13:42:31.950-0700 I REPL     [replExecDBWorker-0] Starting replication storage threads
      2017-06-01T13:42:31.951-0700 I REPL     [replExecDBWorker-0] Starting replication fetcher thread
      2017-06-01T13:42:31.952-0700 I REPL     [replExecDBWorker-0] Starting replication applier thread
      2017-06-01T13:42:31.952-0700 I REPL     [replExecDBWorker-0] Starting replication reporter thread
      2017-06-01T13:42:31.952-0700 I REPL     [rsSync] transition to RECOVERING
      2017-06-01T13:42:31.953-0700 I REPL     [rsSync] transition to SECONDARY
      2017-06-01T13:42:31.953-0700 I REPL     [rsSync] conducting a dry run election to see if we could be elected
      2017-06-01T13:42:31.953-0700 I REPL     [ReplicationExecutor] dry election run succeeded, running for election
      2017-06-01T13:42:31.963-0700 I REPL     [ReplicationExecutor] election succeeded, assuming primary role in term 66
      2017-06-01T13:42:31.963-0700 I REPL     [ReplicationExecutor] transition to PRIMARY
      2017-06-01T13:42:31.963-0700 I REPL     [ReplicationExecutor] Could not access any nodes within timeout when checking for additional ops to apply before finishing transition to primary. Will move forward with becoming primary anyway.
      2017-06-01T13:42:32.955-0700 I REPL     [rsSync] transition to primary complete; database writes are now permitted
      2017-06-01T13:42:45.010-0700 I NETWORK  [thread1] connection accepted from 127.0.0.1:62056 #1 (1 connection now open)
      2017-06-01T13:42:45.029-0700 I NETWORK  [thread1] connection accepted from 127.0.0.1:62058 #2 (2 connections now open)
      2017-06-01T13:42:45.060-0700 I NETWORK  [thread1] connection accepted from 127.0.0.1:62064 #3 (3 connections now open)
      2017-06-01T13:42:45.070-0700 I NETWORK  [thread1] connection accepted from 127.0.0.1:62065 #4 (4 connections now open)
      2017-06-01T13:42:45.081-0700 I NETWORK  [thread1] connection accepted from 127.0.0.1:62066 #5 (5 connections now open)
      2017-06-01T13:42:45.090-0700 I NETWORK  [thread1] connection accepted from 127.0.0.1:62067 #6 (6 connections now open)
      2017-06-01T13:42:45.091-0700 I NETWORK  [thread1] connection accepted from 127.0.0.1:62068 #7 (7 connections now open)
      2017-06-01T13:42:45.093-0700 I NETWORK  [thread1] connection accepted from 127.0.0.1:62070 #8 (8 connections now open)
      2017-06-01T13:42:45.105-0700 E NETWORK  [conn8] SSL: error:0407006A:rsa routines:RSA_padding_check_PKCS1_type_1:block type is not 01
      2017-06-01T13:42:45.105-0700 I -        [conn8] end connection 127.0.0.1:62070 (8 connections now open)
      2017-06-01T13:42:45.105-0700 I -        [conn4] end connection 127.0.0.1:62065 (7 connections now open)
      2017-06-01T13:42:45.105-0700 I -        [conn3] end connection 127.0.0.1:62064 (7 connections now open)
      2017-06-01T13:42:45.105-0700 I -        [conn5] end connection 127.0.0.1:62066 (6 connections now open)
      2017-06-01T13:42:45.106-0700 I -        [conn6] end connection 127.0.0.1:62067 (4 connections now open)
      2017-06-01T13:42:45.109-0700 I -        [conn7] end connection 127.0.0.1:62068 (3 connections now open)
      2017-06-01T13:42:45.209-0700 I COMMAND  [conn1] command test.foo command: getMore { getMore: 80004256156, collection: "foo" } originatingCommand: { find: "foo", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:80004256156 keysExamined:0 docsExamined:162388 numYields:1272 nreturned:162387 reslen:16777233 locks:{ Global: { acquireCount: { r: 2546 } }, Database: { acquireCount: { r: 1273 } }, Collection: { acquireCount: { r: 1273 } } } protocol:op_query 145ms
      2017-06-01T13:42:45.213-0700 I COMMAND  [conn2] command test.foo2 command: getMore { getMore: 85102722868, collection: "foo2" } originatingCommand: { find: "foo2", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:85102722868 keysExamined:0 docsExamined:162388 numYields:1272 nreturned:162387 reslen:16777234 locks:{ Global: { acquireCount: { r: 2546 } }, Database: { acquireCount: { r: 1273 } }, Collection: { acquireCount: { r: 1273 } } } protocol:op_query 150ms
      2017-06-01T13:42:45.213-0700 I -        [conn1] AssertionException handling request, closing client connection: 6 socket exception [SEND_ERROR] for 127.0.0.1:62056
      2017-06-01T13:42:45.213-0700 I -        [conn1] end connection 127.0.0.1:62056 (2 connections now open)
      2017-06-01T13:42:45.217-0700 I -        [conn2] AssertionException handling request, closing client connection: 6 socket exception [SEND_ERROR] for 127.0.0.1:62058
      2017-06-01T13:42:45.217-0700 I -        [conn2] end connection 127.0.0.1:62058 (1 connection now open)
      

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                david.golden David Golden
                Reporter:
                shane.harvey Shane Harvey
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: