Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-6984

Unable to Access Database After Unexpected Shutdown

    • Type: Icon: Bug Bug
    • Resolution: Done
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None

      Problem Description: 

      We had an unexpected shutdown and are unable to access our Mongo database.  We have been accessing the database from a mongo image pulled from Docker and run in a container in Rancher.  Running the container now results in a number of errors from WiredTiger suggesting data corruption.  Some of the things we've tried to resolve the issue are running mongod --repair, deleting the mongod lock file, and creating a build of WiredTiger to try to retrieve the data.  The attempt to fetch data with a WiredTiger build resulted in the same errors we saw when using mongo image.
      As well, we have tried changing the versions to 4.2 and 4.0.21 to resolve the versions error.  The error is odd to us because we didn't try to upgrade our version of MongoDB.

      Steps to Reproduce

      Start up the MongoDB container.

      Expected Results

      Mongod --repair results in the database starting up correctly and being able to access data.

      Actual Results

      We receive a set of errors from WiredTiger suggesting data corruption.

      Additional Notes

      Please see logs below:

      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.643+00:00"},"s":"I", "c":"-", "id":20520, "ctx":"initandlisten","msg":"Stopping further Flow Control ticket acquisitions."}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.643+00:00"},"s":"I", "c":"STORAGE", "id":4784908, "ctx":"initandlisten","msg":"Shutting down the PeriodicThreadToAbortExpiredTransactions"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.643+00:00"},"s":"I", "c":"STORAGE", "id":4784934, "ctx":"initandlisten","msg":"Shutting down the PeriodicThreadToDecreaseSnapshotHistoryCachePressure"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.643+00:00"},"s":"I", "c":"REPL", "id":4784909, "ctx":"initandlisten","msg":"Shutting down the ReplicationCoordinator"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.643+00:00"},"s":"I", "c":"SHARDING", "id":4784910, "ctx":"initandlisten","msg":"Shutting down the ShardingInitializationMongoD"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.643+00:00"},"s":"I", "c":"REPL", "id":4784911, "ctx":"initandlisten","msg":"Enqueuing the ReplicationStateTransitionLock for shutdown"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.643+00:00"},"s":"I", "c":"-", "id":4784912, "ctx":"initandlisten","msg":"Killing all operations for shutdown"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.643+00:00"},"s":"I", "c":"-", "id":4695300, "ctx":"initandlisten","msg":"Interrupted all currently running operations","attr":\{"opsKilled":3}}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.643+00:00"},"s":"I", "c":"COMMAND", "id":4784913, "ctx":"initandlisten","msg":"Shutting down all open transactions"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.643+00:00"},"s":"I", "c":"REPL", "id":4784914, "ctx":"initandlisten","msg":"Acquiring the ReplicationStateTransitionLock for shutdown"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.643+00:00"},"s":"I", "c":"INDEX", "id":4784915, "ctx":"initandlisten","msg":"Shutting down the IndexBuildsCoordinator"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.643+00:00"},"s":"I", "c":"REPL", "id":4784916, "ctx":"initandlisten","msg":"Reacquiring the ReplicationStateTransitionLock for shutdown"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.643+00:00"},"s":"I", "c":"REPL", "id":4784917, "ctx":"initandlisten","msg":"Attempting to mark clean shutdown"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.643+00:00"},"s":"I", "c":"NETWORK", "id":4784918, "ctx":"initandlisten","msg":"Shutting down the ReplicaSetMonitor"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.644+00:00"},"s":"I", "c":"SHARDING", "id":4784921, "ctx":"initandlisten","msg":"Shutting down the MigrationUtilExecutor"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.644+00:00"},"s":"I", "c":"CONTROL", "id":4784925, "ctx":"initandlisten","msg":"Shutting down free monitoring"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.644+00:00"},"s":"I", "c":"FTDC", "id":4784926, "ctx":"initandlisten","msg":"Shutting down full-time data capture"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.644+00:00"},"s":"I", "c":"STORAGE", "id":4784927, "ctx":"initandlisten","msg":"Shutting down the HealthLog"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.644+00:00"},"s":"I", "c":"STORAGE", "id":4784929, "ctx":"initandlisten","msg":"Acquiring the global lock for shutdown"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.644+00:00"},"s":"I", "c":"STORAGE", "id":4784930, "ctx":"initandlisten","msg":"Shutting down the storage engine"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.644+00:00"},"s":"I", "c":"STORAGE", "id":20282, "ctx":"initandlisten","msg":"Deregistering all the collections"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.644+00:00"},"s":"I", "c":"STORAGE", "id":22261, "ctx":"initandlisten","msg":"Timestamp monitor shutting down"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.644+00:00"},"s":"I", "c":"STORAGE", "id":22317, "ctx":"initandlisten","msg":"WiredTigerKVEngine shutting down"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.644+00:00"},"s":"I", "c":"STORAGE", "id":22318, "ctx":"initandlisten","msg":"Shutting down session sweeper thread"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.644+00:00"},"s":"I", "c":"STORAGE", "id":22319, "ctx":"initandlisten","msg":"Finished shutting down session sweeper thread"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.644+00:00"},"s":"I", "c":"STORAGE", "id":22320, "ctx":"initandlisten","msg":"Shutting down journal flusher thread"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.644+00:00"},"s":"I", "c":"STORAGE", "id":22321, "ctx":"initandlisten","msg":"Finished shutting down journal flusher thread"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.644+00:00"},"s":"I", "c":"STORAGE", "id":22322, "ctx":"initandlisten","msg":"Shutting down checkpoint thread"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.644+00:00"},"s":"I", "c":"STORAGE", "id":22323, "ctx":"initandlisten","msg":"Finished shutting down checkpoint thread"}
      12/1/2020 10:17:57 PM\{"t":{"$date":"2020-12-02T06:17:57.645+00:00"},"s":"I", "c":"STORAGE", "id":4795902, "ctx":"initandlisten","msg":"Closing WiredTiger","attr":\{"closeConfig":"leak_memory=true,"}}
      12/1/2020 10:17:58 PM\{"t":{"$date":"2020-12-02T06:17:58.975+00:00"},"s":"E", "c":"STORAGE", "id":22435, "ctx":"thread1","msg":"WiredTiger error","attr":\{"error":95,"message":"[1606889878:975090][1:0x7f5b54ca6700], log-server: __posix_std_fallocate, 58: /data/db/journal/WiredTigerTmplog.0000000003: fallocate:: Operation not supported"}}
      12/1/2020 10:17:58 PM\{"t":{"$date":"2020-12-02T06:17:58.975+00:00"},"s":"E", "c":"STORAGE", "id":22435, "ctx":"thread1","msg":"WiredTiger error","attr":\{"error":95,"message":"[1606889878:975224][1:0x7f5b54ca6700], log-server: __posix_sys_fallocate, 75: /data/db/journal/WiredTigerTmplog.0000000003: fallocate:: Operation not supported"}}
      12/1/2020 10:18:00 PM\{"t":{"$date":"2020-12-02T06:18:00.521+00:00"},"s":"I", "c":"STORAGE", "id":4795901, "ctx":"initandlisten","msg":"WiredTiger closed","attr":\{"durationMillis":2876}}
      12/1/2020 10:18:00 PM\{"t":{"$date":"2020-12-02T06:18:00.521+00:00"},"s":"I", "c":"STORAGE", "id":22279, "ctx":"initandlisten","msg":"shutdown: removing fs lock..."}
      12/1/2020 10:18:00 PM\{"t":{"$date":"2020-12-02T06:18:00.523+00:00"},"s":"I", "c":"-", "id":4784931, "ctx":"initandlisten","msg":"Dropping the scope cache for shutdown"}
      12/1/2020 10:18:00 PM\{"t":{"$date":"2020-12-02T06:18:00.523+00:00"},"s":"I", "c":"CONTROL", "id":20565, "ctx":"initandlisten","msg":"Now exiting"}
      12/1/2020 10:18:00 PM\{"t":{"$date":"2020-12-02T06:18:00.523+00:00"},"s":"I", "c":"CONTROL", "id":23138, "ctx":"initandlisten","msg":"Shutting down","attr":\{"exitCode":62}}
      12/1/2020 10:18:02 PM\{"t":{"$date":"2020-12-02T06:18:02.049+00:00"},"s":"I", "c":"CONTROL", "id":23285, "ctx":"main","msg":"Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'"}
      12/1/2020 10:18:02 PM\{"t":{"$date":"2020-12-02T06:18:02.054+00:00"},"s":"W", "c":"ASIO", "id":22601, "ctx":"main","msg":"No TransportLayer configured during NetworkInterface startup"}
      12/1/2020 10:18:02 PM\{"t":{"$date":"2020-12-02T06:18:02.054+00:00"},"s":"I", "c":"NETWORK", "id":4648601, "ctx":"main","msg":"Implicit TCP FastOpen unavailable. If TCP FastOpen is required, set tcpFastOpenServer, tcpFastOpenClient, and tcpFastOpenQueueSize."}
      12/1/2020 10:18:02 PM\{"t":{"$date":"2020-12-02T06:18:02.055+00:00"},"s":"I", &qu...
      

            Assignee:
            eric.sedor@mongodb.com Eric Sedor
            Reporter:
            clayton.sheets@sony.com Clayton Sheets
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated:
              Resolved: