SERVER-21619 (Core Server)

sys-perf: WT crash during core_workloads_WT execution

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.2.0-rc5
    • Component/s: WiredTiger
    • Labels:
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL

      Description

      mongod crashed intermittently during the core_workloads_WT test in system-perf. Observations:

      • It happens randomly in all non-sharded setups (standalone, 1-node replSet, 3-node replSet).
      • It seems to always happen during the insert_ttl test.
      • No core file was generated; I am trying to figure out why. I did apply "ulimit -c unlimited" when starting mongod.
      • A manual run of insert_ttl has not reproduced the crash yet; maybe the whole suite needs to run?
      • The earliest SHA with this issue is 3f598f1edc (test report).
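
The missing core file noted above could have several causes, and the report does not say which applied. As a minimal sketch for narrowing it down on Linux, the Python below reads the core-size limit that a process actually runs with (from /proc/<pid>/limits) and the kernel's core_pattern. The function name `core_settings` and the dictionary keys are hypothetical; nothing here comes from the test harness.

```python
import os
import resource


def describe(limit):
    """Render an rlimit value the way `ulimit -c` would."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


def core_settings(pid=None):
    """Collect core-dump settings for this process and, optionally, /proc/<pid>."""
    info = {}

    # Limits of the current Python process (inherited from the launching shell).
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    info["own_soft"] = describe(soft)
    info["own_hard"] = describe(hard)

    # Limits the target process is really running with (Linux only).
    limits_path = "/proc/%s/limits" % (pid if pid is not None else "self")
    try:
        with open(limits_path) as f:
            for line in f:
                if line.startswith("Max core file size"):
                    fields = line.split()
                    info["proc_soft"], info["proc_hard"] = fields[4], fields[5]
    except OSError:
        pass  # no /proc, or not permitted to read another user's process

    # Where the kernel writes (or pipes) core files.
    try:
        with open("/proc/sys/kernel/core_pattern") as f:
            info["core_pattern"] = f.read().strip()
    except OSError:
        pass

    return info


if __name__ == "__main__":
    for key, value in sorted(core_settings().items()):
        print(key, "=", value)
```

Passing the mongod pid (for example `core_settings(12345)`, with a made-up pid) shows whether the "ulimit -c unlimited" applied in the startup shell actually reached the server process.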

      Stack trace from the mongod log file:

       [2015/11/21 17:50:03.017] 2015-11-21T22:46:58.450+0000 I NETWORK  [conn437] end connection 10.2.0.98:51172 (7 connections now open)
       [2015/11/21 17:50:03.017] 2015-11-21T22:46:59.372+0000 F -        [thread1] Invalid access at address: 0xc8
       [2015/11/21 17:50:03.017] 2015-11-21T22:46:59.379+0000 F -        [thread1] Got signal: 11 (Segmentation fault).
       [2015/11/21 17:50:03.017]  0x12c99f2 0x12c8929 0x12c8ca8 0x7ff85a737130 0x19f9699 0x19f9937 0x19fce8f 0x19c91ad 0x19c68a4 0x19c6b5a 0x7ff85a72fdf3 0x7ff85a45d1ad
       [2015/11/21 17:50:03.017] ----- BEGIN BACKTRACE -----
       [2015/11/21 17:50:03.017] {"backtrace":[{"b":"400000","o":"EC99F2"},{"b":"400000","o":"EC8929"},{"b":"400000","o":"EC8CA8"},{"b":"7FF85A728000","o":"F130"},{"b":"400000","o":"15F9699"},{"b":"400000","o":"15F9937"},{"b":"400000","o":"15FCE8F"},{"b":"400000","o":"15C91AD"},{"b":"400000","o":"15C68A4"},{"b":"400000","o":"15C6B5A"},{"b":"7FF85A728000","o":"7DF3"},{"b":"7FF85A367000","o":"F61AD"}],"processInfo":{ "mongodbVersion" : "3.2.0-rc3-95-g6f2a7e6", "gitVersion" : "6f2a7e6cfb69e186ee2d5ca8653dda5bf0633ef7", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.14.35-28.38.amzn1.x86_64", "version" : "#1 SMP Wed Mar 11 22:50:37 UTC 2015", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000" }, { "b" : "7FFC129D4000", "elfType" : 3 }, { "b" : "7FF85B364000", "path" : "/lib64/librt.so.1", "elfType" : 3 }, { "b" : "7FF85B160000", "path" : "/lib64/libdl.so.2", "elfType" : 3 }, { "b" : "7FF85AE5C000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3 }, { "b" : "7FF85AB5A000", "path" : "/lib64/libm.so.6", "elfType" : 3 }, { "b" : "7FF85A944000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3 }, { "b" : "7FF85A728000", "path" : "/lib64/libpthread.so.0", "elfType" : 3 }, { "b" : "7FF85A367000", "path" : "/lib64/libc.so.6", "elfType" : 3 }, { "b" : "7FF85B56C000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3 } ] }}
       [2015/11/21 17:50:03.017]  mongod(_ZN5mongo15printStackTraceERSo+0x32) [0x12c99f2]
       [2015/11/21 17:50:03.017]  mongod(+0xEC8929) [0x12c8929]
       [2015/11/21 17:50:03.017]  mongod(+0xEC8CA8) [0x12c8ca8]
       [2015/11/21 17:50:03.017]  libpthread.so.0(+0xF130) [0x7ff85a737130]
       [2015/11/21 17:50:03.017]  mongod(+0x15F9699) [0x19f9699]
       [2015/11/21 17:50:03.017]  mongod(+0x15F9937) [0x19f9937]
       [2015/11/21 17:50:03.017]  mongod(__wt_reconcile+0x27F) [0x19fce8f]
       [2015/11/21 17:50:03.018]  mongod(__wt_evict+0x28D) [0x19c91ad]
       [2015/11/21 17:50:03.018]  mongod(+0x15C68A4) [0x19c68a4]
       [2015/11/21 17:50:03.018]  mongod(+0x15C6B5A) [0x19c6b5a]
       [2015/11/21 17:50:03.018]  libpthread.so.0(+0x7DF3) [0x7ff85a72fdf3]
       [2015/11/21 17:50:03.018]  libc.so.6(clone+0x6D) [0x7ff85a45d1ad]
       [2015/11/21 17:50:03.018] -----  END BACKTRACE  -----
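
The JSON line in the backtrace above carries each frame as a module base ("b") and an offset ("o"), both in hex; adding them gives the absolute addresses printed on the line before BEGIN BACKTRACE. A minimal sketch of that arithmetic, with `frame_addresses` as a hypothetical helper name and only the first four frames copied from the log:

```python
import json


def frame_addresses(backtrace_json):
    """Return the absolute address (base + offset) of every frame."""
    doc = json.loads(backtrace_json)
    return [int(frame["b"], 16) + int(frame["o"], 16) for frame in doc["backtrace"]]


# First four frames copied from the log line above; processInfo is omitted.
sample = (
    '{"backtrace":['
    '{"b":"400000","o":"EC99F2"},'
    '{"b":"400000","o":"EC8929"},'
    '{"b":"400000","o":"EC8CA8"},'
    '{"b":"7FF85A728000","o":"F130"}]}'
)

for addr in frame_addresses(sample):
    print(hex(addr))
# prints 0x12c99f2, 0x12c8929, 0x12c8ca8, 0x7ff85a737130 (one per line)
```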
      

      Decoded with addr2line:

      [ec2-user@ip-10-2-0-98 t]$ addr2line -e ./mongodb/bin/mongod 0x12c99f2 0x12c8929 0x12c8ca8 0x7ff85a737130 0x19f9699 0x19f9937 0x19fce8f 0x19c91ad 0x19c68a4 0x19c6b5a 0x7ff85a72fdf3 0x7ff85a45d1ad
      /srv/10gen/mci-exec/mci/src/src/mongo/util/stacktrace_posix.cpp:172
      /srv/10gen/mci-exec/mci/src/src/mongo/util/signal_handlers_synchronous.cpp:180
      /srv/10gen/mci-exec/mci/src/src/mongo/util/signal_handlers_synchronous.cpp:275
      ??:0
      /srv/10gen/mci-exec/mci/src/src/third_party/wiredtiger/src/reconcile/rec_write.c:1980
      /srv/10gen/mci-exec/mci/src/src/third_party/wiredtiger/src/reconcile/rec_write.c:4572
      /srv/10gen/mci-exec/mci/src/src/third_party/wiredtiger/src/reconcile/rec_write.c:412
      /srv/10gen/mci-exec/mci/src/src/third_party/wiredtiger/src/evict/evict_page.c:480
      /srv/10gen/mci-exec/mci/src/src/third_party/wiredtiger/src/evict/evict_lru.c:1467
      /srv/10gen/mci-exec/mci/src/src/third_party/wiredtiger/src/evict/evict_lru.c:818
      ??:0
      ??:0
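
The three ??:0 results line up with the frames that live in libpthread.so.0 and libc.so.6, which addr2line cannot resolve when it is pointed only at the mongod binary. As a sketch, assuming each address belongs to the module with the nearest lower base in the "somap" (the log gives no module sizes, so this is a heuristic), the helper below maps an address to a module and an offset. `owning_module` and the label "mongod" for the path-less main executable are hypothetical names.

```python
import bisect

# (base, name) pairs copied from the "somap" in the log, sorted by base.
# Only the modules that appear in the backtrace are listed here.
SOMAP = [
    (0x400000, "mongod"),
    (0x7FF85A367000, "/lib64/libc.so.6"),
    (0x7FF85A728000, "/lib64/libpthread.so.0"),
]


def owning_module(addr, somap=SOMAP):
    """Return (module name, offset within module), or None if below every base."""
    bases = [base for base, _ in somap]
    i = bisect.bisect_right(bases, addr) - 1
    if i < 0:
        return None
    base, name = somap[i]
    return name, addr - base


# The three addresses that addr2line reported as ??:0, plus one mongod frame.
for addr in (0x7FF85A737130, 0x7FF85A72FDF3, 0x7FF85A45D1AD, 0x19F9699):
    name, offset = owning_module(addr)
    print(hex(addr), name, hex(offset))
```

The offsets this yields for the library frames (0xf130, 0x7df3, 0xf61ad) match the "o" values in the JSON backtrace, so they could be passed to addr2line together with the corresponding library file if its debug symbols are available.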
      

      More details are in the task log:
      https://evergreen.mongodb.com/task_log_raw/sys_perf_linux_1_node_replSet_core_workloads_WT_6f2a7e6cfb69e186ee2d5ca8653dda5bf0633ef7_15_11_20_23_29_14/0?type=T#L482
      The mongod binary in mongod.tar.gz (https://s3.amazonaws.com/mciuploads/dsi/sys_perf_6f2a7e6cfb69e186ee2d5ca8653dda5bf0633ef7/6f2a7e6cfb69e186ee2d5ca8653dda5bf0633ef7/mongod-sys_perf_6f2a7e6cfb69e186ee2d5ca8653dda5bf0633ef7.tar.gz) was not stripped.

      And a few other crashes:

          Activity

          xgen-internal-githook Githook User added a comment -

          Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>

          Message: SERVER-21619 Revert an assertion change.
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/9ecf70c9c09129b71f7721754a69226e2c4b73d2

          xgen-internal-githook Githook User added a comment -

          Author: Alex Gorrod (agorrod) <alexander.gorrod@mongodb.com>

          Message: Merge pull request #2336 from wiredtiger/server-21619-dont-split-dead-tree

          SERVER-21619 Don't do internal page splits after a tree is marked DEAD.
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/890ee3447449fc72d5247035334f28c9f50bb100

          xgen-internal-githook Githook User added a comment -

          Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>

          Message: Import wiredtiger-wiredtiger-mongodb-3.2-rc4-41-g8326df6.tar.gz from wiredtiger branch mongodb-3.2

          ref: b65381f..8326df6

          4c49948 WT-2244 Trigger in-memory splits sooner.
          9f2e4f3 WT-2248 WT_SESSION.close is updating WT_CONNECTION_IMPL.default_session.
          a6da10e SERVER-21553 Enable fast-path truncate after splits.
          39dfd21 WT-2243 Don't keep transaction IDs pinned for reading from checkpoints.
          4e1844c WT-2230 multi-split error path.
          cace179 WT-2228 avoid unnecessary raw-compression calls.
          890ee34 SERVER-21619 Don't do internal page splits after a tree is marked DEAD.
          6c7338f WT-2241 Use a lock to protect transaction ID allocation.
          978c237 WT-2234 Coverity analysis warnings.
          Branch: master
          https://github.com/mongodb/mongo/commit/e7181b542b25981db42f74cdaee4e7fc323d3e9d

          xgen-internal-githook Githook User added a comment -

          Author: Keith Bostic (keithbostic) <keith@wiredtiger.com>

          Message: SERVER-22064: Coverity, function return value not checked for error

          Coverity analysis defect 77699: Unchecked return value, problem
          introduced in SERVER-21619 change, commit 354c031.

          Instead of calling __wt_evict_page_clean_update() when discarding pages,
          call __wt_ref_out() directly, __wt_evict_page_clean_update() doesn't do
          any useful additional work.

          This allows __wt_evict_page_clean_update() to be static in evict_page.c,
          rename to __evict_page_clean_update().
          Branch: develop
          https://github.com/wiredtiger/wiredtiger/commit/ff24c1f861383f015196accf825c15f1441d16af

          xgen-internal-githook Githook User added a comment -

          Author: Ramon Fernandez <ramon@mongodb.com>

          Message: Import wiredtiger-wiredtiger-2.7.0-505-g7fea169.tar.gz from wiredtiger branch mongodb-3.4

          ref: 44463c5..7fea169

          WT-2355 Fix minor scratch buffer usage in logging.
          WT-2348 xargs -P isn't portable
          WT-2347 Java: schema format edge cases
          WT-2344 OS X compiler warning
          WT-2342 Enhance wtperf to support background create and drop operations
          WT-2340 Add logging guarantee assertions, whitespace
          WT-2339 format post-rebalance verify failure (stress run #11586)
          WT-2338 Disable using pre-allocated log files when backup cursor is open
          WT-2335 NULL pointer crash in config_check_search with invalid configuration string
          WT-2333 Add a flag so drop doesn't block
          WT-2332 Bug in logging write-no-sync mode
          WT-2331 Checking of search() result for reference cursors before join()
          WT-2328 schema drop does direct unlink, it should use a block manager interface.
          WT-2326 Change WTPERF to use new memory allocation functions instead of the standard
          WT-2321 WT-2321: race between eviction and worker threads on the eviction queue
          WT-2320 Only check copyright when cutting releases
          WT-2316 stress test failure: WT_CURSOR.prev out-of-order returns
          WT-2314 page-swap error handling is inconsistent
          WT-2313 sweep-server: conn_dhandle.c, 610: dhandle != conn->cache->evict_file_next
          WT-2312 re-creating a deleted column-store page can corrupt the in-memory tree
          WT-2308 custom extractor for ref_cursors in join cursor
          WT-2305 Fix coverity scan issues on 23/12/2015
          WT-2296 New log algorithm needs improving for sync/flush settings
          WT-2295 WT_SESSION.create does a full-scan of the main table
          WT-2287 WT_SESSION.rebalance
          WT-2275 broken DB after application crash
          WT-2267 Improve wtperf throttling implementation to provide steady load
          WT-2247 variable-length column-store in-memory page splits
          WT-2242 WiredTiger treats dead trees the same as other trees in eviction
          WT-2142 Connection cleanup in Python tests
          WT-2073 metadata cleanups
          WT-1801 Add a directory sync after rollback of a WT_SESSION::rename operation
          WT-1517 schema format edge cases
          SERVER-22064 Coverity analysis defect 77699: Unchecked return value
          SERVER-21619 sys-perf: WT crash during core_workloads_WT execution
          Branch: master
          https://github.com/mongodb/mongo/commit/90118b147a6943b19dc929862a11071538db1438

