[SERVER-16796] Increase logging activity for journal recovery operations
Created: 09/Jan/15  Updated: 03/Dec/18  Resolved: 25/Jan/17

Status: Closed
Project: Core Server
Component/s: Logging, WiredTiger
Affects Version/s: 2.8.0-rc4
Fix Version/s: 3.2.18, 3.4.11, 3.5.2

Type: Improvement
Priority: Major - P3
Reporter: Paul Rooney
Assignee: Susan LoVerso
Resolution: Done
Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-19854 Mongod failed to open connection, rem... Closed
is related to SERVER-31149 Enable recovery progress messages Closed
Backwards Compatibility: Fully Compatible
Sprint: Storage 2017-01-23, Storage 2017-02-13

 Description   

It would be useful for the WiredTiger (WT) storage engine to write more log messages to mongod.log, particularly during journal recovery on startup.
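
As an illustration of the kind of switch this request leads to, here is a minimal sketch of appending a verbose category to a WiredTiger connection configuration string. It assumes the `recovery_progress` entry of WiredTiger's `verbose` configuration list, which is the name used by the related recovery progress work. The helper name `with_recovery_progress` and the base configuration shown are hypothetical and not taken from this ticket.

```python
def with_recovery_progress(base_config: str) -> str:
    """Append the recovery-progress verbose category to a config string.

    Hypothetical helper: it only builds the string that a caller would
    pass to wiredtiger_open(); it does not open a connection itself.
    """
    # Assumed category name from WiredTiger's `verbose` configuration list.
    extra = "verbose=[recovery_progress]"
    # Append rather than parse, so nested settings such as log=(...) stay intact.
    return f"{base_config},{extra}" if base_config else extra


if __name__ == "__main__":
    # Example base configuration (an assumption for illustration only).
    print(with_recovery_progress("create,log=(enabled)"))
```

The string is appended instead of parsed because WiredTiger configuration values may themselves contain commas inside parentheses or brackets.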



 Comments   
Comment by Alexander Gorrod [ 03/Dec/18 ]

I've updated the fix versions for this ticket from 3.4.3 to 3.4.11 and 3.2.13 to 3.2.18, since there was an associated server change that was made under SERVER-31149 and released as part of 3.4.11 and 3.2.18.

Comment by Githook User [ 13/Apr/17 ]

Author: Alex Gorrod (agorrod) <alexander.gorrod@mongodb.com>

Message: Import wiredtiger: f5c08e2b5f02805b062888d45c9eca19af175f7e from branch mongodb-3.2

ref: d48181f6f4..f5c08e2b5f
for: 3.2.13

SERVER-16796 Increase logging activity for journal recovery operations
SERVER-28168 Cannot start or repair mongodb after unexpected shutdown.
SERVER-28194 Missing WiredTiger.turtle file loses data
WT-2402 Misaligned structure accesses lead to undefined behavior
WT-2439 Enhance reconciliation page layout
WT-2771 Add a statistic to track per-btree dirty cache usage
WT-2790 Fix a test case false positive in test_sweep01
WT-2833 improvement: add projections to wt dump utility
WT-2898 Improve performance of eviction-heavy workloads by dynamically controlling the number of eviction threads
WT-2909 Create automatable test verifying checkpoint integrity after errors
WT-2978 Make WiredTiger python binding pip-compatible
WT-2990 checkpoint load live_open assertion failure
WT-2994 Create documentation describing page sizes and relationships
WT-3080 Python test suite: add timestamp or elapsed time for tests
WT-3082 Python test suite: shorten default run to avoid pull request timeouts.
WT-3083 Fix a bug in wtperf config dump
WT-3086 Add transaction state information to cache stuck diagnostic information
WT-3088 bug: Don't evict a page with refs visible to readers after a split
WT-3091 Add stats to test_perf0001
WT-3092 Quiet a warning from autogen.sh
WT-3093 Padding the WT_RWLOCK structure grew the WT_PAGE structure.
WT-3097 Race on reconfigure or shutdown can lead to waiting for statistics log server
WT-3099 lint: static function declarations, non-text characters in documentation
WT-3100 test bug: format is weighted to delete, insert, then write operations.
WT-3104 Fix wtperf configs for eviction tests
WT-3105 Fix a deadlock caused by allocating eviction thread sessions dynamically
WT-3106 Add truncate support to command line wt utility
WT-3108 Also dump disk page size as part of metadata information
WT-3109 wording fix in transaction doc
WT-3110 Add more test cases for the WT command line utility
WT-3111 util_create() doesn't free memory assigned to "uri"
WT-3112 Handle list lock statistic not incremented in eviction server
WT-3113 Add a verbose mode to dump the cache when eviction is stuck
WT-3114 Avoid archiving log files immediately after recovery
WT-3115 Change the dhandle lock to a read/write lock
WT-3116 Python style testing in s_all may not execute correctly
WT-3118 Protect random-abort test against unexpectedly slow child start
WT-3120 Fix ordering problem in connection_close for filesystem loaded in an extension
WT-3121 In test suite create standard way to load extensions
WT-3126 bug: dist/s_all script has misplaced quote causing bad error reporting
WT-3127 bug: CPU yield calls don't necessarily imply memory barriers
WT-3128 wt printlog returns operation-not-supported if it doesn't find any log files
WT-3130 Ensure extensions have access to database home directory
WT-3134 Coverity scan reports 1368529 and 1368528
WT-3135 search_near() for index with custom collator
WT-3136 bug fix: WiredTiger doesn't check sprintf calls for error return
WT-3137 Hang in _log_slot_join/_log_slot_switch_internal
WT-3139 Enhance wtperf to support periodic table scans
WT-3144 bug fix: random cursor returns not-found when descending to an empty page
WT-3148 Improve eviction efficiency with many small trees
WT-3149 Change eviction to start new walks from a random place in the tree
WT-3150 Reduce impact of checkpoints on eviction server
WT-3152 Convert table lock from a spinlock to a read write lock
WT-3155 Remove WT_CONN_SERVER_RUN flag
WT-3156 Assertion in log_write fires after write failure
WT-3157 checkpoint/transaction integrity issue when writes fail.
WT-3159 Incorrect key for index containing multiple variable sized entries
WT-3161 checkpoint hang after write failure injection.
WT-3164 Ensure all relevant btree fields are reset on checkpoint error
WT-3170 Clear the eviction walk point while populating from a tree
WT-3173 Add runtime detection for s390x CRC32 hardware support
WT-3174 Coverity/lint cleanup
WT-3175 New hang in internal page split
WT-3179 test bug: clang sanitizer failure in fail_fs
WT-3180 fault injection tests should only run as "long" tests and should not create core files
WT-3182 Switch make-check to run the short test suite by default
WT-3184 Problem duplicating index cursor with custom collator
WT-3186 Fix error path and panic detection in logging loops
WT-3187 Hang on shutdown with a busy cache pool
WT-3188 Fix error handling in logging where fatal errors could lead to a hang
WT-3189 Fix a segfault in the eviction server random positioning
WT-3190 Enhance eviction thread auto-tuning algorithm
WT-3191 lint
WT-3193 Close a race between verify opening a handle and eviction visiting it
WT-3196 Race with LSM and eviction when switching chunks
WT-3199 bug: eviction assertion failure
WT-3202 wtperf report an error on in_memory=true mode : No such file or directory
WT-3203 bulk-load state changes can race
WT-3204 eviction changes cost LSM performance
WT-3206 bug: core dump on NULL page index
WT-3207 Drops with checkpoint_wait=false should not wait for checkpoints
WT-3208 test format hung with 9mb cache
WT-3211 WT_CURSOR.remove cannot always retain its position.
WT-3212 'wt dump' crashes when given table with unknown collator
WT-3213 generated test/format CONFIG invalid on next run
WT-3216 add support for clang-tidy
WT-3218 unexpected checkpoint ordering failures
WT-3224 LSM assertion failure pindex->entries == 1
WT-3225 WiredTiger won't build with clang on CentOS 7.3.1611
WT-3227 Python test suite inserts unnecessary whitespace in error output.
WT-3228 Remove with overwrite shouldn't return WT_NOTFOUND
WT-3234 Update WiredTiger build for clang 4.0.
WT-3238 Java: Cursor.compare and Cursor.equals throw Exceptions for valid return values
WT-3240 Coverity reports
WT-3243 Reorder log slot release so joins don't wait on IO
WT-3244 metadata operations failing in in-memory configurations
WT-3249 Unit test test_readonly fails as it is unable to open WiredTiger.lock
WT-3250 Incorrect statistics incremented on Windows
WT-3254 test_reconfig02 uses incorrect configuration string
WT-3262 Schema operations shouldn't wait for cache
WT-3265 Verify hits assertion in eviction when transiting handle to exclusive mode
WT-3271 Eviction tuning stuck in a loop
WT-98 Update the current cursor value without a search
Branch: v3.2
https://github.com/mongodb/mongo/commit/e5de3702c1dd8257c6289869d2cbd8b014221808

Comment by Githook User [ 13/Apr/17 ]

Author: sueloverso <sue@mongodb.com>

Message: SERVER-16796 Recovery progress via verbose messages. (#3225)
Branch: mongodb-3.2
https://github.com/wiredtiger/wiredtiger/commit/5af64580f5be08d2f8900b96a83d29a3ae2cf04a

Comment by Githook User [ 02/Mar/17 ]

Author: Alex Gorrod (agorrod) <alexander.gorrod@mongodb.com>

Message: Import wiredtiger: d6659de8d742b9562d08c1ba5138be881f8e24fa from branch mongodb-3.4

ref: 8d23249433..d6659de8d7
for: 3.4.3

SERVER-16796 Increase logging activity for journal recovery operations
WT-2402 Misaligned structure accesses lead to undefined behavior
WT-2771 Add a statistic to track per-btree dirty cache usage
WT-2790 Fix a test case false positive in test_sweep01
WT-2833 improvement: add projections to wt dump utility
WT-2898 Improve performance of eviction-heavy workloads by dynamically controlling the number of eviction threads
WT-2909 Create automatable test verifying checkpoint integrity after errors
WT-2994 Create documentation describing page sizes and relationships
WT-3080 Python test suite: add timestamp or elapsed time for tests
WT-3082 Python test suite: shorten default run to avoid pull request timeouts.
WT-3083 Fix a bug in wtperf config dump
WT-3086 Add transaction state information to cache stuck diagnostic information
WT-3088 bug: Don't evict a page with refs visible to readers after a split
WT-3091 Add stats to test_perf0001
WT-3092 Quiet a warning from autogen.sh
WT-3093 Padding the WT_RWLOCK structure grew the WT_PAGE structure.
WT-3097 Race on reconfigure or shutdown can lead to waiting for statistics log server
WT-3099 lint: static function declarations, non-text characters in documentation
WT-3100 test bug: format is weighted to delete, insert, then write operations.
WT-3104 Fix wtperf configs for eviction tests
WT-3105 Fix a deadlock caused by allocating eviction thread sessions dynamically
WT-3106 Add truncate support to command line wt utility
WT-3108 Also dump disk page size as part of metadata information
WT-3109 wording fix in transaction doc
WT-3110 Add more test cases for the WT command line utility
WT-3111 util_create() doesn't free memory assigned to "uri"
WT-3112 Handle list lock statistic not incremented in eviction server
WT-3113 Add a verbose mode to dump the cache when eviction is stuck
WT-3114 Avoid archiving log files immediately after recovery
WT-3115 Change the dhandle lock to a read/write lock
WT-3116 Python style testing in s_all may not execute correctly
WT-3118 Protect random-abort test against unexpectedly slow child start
WT-3120 Fix ordering problem in connection_close for filesystem loaded in an extension
WT-3121 In test suite create standard way to load extensions
WT-3126 Fix a bug in dist/s_all script
WT-3127 bug: CPU yield calls don't necessarily imply memory barriers
WT-3128 Fix a bug where wt printlog returns operation-not-supported if it doesn't find any log files
WT-3130 Proposal to change initialization of custom filesystem
WT-3134 Coverity scan reports 1368529 and 1368528
WT-3135 search_near() for index with custom collator
WT-3137 Hang in _log_slot_join/_log_slot_switch_internal
WT-3139 Enhance wtperf to support periodic table scans
WT-3143 Fix Coverity static analysis complaint in test program
WT-3144 bug fix: random cursor returns not-found when descending to an empty page
WT-3148 Improve eviction efficiency with many small trees
WT-3149 Change eviction to start new walks from a random place in the tree
WT-3150 Reduce impact of checkpoints on eviction server
WT-3152 Convert table lock from a spinlock to a read write lock
WT-3156 Assertion in log_write fires after write failure
WT-3157 checkpoint/transaction integrity issue when writes fail.
WT-3159 Incorrect key for index containing multiple variable sized entries
WT-3161 checkpoint hang after write failure injection.
WT-3164 Ensure all relevant btree fields are reset on checkpoint error
WT-3170 Clear the eviction walk point while populating from a tree
WT-3173 Add runtime detection for s390x CRC32 hardware support
WT-3174 Coverity/lint cleanup
WT-3175 New hang in internal page split
WT-3179 test bug: clang sanitizer failure in fail_fs
WT-3180 fault injection tests should only run as "long" tests and should not create core files
WT-3184 Problem duplicating index cursor with custom collator
WT-3186 Fix error path and panic detection in logging loops
WT-3187 Hang on shutdown with a busy cache pool
WT-3188 Fix error handling in logging where fatal errors could lead to a hang
WT-3189 Fix a segfault in the eviction server random positioning
Branch: v3.4
https://github.com/mongodb/mongo/commit/086c21e2b4c87952273fde78ab8fb18f18e8fdc6

Comment by Githook User [ 02/Mar/17 ]

Author: sueloverso <sue@mongodb.com>

Message: SERVER-16796 Recovery progress via verbose messages. (#3225)
Branch: mongodb-3.4
https://github.com/wiredtiger/wiredtiger/commit/5af64580f5be08d2f8900b96a83d29a3ae2cf04a

Comment by Githook User [ 25/Jan/17 ]

Author: Michael Cahill (michaelcahill) <michael.cahill@mongodb.com>

Message: SERVER-16796 Increase logging activity for WT journal recovery.
Branch: master
https://github.com/mongodb/mongo/commit/9ef439f18536ec11af418602880359438ab84c64

Comment by Githook User [ 23/Jan/17 ]

Author: David Hows (daveh86) <howsdav@gmail.com>

Message: Import wiredtiger: 48a3cbc17fa902528217287fd075c87efb44aebc from branch mongodb-3.6

ref: 8d23249433..48a3cbc17f
for: 3.5.2

SERVER-16796 Increase logging activity for journal recovery operations
WT-2 What does metadata look like?
WT-2402 Misaligned structure accesses lead to undefined behavior
WT-2771 Add a statistic to track per-btree dirty cache usage
WT-2833 improvement: add projections to wt dump utility
WT-2898 Improve performance of eviction-heavy workloads by dynamically controlling the number of eviction threads
WT-2994 Create documentation describing page sizes and relationships
WT-3080 Python test suite: add timestamp or elapsed time for tests
WT-3082 Python test suite: shorten default run to avoid pull request timeouts.
WT-3083 Fix a bug in wtperf config dump
WT-3086 Add transaction state information to cache stuck diagnostic information
WT-3091 Add stats to test_perf0001
WT-3092 Quiet a warning from autogen.sh
WT-3093 Padding the WT_RWLOCK structure grew the WT_PAGE structure.
WT-3099 lint: static function declarations, non-text characters in documentation
WT-3100 test bug: format is weighted to delete, insert, then write operations.
WT-3104 Fix wtperf configs for eviction tests
WT-3105 Fix a deadlock caused by allocating eviction thread sessions dynamically
WT-3106 Add truncate support to command line wt utility
WT-3108 Also dump disk page size as part of metadata information
WT-3109 wording fix in transaction doc
WT-3110 Add more test cases for the WT command line utility
WT-3112 Handle list lock statistic not incremented in eviction server
WT-3114 Avoid archiving log files immediately after recovery
WT-3116 Python style testing in s_all may not execute correctly
WT-3118 Protect random-abort test against unexpectedly slow child start
WT-3121 In test suite create standard way to load extensions
WT-3127 bug: CPU yield calls don't necessarily imply memory barriers
WT-3134 Coverity scan reports 1368529 and 1368528
Branch: master
https://github.com/mongodb/mongo/commit/c91b93d2786342505fd9e151c8aa6b68ee03a1fb

Comment by Michael Cahill (Inactive) [ 04/Jan/17 ]

sue.loverso, can you please push the MongoDB side of this change into code review as well?

For testing, you will need to create a patch build that includes the latest WiredTiger develop branch.

Thanks!

Comment by Githook User [ 04/Jan/17 ]

Author: sueloverso <sue@mongodb.com>

Message: SERVER-16796 Recovery progress via verbose messages. (#3225)
Branch: develop
https://github.com/wiredtiger/wiredtiger/commit/5af64580f5be08d2f8900b96a83d29a3ae2cf04a

Comment by Jonathan Abrahams [ 11/Aug/15 ]

I had a recovery scenario (SERVER-19854) which took over 18 minutes without any log entries. The first time that happened I killed the mongod because I thought it was hung. This is an important issue to address.
