|
Hi iravanchi,
It sounds like you may be experiencing a different problem related to permissions on your volume. Can you please check the permissions for the user under which you run mongod to ensure that user has full access to the mongod data directory? If the permissions are correct, please open a new ticket, and when you do:
- Mention this ticket for reference
- Attach the complete log files for the node in question. (If you are concerned about sharing log files in the public project we can provide a private secure upload portal specifically for your issue; just mention that in the new ticket.)
- Attach a detailed recursive directory listing of the mongod data directory.
- Give us a little more detail about the timeline and the specific problem you are seeing: is it just the error message in the log file, or is there data loss or another symptom that impacts your application? If so, please provide the timeline of the symptoms you observed (including timezone).
Thanks,
Bruce
|
|
I'm observing this issue on 3.4.1, and I'm also seeing a line in the log that I thought might be related (rename issue):
Uncaught exception in 'FileRenameFailed: Access is denied' in full-time diagnostic data capture subsystem. Shutting down the full-time diagnostic data capture subsystem.
I got this a few minutes after re-initializing the instance (removing the data files) and restarting it, early during synchronization with another RS member (the node was in STARTUP2 state).
|
|
Author: Alex Gorrod (agorrod) <alexander.gorrod@mongodb.com>
Message: Import wiredtiger: f5c08e2b5f02805b062888d45c9eca19af175f7e from branch mongodb-3.2
ref: d48181f6f4..f5c08e2b5f
for: 3.2.13
SERVER-16796 Increase logging activity for journal recovery operations
SERVER-28168 Cannot start or repair mongodb after unexpected shutdown.
SERVER-28194 Missing WiredTiger.turtle file loses data
WT-2402 Misaligned structure accesses lead to undefined behavior
WT-2439 Enhance reconciliation page layout
WT-2771 Add a statistic to track per-btree dirty cache usage
WT-2790 Fix a text case false positive in test_sweep01
WT-2833 improvement: add projections to wt dump utility
WT-2898 Improve performance of eviction-heavy workloads by dynamically controlling the number of eviction threads
WT-2909 Create automatable test verifying checkpoint integrity after errors
WT-2978 Make WiredTiger python binding pip-compatible
WT-2990 checkpoint load live_open assertion failure
WT-2994 Create documentation describing page sizes and relationships
WT-3080 Python test suite: add timestamp or elapsed time for tests
WT-3082 Python test suite: shorten default run to avoid pull request timeouts.
WT-3083 Fix a bug in wtperf config dump
WT-3086 Add transaction state information to cache stuck diagnostic information
WT-3088 bug: Don't evict a page with refs visible to readers after a split
WT-3091 Add stats to test_perf0001
WT-3092 Quiet a warning from autogen.sh
WT-3093 Padding the WT_RWLOCK structure grew the WT_PAGE structure.
WT-3097 Race on reconfigure or shutdown can lead to waiting for statistics log server
WT-3099 lint: static function declarations, non-text characters in documentation
WT-3100 test bug: format is weighted to delete, insert, then write operations.
WT-3104 Fix wtperf configs for eviction tests
WT-3105 Fix a deadlock caused by allocating eviction thread sessions dynamically
WT-3106 Add truncate support to command line wt utility
WT-3108 Also dump disk page size as part of metadata information
WT-3109 wording fix in transaction doc
WT-3110 Add more test cases for the WT command line utility
WT-3111 util_create() doesnt free memory assigned to "uri"
WT-3112 Handle list lock statistic not incremented in eviction server
WT-3113 Add a verbose mode to dump the cache when eviction is stuck
WT-3114 Avoid archiving log files immediately after recovery
WT-3115 Change the dhandle lock to a read/write lock
WT-3116 Python style testing in s_all may not execute correctly
WT-3118 Protect random-abort test against unexpectedly slow child start
WT-3120 Fix ordering problem in connection_close for filesystem loaded in an extension
WT-3121 In test suite create standard way to load extensions
WT-3126 bug: dist/s_all script has misplaced quote causing bad error reporting
WT-3127 bug: CPU yield calls don't necessarily imply memory barriers
WT-3128 wt printlog returns operation-not-supported if it doesn't find any log files
WT-3130 Ensure extensions have access to database home directory
WT-3134 Coverity scan reports 1368529 and 1368528
WT-3135 search_near() for index with custom collator
WT-3136 bug fix: WiredTiger doesn't check sprintf calls for error return
WT-3137 Hang in _log_slot_join/_log_slot_switch_internal
WT-3139 Enhance wtperf to support periodic table scans
WT-3144 bug fix: random cursor returns not-found when descending to an empty page
WT-3148 Improve eviction efficiency with many small trees
WT-3149 Change eviction to start new walks from a random place in the tree
WT-3150 Reduce impact of checkpoints on eviction server
WT-3152 Convert table lock from a spinlock to a read write lock
WT-3155 Remove WT_CONN_SERVER_RUN flag
WT-3156 Assertion in log_write fires after write failure
WT-3157 checkpoint/transaction integrity issue when writes fail.
WT-3159 Incorrect key for index containing multiple variable sized entries
WT-3161 checkpoint hang after write failure injection.
WT-3164 Ensure all relevant btree fields are reset on checkpoint error
WT-3170 Clear the eviction walk point while populating from a tree
WT-3173 Add runtime detection for s390x CRC32 hardware support
WT-3174 Coverity/lint cleanup
WT-3175 New hang in internal page split
WT-3179 test bug: clang sanitizer failure in fail_fs
WT-3180 fault injection tests should only run as "long" tests and should not create core files
WT-3182 Switch make-check to run the short test suite by default
WT-3184 Problem duplicating index cursor with custom collator
WT-3186 Fix error path and panic detection in logging loops
WT-3187 Hang on shutdown with a busy cache pool
WT-3188 Fix error handling in logging where fatal errors could lead to a hang
WT-3189 Fix a segfault in the eviction server random positioning
WT-3190 Enhance eviction thread auto-tuning algorithm
WT-3191 lint
WT-3193 Close a race between verify opening a handle and eviction visiting it
WT-3196 Race with LSM and eviction when switching chunks
WT-3199 bug: eviction assertion failure
WT-3202 wtperf report an error on in_memory=true mode : No such file or directory
WT-3203 bulk-load state changes can race
WT-3204 eviction changes cost LSM performance
WT-3206 bug: core dump on NULL page index
WT-3207 Drops with checkpoint_wait=false should not wait for checkpoints
WT-3208 test format hung with 9mb cache
WT-3211 WT_CURSOR.remove cannot always retain its position.
WT-3212 'wt dump' crashes when given table with unknown collator
WT-3213 generated test/format CONFIG invalid on next run
WT-3216 add support for clang-tidy
WT-3218 unexpected checkpoint ordering failures
WT-3224 LSM assertion failure pindex->entries == 1
WT-3225 WiredTiger won't build with clang on CentOS 7.3.1611
WT-3227 Python test suite inserts unnecessary whitespace in error output.
WT-3228 Remove with overwrite shouldn't return WT_NOTFOUND
WT-3234 Update WiredTiger build for clang 4.0.
WT-3238 Java: Cursor.compare and Cursor.equals throw Exceptions for valid return values
WT-3240 Coverity reports
WT-3243 Reorder log slot release so joins don't wait on IO
WT-3244 metadata operations failing in in-memory configurations
WT-3249 Unit test test_readonly fails as it is unable to open WiredTiger.lock
WT-3250 Incorrect statistics incremented on Windows
WT-3254 test_reconfig02 uses incorrect configuration string
WT-3262 Schema operations shouldn't wait for cache
WT-3265 Verify hits assertion in eviction when transiting handle to exclusive mode
WT-3271 Eviction tuning stuck in a loop
WT-98 Update the current cursor value without a search
Branch: v3.2
https://github.com/mongodb/mongo/commit/e5de3702c1dd8257c6289869d2cbd8b014221808
|
|
Author: Keith Bostic (keithbostic) <keith.bostic@mongodb.com>
Message: SERVER-28194 Missing WiredTiger.turtle file loses data (#3337)
There's a two step process on Windows to rename files (including the turtle file), remove the original and then move the replacement into place – a DeleteFileW followed by a MoveFileW. If we crash in the middle (and in SERVER-28194, it looks like there's a weirder failure mode, where the DeleteFileW succeeded, but the file was still there), we can be left without a turtle file, which will lose all of the data in the database.
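The gap described above can be illustrated with a minimal shell sketch. It is only an analogy on a POSIX system, not the Windows code path and not the actual WiredTiger change; the file name `example.turtle` and the `checkpoint-N` contents are hypothetical stand-ins. It contrasts a remove-then-move replacement, where the target briefly does not exist, with a single-step `mv` over the existing file, which on the same filesystem is a rename that replaces the target in one operation.

```shell
#!/bin/sh
set -eu

dir=$(mktemp -d)
target="$dir/example.turtle"   # hypothetical stand-in, not a real WiredTiger file
printf 'checkpoint-1\n' > "$target"

# Two-step replace: remove the original, then move the replacement into place.
# A crash between the two commands would leave no target file at all.
printf 'checkpoint-2\n' > "$dir/new.tmp"
rm "$target"
if [ -e "$target" ]; then gap="present"; else gap="missing"; fi
mv "$dir/new.tmp" "$target"

# Single-step replace: mv onto an existing file on the same filesystem
# renames over it, so the target name always refers to the old or new file.
printf 'checkpoint-3\n' > "$dir/new.tmp"
mv -f "$dir/new.tmp" "$target"
final=$(cat "$target")

echo "state between rm and mv: $gap; final content: $final"
rm -rf "$dir"
```

The `gap` variable records what an observer (or a recovery after a crash) would see in the middle of the two-step variant.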
|
|
Author: Alex Gorrod (agorrod) <alexander.gorrod@mongodb.com>
Message: Import wiredtiger: cb16839cfbdf338af95bed43ca40979ae6e32f54 from branch mongodb-3.4
ref: cc2f15f595..cb16839cfb
for: 3.4.4
SERVER-28168 Cannot start or repair mongodb after unexpected shutdown.
SERVER-28194 Missing WiredTiger.turtle file loses data
WT-2439 Enhance reconciliation page layout
WT-2978 Make WiredTiger python binding pip-compatible
WT-2990 checkpoint load live_open assertion failure
WT-3136 bug fix: WiredTiger doesn't check sprintf calls for error return
WT-3155 Remove WT_CONN_SERVER_RUN flag
WT-3182 Switch make-check to run the short test suite by default
WT-3190 Enhance eviction thread auto-tuning algorithm
WT-3191 lint
WT-3193 Close a race between verify opening a handle and eviction visiting it
WT-3196 Race with LSM and eviction when switching chunks
WT-3199 bug: eviction assertion failure
WT-3202 wtperf report an error on in_memory=true mode : No such file or directory
WT-3203 bulk-load state changes can race
WT-3204 eviction changes cost LSM performance
WT-3206 bug: core dump on NULL page index
WT-3207 Drops with checkpoint_wait=false should not wait for checkpoints
WT-3208 test format hung with 9mb cache
WT-3211 WT_CURSOR.remove cannot always retain its position.
WT-3212 'wt dump' crashes when given table with unknown collator
WT-3213 generated test/format CONFIG invalid on next run
WT-3216 add support for clang-tidy
WT-3218 unexpected checkpoint ordering failures
WT-3224 LSM assertion failure pindex->entries == 1
WT-3225 WiredTiger won't build with clang on CentOS 7.3.1611
WT-3227 Python test suite inserts unnecessary whitespace in error output.
WT-3228 Remove with overwrite shouldn't return WT_NOTFOUND
WT-3234 Update WiredTiger build for clang 4.0.
WT-3238 Java: Cursor.compare and Cursor.equals throw Exceptions for valid return values
WT-3240 Coverity reports
WT-3243 Reorder log slot release so joins don't wait on IO
WT-3244 metadata operations failing in in-memory configurations
WT-3249 Unit test test_readonly fails as it is unable to open WiredTiger.lock
WT-3250 Incorrect statistics incremented on Windows
WT-3254 test_reconfig02 uses incorrect configuration string
WT-3262 Schema operations shouldn't wait for cache
WT-3265 Verify hits assertion in eviction when transiting handle to exclusive mode
WT-3271 Eviction tuning stuck in a loop
WT-98 Update the current cursor value without a search
Branch: v3.4
https://github.com/mongodb/mongo/commit/9c2e3c5396adb6bbaaf6a19e6c017b051f943ebf
|
|
Author: Alex Gorrod (agorrod) <alexander.gorrod@mongodb.com>
Message: Import wiredtiger: af735d14a603a6ef6256a6685f09ec13755a5024 from branch mongodb-3.6
ref: cc2f15f595..af735d14a6
for: 3.5.6
SERVER-28168 Cannot start or repair mongodb after unexpected shutdown.
SERVER-28194 Missing WiredTiger.turtle file loses data
WT-2439 Enhance reconciliation page layout
WT-2978 Make WiredTiger python binding pip-compatible
WT-2990 Fix a new bug where checkpoint load live_open failed
WT-3136 bug fix: WiredTiger doesn't check sprintf calls for error return
WT-3155 Remove WT_CONN_SERVER_RUN flag
WT-3182 Switch make-check to run the short test suite by default
WT-3190 Enhance eviction thread auto-tuning algorithm
WT-3191 Fix lint complaints
WT-3193 Close a race between verify opening a handle and eviction visiting it
WT-3196 Race with LSM and eviction when switching chunks
WT-3199 bug: eviction assertion failure
WT-3202 wtperf report an error on in_memory=true mode : No such file or directory
WT-3203 bulk-load state changes can race
WT-3204 eviction changes cost LSM performance
WT-3206 bug: core dump on NULL page index
WT-3207 Drops with checkpoint_wait=false should not wait for checkpoints
WT-3208 test format hung with 9mb cache
WT-3211 WT_CURSOR.remove cannot always retain its position.
WT-3212 'wt dump' crashes when given table with unknown collator
WT-3213 generated test/format CONFIG invalid on next run
WT-3216 add support for clang-tidy
WT-3218 unexpected checkpoint ordering failures
WT-3224 LSM assertion failure pindex->entries == 1
WT-3225 WiredTiger won't build with clang on CentOS 7.3.1611
WT-3227 Python test suite inserts unnecessary whitespace in error output.
WT-3228 Remove with overwrite shouldn't return WT_NOTFOUND
WT-3234 Update WiredTiger build for clang 4.0.
WT-3238 Java: Cursor.compare and Cursor.equals throw Exceptions for valid return values
WT-3240 Coverity reports
WT-3243 Reorder log slot release so joins don't wait on IO
WT-3244 Metadata operations failing in in-memory configurations when the cache is full
WT-98 Update the current cursor value without a search
Branch: master
https://github.com/mongodb/mongo/commit/f6cbdfb8c5c52209f58562ccbe14013c72df3f40
|
"It means that there's still a chance to lose data for the same issue?"
We do not believe there's any chance to lose data for the same issue. We're keeping the window of vulnerability as short as possible as a defensive measure, just in case we're wrong!
|
"so the window of vulnerability is as short as possible"
It means that there's still a chance to lose data for the same issue?
|
|
The fix for this issue has been merged into WiredTiger's develop branch; it will be in the next development release of MongoDB.
|
|
Thanks Mark,
I've uploaded two files. The original files were compressed with 7z (mongo runs on Windows); then, to stay under the 5 GB limit, I split the result into two files with tar. The end result is a tar archive split into two files, containing a 7zip file with the data files and the full mongo log file (maybe the log file could be of some help).
I hope that this is ok.
Thanks a lot for the support.
Gian Maria.
|
|
Hello alkampfer,
I've generated a secure upload portal for you to send us the data. Note that there is a 5GB file size limit, however if your data is greater than 5GB this limitation is easy to work around:
split -d -b 5300000000 filename.tgz part.
|
This will produce a series of part.XX where XX is a number; you can then upload these files via the secure portal and we'll stitch them back together.
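For reference, a minimal sketch of the full round trip, assuming GNU `split` (for the `-d` flag) and using a small scratch file and a 300000-byte chunk size as hypothetical stand-ins for the real archive and the multi-gigabyte size above. It shows how the parts can be concatenated back together and checked against the original.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)

# Hypothetical demo input: 1,000,000 random bytes stand in for filename.tgz.
head -c 1000000 /dev/urandom > "$workdir/filename.tgz"

# Split into numbered chunks: part.00, part.01, ...
split -d -b 300000 "$workdir/filename.tgz" "$workdir/part."

# Count the chunks that were produced.
set -- "$workdir"/part.*
parts=$#

# Reassemble: the glob expands in sorted order, which matches the
# two-digit numeric suffixes generated by split -d.
cat "$workdir"/part.* > "$workdir/restored.tgz"

# Verify the round trip is byte-for-byte identical.
if cmp -s "$workdir/filename.tgz" "$workdir/restored.tgz"; then
    result="match"
else
    result="mismatch"
fi

echo "$parts parts, round trip: $result"
rm -rf "$workdir"
```

Comparing a checksum of the original with a checksum of the reassembled file serves the same purpose when the two copies live on different machines.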
Please note that while we will attempt to restore the data, there is no guarantee that it will be successful.
Thanks,
Mark
|
|
Hi Mark,
As far as I know, the database was not restored from a backup, but if it was, it was certainly done with a mongodump followed by a mongorestore. We never manipulate the data directory directly and we never saved the data directory as a backup; we always use mongodump. There are other people who could access that server, but I strongly believe that they never touched anything.
The file backup is approximately 5 GB. If I need to upload it to you, do you have a secure way of doing so, or can I privately give you FTP access or some other mechanism to transfer the file?
Also, for future reference, is there any tool that can read the raw content of collection .wt files and extract documents? From what I saw, the wt.exe utility can dump data only if the WiredTiger.wt file is OK.
Thanks a lot for the help.
Gian Maria
|
|
Hello alkampfer,
Thank you for the report. Looking over the output you've provided, there may be a few causes of the behavior you're seeing. To better investigate the root of the issue, there are a couple things I'd like to clarify:
- Was the database restored from a backup at any point?
- If it was, what was the method involved (mongorestore, manual file copy, etc.)?
- Was there any manipulation of the database files directly?
With regards to the repair, we can attempt to recover the database, but first we will need to narrow down what caused the failure in the first place. Additionally, provided the cause of the issue becomes apparent, you would need to upload the entire database for us to attempt a recovery.
Thanks,
Mark
|
|
I was pretty sure I chose "Bug" when reporting the issue, but it seems it was filed as a new feature. I cannot change the issue type myself; could an admin please change it?
Thanks a lot.
|