[SERVER-40591] Do not log WiredTiger EINVAL errors at the error level Created: 11/Apr/19  Updated: 27/Oct/23  Resolved: 10/Jun/19

Status: Closed
Project: Core Server
Component/s: Logging, Storage
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Daniel Gottlieb (Inactive) Assignee: Louis Williams
Resolution: Gone away Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
depends on WT-4844 Only log an informational message whe... Closed
Duplicate
is duplicated by SERVER-38871 suppress or alter E error message whe... Closed
Backport Requested:
v4.2
Sprint: Execution Team 2019-06-03, Execution Team 2019-06-17
Participants:
Linked BF Score: 0

 Description   

Storage code specially handles "snapshot too old" because it's an expected error that queries can race with. In those cases, WT will also log a message.

However, MongoDB logs that message at the error level, which is not ideal for an expected condition.



 Comments   
Comment by Louis Williams [ 10/Jun/19 ]

Closing because WT-4844 eliminates error logging of these messages.

Comment by Alexander Gorrod [ 03/Jun/19 ]

keith.bostic Could you please do a review of the proposed change here?

Comment by David Daly [ 30/May/19 ]

That would make the error go away. If that is a reasonable change, then please go for it. 

Comment by Louis Williams [ 30/May/19 ]

david.daly Would lowering the log level to WARN fix the performance test problems?

Comment by Daniel Gottlieb (Inactive) [ 17/Apr/19 ]

That's a good but complicated question. We're not entirely sure right now what the right thing to do with this problem is. But I do think we can lower-bound an estimate based on the following (I believe agreed upon) assumptions:

  • It's expected for code in src/mongo/db/... to call into src/third_party/wiredtiger/... with a read timestamp that is no longer valid. MongoDB gracefully handles the error that's programmatically returned.
  • WiredTiger will always log an error in this scenario. Changing that behavior will require waiting for WT work to be scheduled, pushed and dropped.
  • A MongoDB-only solution, one that guarantees it never races calls into WT that could invalidate an incoming read timestamp, is possible, but there's a risk it would be prohibitively expensive and more error prone, and it would likely be backed out when a more cooperative solution that requires some WT change becomes available.

If you need something sooner than ~2 weeks (the estimate to get WT work to happen and dropped) to stem the false positives for the perf test failures, I'd recommend exploring a solution that omits failing on that particular log message.
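As a rough illustration of that stopgap, here is a minimal sketch of a log check that ignores the expected message. Everything in it is an assumption made for illustration: the function name, the log line layout (timestamp, then a one-letter severity with "E" for errors), and the exact wording matched by the pattern are hypothetical and not taken from the actual perf test harness.

```python
import re

# Assumption: the expected WiredTiger message contains this phrase.
# The real wording may differ; adjust the pattern to the observed log line.
EXPECTED_ERROR_PATTERNS = [re.compile(r"snapshot too old", re.IGNORECASE)]

# Assumption: log lines look like "<timestamp> <severity> <component> ...",
# where severity "E" marks an error-level entry.
ERROR_LINE = re.compile(r"^\S+\s+E\s+")


def unexpected_error_lines(log_lines):
    """Return error-level lines that match none of the expected patterns."""
    unexpected = []
    for line in log_lines:
        if not ERROR_LINE.match(line):
            continue  # not an error-level entry
        if any(p.search(line) for p in EXPECTED_ERROR_PATTERNS):
            continue  # expected race, do not count it as a failure
        unexpected.append(line)
    return unexpected
```

A test harness could then fail only when `unexpected_error_lines(...)` returns a non-empty list, so the "snapshot too old" entries stop producing false positives while other errors still do.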

Comment by David Bradford (Inactive) [ 17/Apr/19 ]

We are seeing this log message really frequently in our performance tests, which is causing them to be marked as failures. Is there any idea of when this work might get scheduled?
