[SERVER-25118] MongoDB gets corrupted after docker container restart (WT_NOTFOUND) Created: 16/Jul/16  Updated: 05/Jul/17  Resolved: 20/Jul/16

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 3.3.9
Fix Version/s: None

Type: Bug
Priority: Critical - P2
Reporter: Jivan Roquet
Assignee: Unassigned
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File Screen Shot 2016-07-16 at 21.13.30.png     Text File mongodb_wt_notfound.txt    
Issue Links:
Duplicate
is duplicated by SERVER-29952 E STORAGE [initandlisten] WiredTiger... Closed
Operating System: ALL
Steps To Reproduce:
  • create a docker container from mongo:3.3.9
  • make a volume pointing to /data/db in the container
  • start the container, put some data in the db
  • stop the container
  • restart it
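
The steps above can be sketched as a small script. This is a minimal sketch, not the reporter's actual commands: the container name `mongo-repro`, the host directory `./data/db`, and the helper names `repro_commands` and `run_repro` are assumptions made for illustration. The "put some data in the db" step is left out because the report does not say how the data was written.

```python
import os
import subprocess

IMAGE = "mongo:3.3.9"        # image version named in the report
CONTAINER = "mongo-repro"    # hypothetical container name
HOST_DIR = "./data/db"       # hypothetical host directory backing the volume


def repro_commands(container=CONTAINER, host_dir=HOST_DIR, image=IMAGE):
    """Return the docker invocations for one start/stop/restart cycle."""
    # Bind-mount the host directory onto /data/db inside the container.
    volume = f"{os.path.abspath(host_dir)}:/data/db"
    return [
        ["docker", "run", "-d", "--name", container, "-v", volume, image],
        ["docker", "stop", container],
        ["docker", "start", container],
    ]


def run_repro(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in repro_commands():
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_repro(dry_run=True)
```

Note that `docker stop` sends SIGTERM and, after a grace period (10 seconds by default), SIGKILL, so whether mongod finishes shutting down cleanly within that window may matter here.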

 Description   

Sometimes, for no apparent reason, MongoDB fails to start with this error:

[initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
[initandlisten] wiredtiger_open config: create,cache_size=1G,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
[initandlisten] Assertion: 28595:-31803: WT_NOTFOUND: item not found
[initandlisten] exception in initAndListen: 28595 -31803: WT_NOTFOUND: item not found, terminating
[initandlisten] dbexit:  rc: 100

This happens after stopping the associated Docker container with `Ctrl+C` or `docker stop` and then restarting it.

The container has a volume to persist `/data/db` directory on the host.

Once again, most of the time (about 99%), stopping and restarting the container works perfectly and MongoDB has no problem with it. But sometimes I'm not able to restart the container because of the error shown above.
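
Because the failure is intermittent, it can help to detect it automatically in the startup output. The following is a minimal sketch that looks for the assertion line quoted in this report; the function name `find_startup_assertion` and the regular expression are illustrative assumptions based only on that line, not a documented MongoDB log format.

```python
import re

# Matches the fatal startup line quoted in this report, e.g.
# "[initandlisten] Assertion: 28595:-31803: WT_NOTFOUND: item not found"
ASSERTION_RE = re.compile(
    r"\[initandlisten\] Assertion: (?P<assertion>\d+):(?P<wt_code>-?\d+): "
    r"(?P<name>WT_\w+): (?P<message>.*)"
)


def find_startup_assertion(log_text):
    """Return details of the first matching startup assertion, or None."""
    for line in log_text.splitlines():
        match = ASSERTION_RE.search(line)
        if match:
            return {
                "assertion": int(match.group("assertion")),
                "wt_code": int(match.group("wt_code")),
                "name": match.group("name"),
                "message": match.group("message").strip(),
            }
    return None
```

The input could be the text captured from `docker logs` for the failing container; a `None` result means the quoted assertion line was not found.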

When this happens, the only fix that has "worked" so far is to empty the `data/db` directory on the host, which leaves me with an empty (but working) database.

I have to say that information about the `WT_NOTFOUND` error on the internet is quite sparse, and no amount of digging through MongoDB issues has solved this problem so far.



 Comments   
Comment by Ramon Fernandez Marina [ 20/Jul/16 ]

jivan, the error you're seeing indicates that vital files are missing from /data/db, which is typical of platforms that don't support fsync() calls correctly.

I'm not familiar with Docker internals, so I don't know what happens when the container is stopped or interrupted with ^C, but unless mongodb is safely shut down and all the fsync() calls honored then this behavior is not unexpected.
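
To illustrate what "honoring fsync() calls" means, here is a minimal sketch of a durable file write in Python. It is not MongoDB or WiredTiger code, and the helper name `durable_write` is hypothetical; it only shows the general pattern of flushing a file and its directory entry. On a platform or volume layer that does not implement fsync() correctly, these calls can return successfully while the data is still not on stable storage.

```python
import os


def durable_write(path, data):
    """Write bytes to path and ask the OS to flush them to stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()               # push Python's buffer to the OS
        os.fsync(f.fileno())    # ask the OS to push the file contents to disk

    # Also sync the containing directory so the new directory entry is
    # persisted (POSIX only; opening a directory this way fails on Windows).
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```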

Please note that the SERVER project is for reporting bugs or feature suggestions for the MongoDB server, and since the behavior you describe is not unexpected I'm going to close this ticket. For MongoDB-related support discussion please post on the mongodb-user group or Stack Overflow with the mongodb tag, where your question will reach a larger audience. A question like this involving more discussion would be best posted on the mongodb-user group. See also our Technical Support page for additional support resources.

Regards,
Ramón.

Comment by Jivan Roquet [ 16/Jul/16 ]

Some more information after digging into the GitHub repository of WiredTiger.

Looks like `WT_NOTFOUND` can be triggered either by:

I have to say that, since this error surfaces only in MongoDB's termination output, it is quite unclear what actually triggered it. Was a key not found in the Bloom filter, or was an environment variable missing?
