[SERVER-37724] Replication recovery: fassert when starting up mongod Created: 24/Oct/18 Updated: 03/Jan/19 Resolved: 14/Nov/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.0.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Zoltan Mark Bodor [X] | Assignee: | Danny Hatcher (Inactive) |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Operating System: | ALL |
| Participants: |
| Description |
|
It's a replica set running version 4.0.3 in Kubernetes. I would like both to find out how to recover from this and to understand how it ended up in this state in the first place.

My understanding is that after any applied operation (on the primary), that change is reflected in the oplog as well. I believe these happen atomically, so the client gets an OK write response only if both succeed. However, this does not mean the data is persisted yet (nor the operations in the journal, if they amount to less than 128 KB [1]). It is persisted when the next checkpoint kicks in. So when a hard shutdown occurs and there is some data in the journal buffer, that data is basically lost. This is in line with the docs: 'In between write operations, while the journal records remain in the WiredTiger buffers, updates can be lost following a hard shutdown of mongod.'

Going back to the log: `Applied op { : Timestamp(1539695503, 1) } not found. Top of oplog is { : Timestamp(1539695466, 1) }`. Now, I'm not 100% sure where the 'applied op' comes from (is it from the last checkpoint? probably yes), but it is an operation that is newer than the top of the oplog, which means that when it was persisted, the related document in the oplog wasn't. My rough guess is that the hard shutdown occurred exactly at the point of the checkpoint, which then succeeded only partially.

I've been a MongoDB user for years now, but never really had the need to dig deeper into its workings. Any help is greatly appreciated!
```
...
2018-10-24T09:08:43.547+0000 I FTDC [initandlisten] Initializing full-time diagnostic data capture with directory '/data/db/diagnostic.data'
***aborting after fassert() failure
```
[1] https://docs.mongodb.com/manual/core/journaling/#journaling-process |
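A minimal sketch of the acknowledgment-vs-durability gap described above, using pymongo; the connection string, database, and collection names are illustrative, not taken from this cluster:

```python
# Sketch only: illustrates the gap between an acknowledged write and a journaled write.
# The URI, database, and collection names are illustrative.
from pymongo import MongoClient, WriteConcern

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
db = client.get_database("test")

# Default write concern (w=1): acknowledged once applied in memory on the primary;
# the change can still be lost if mongod is hard-killed before the next journal flush.
db.orders.insert_one({"sku": "abc", "qty": 1})

# j=True: the acknowledgment additionally waits for the write to reach the on-disk
# journal, so it survives a hard shutdown of the primary.
journaled = db.get_collection("orders", write_concern=WriteConcern(w=1, j=True))
journaled.insert_one({"sku": "abc", "qty": 1})
```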
| Comments |
| Comment by Joachim Aumann [ 03/Jan/19 ] |
|
Hi Daniel, yes, my issue is also within a Kubernetes cluster. After I rolled back (5 days ago) to MongoDB 3.6, I have not seen the problem again. To your questions:

What is the configuration of the replica set? This is the configuration we use now; before, we used the latest mongo Docker image, which caused the above-mentioned error.

Are your journal and data directories on different disks? Only /data/db is mounted to a different disk; I don't know where the journal is placed.

When you say "hard shutdown", what exactly happened in this situation? The pod inside the Kubernetes cluster was recreated and the disk stayed unchanged. This had happened multiple times before without any issue.

Please provide the mongod logs for all the nodes in the replica set covering the timeframe before the crash so we can take a look. Unfortunately, as this was a critical moment and the system was not working anymore, I have not stored the logs.
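On the "I don't know where the journal is placed" point: by default the journal directory lives under the mongod dbPath (i.e. /data/db/journal when dbPath is /data/db), so mounting only /data/db still keeps the journal on that disk. A small sketch, assuming pymongo and an illustrative connection string, that reads the configured dbPath back from the server:

```python
# Sketch only: confirm the configured dbPath (the journal sits under <dbPath>/journal
# by default). The connection string is illustrative.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
opts = client.admin.command("getCmdLineOpts")
# "parsed" may omit storage.dbPath if the compiled-in default (/data/db) is used.
print(opts["parsed"].get("storage", {}).get("dbPath", "/data/db (default)"))
```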
|
| Comment by Danny Hatcher (Inactive) [ 02/Jan/19 ] |
|
Hello Joachim, The error listed here and in that Stack Overflow post states that the newest entry in the oplog is older than the last operation applied by the storage engine, which should not occur. Is your issue also within a Kubernetes cluster? If so, it may be related to how Kubernetes is configured. For us to troubleshoot on the MongoDB side, please respond to the questions I posed in my previous comment and we can investigate further. Thank you, Danny |
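A small sketch of how those two timestamps can be inspected on a node that does start, assuming pymongo and an illustrative connection string: the "top of the oplog" is simply the newest document in local.oplog.rs, and replSetGetStatus reports the applied and durable optimes the server believes it has.

```python
# Sketch only: inspect the newest oplog entry ("top of oplog") and the optimes
# reported by the server. Connection string is illustrative.
from pymongo import DESCENDING, MongoClient

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")

top = client.local["oplog.rs"].find_one(sort=[("$natural", DESCENDING)])
print("top of oplog:", top["ts"])

status = client.admin.command("replSetGetStatus")
print("applied optime:", status["optimes"]["appliedOpTime"]["ts"])
print("durable optime:", status["optimes"]["durableOpTime"]["ts"])
```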
| Comment by Joachim Aumann [ 27/Dec/18 ] |
|
I have the same issue described in this Stack Overflow post: https://stackoverflow.com/questions/53835384/mongodb-statefulset-on-kubernetes-is-not-working-anymore-after-kubernetes-update
|
| Comment by Danny Hatcher (Inactive) [ 26/Oct/18 ] |
|
Hello Zoltan, Your understanding of the way writes work is mostly correct. In 4.0 we made a change to the journaling process so that it follows the oplog rather than the data files themselves; that may be related to what is happening here. This situation is certainly unexpected. Could you provide more context about the configuration of your cluster and the situation preceding the shutdown?
Thanks, Danny |
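For gathering the configuration context requested above, a hedged sketch (pymongo, illustrative connection string) that dumps the replica set configuration and status so they can be attached to the ticket:

```python
# Sketch only: dump replica set config and status for attaching to the ticket.
# Connection string is illustrative.
from bson.json_util import dumps
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
config = client.admin.command("replSetGetConfig")["config"]
status = client.admin.command("replSetGetStatus")

print(dumps(config, indent=2))
print(dumps(status, indent=2))
```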