[SERVER-84108] Unanticipated WiredTiger History Store File Growth Observed During Failover Created: 12/Dec/23 Updated: 07/Feb/24 |
|
| Status: | Needs Verification |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | venkataramans rama | Assignee: | Backlog - Triage Team |
| Resolution: | Unresolved | Votes: | 7 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Assigned Teams: |
Server Triage
|
| Operating System: | ALL |
| Participants: |
| Description |
|
Problem Statement:
We are currently migrating from MongoDB 4.4 to 5.0, and during this process we have encountered some observations that deviate from our anticipated outcomes. Despite consulting various forums, existing MongoDB JIRA tickets, and documentation, we have been unable to pinpoint concrete reasons for these unexpected findings. We have a 5-member replica set (4 data-bearing members and 1 ARBITER), with 2 members in each data center.
When one of the data centers goes down, we see that WiredTigerHS.wt, stored in the dbPath, starts growing. Even though we set minSnapshotHistoryWindowInSeconds to 1 second, the file grows every second. Once it reaches the oplog size, both the oplog and this file grow simultaneously and fill up the disk fast. It depends on the workload, but in our case the disk fills within 5-10 minutes.
Once a majority of members is back up, we see WiredTigerHS.wt return to normal (less than 5 MB).
We have the following question.
Out of 5 members, 2 can still go down per MongoDB's fault tolerance, and we expect no impact on the running primary and secondary. But starting from 5.0, due to the history store and oplog growth, we see that MongoDB cannot tolerate a period where the majority is not available. Note that we set the defaultWriteConcern to 1 (the default changed to majority starting in 5.0).
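For reference, this is roughly how the parameter and the on-disk history store can be checked on a member (a minimal sketch only; the port and dbPath follow the runtime parameters listed at the end of this description, the serverStatus stat name can vary by version, and authentication options are omitted):
# Confirm the snapshot history window on this node
mongo --port 27717 --quiet --eval 'printjson(db.adminCommand({ getParameter: 1, minSnapshotHistoryWindowInSeconds: 1 }))'
# WiredTiger's view of the history store footprint
mongo --port 27717 --quiet --eval 'printjson(db.serverStatus().wiredTiger.cache["history store table on-disk size"])'
# The file itself in the dbPath
ls -lh /var/data/sessions.1/e/WiredTigerHS.wt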
Before failover:
=========sessionmgr09==============
{ "ns" : "local.oplog.rs", "size" : NumberLong("5348387447"), "count" : 1280626, "avgObjSize" : 4176, "storageSize" : NumberLong("6390153216"), "freeStorageSize" : 1371807744, "capped" : true, "max" : 0, "maxSize" : NumberLong("5368709120"),
total 9.7G
tmpfs 95G 36G 59G 38% /var/data/sessions.1
=========sessionmgr10==============
{ "ns" : "local.oplog.rs", "size" : NumberLong("5360045160"), "count" : 1283416, "avgObjSize" : 4176, "storageSize" : NumberLong("5466677248"), "freeStorageSize" : 488841216, "capped" : true, "max" : 0, "maxSize" : NumberLong("5368709120"),
total 8.8G
tmpfs 95G 36G 59G 38% /var/data/sessions.1
Once the 2 members were brought down
=========sessionmgr09==============
{ "ns" : "local.oplog.rs", "size" : NumberLong("5362478203"), "count" : 1284154, "avgObjSize" : 4175, "storageSize" : NumberLong("6390153216"), "freeStorageSize" : 1267474432, "capped" : true, "max" : 0, "maxSize" : NumberLong("5368709120"),
total 11G
tmpfs 95G 37G 58G 40% /var/data/sessions.1
=========sessionmgr10==============
{ "ns" : "local.oplog.rs", "size" : NumberLong("5319883881"), "count" : 1273920, "avgObjSize" : 4175, "storageSize" : NumberLong("5466677248"), "freeStorageSize" : 307101696, "capped" : true, "max" : 0, "maxSize" : NumberLong("5368709120"),
total 9.7G
tmpfs 95G 37G 58G 39% /var/data/sessions.1
Within 15 minutes WiredTigerHS.wt grows by several GB. Note that the oplog still remains the same size.
=========sessionmgr09==============
{ "ns" : "local.oplog.rs", "size" : NumberLong("5809646773"), "count" : 1391589, "avgObjSize" : 4174, "storageSize" : NumberLong("6390140928"), "freeStorageSize" : 1050791936, "capped" : true, "max" : 0, "maxSize" : NumberLong("5368709120"),
total 15G
tmpfs 95G 42G 53G 45% /var/data/sessions.1
=========sessionmgr10==============
{ "ns" : "local.oplog.rs", "size" : NumberLong("5815976272"), "count" : 1393115, "avgObjSize" : 4174, "storageSize" : NumberLong("5466664960"), "freeStorageSize" : 70377472, "capped" : true, "max" : 0, "maxSize" : NumberLong("5368709120"),
total 14G
tmpfs 95G 41G 54G 44% /var/data/sessions.1
Once WiredTigerHS.wt reaches the oplog size, the oplog also starts increasing on sessionmgr10.
=========sessionmgr09==============
{ "ns" : "local.oplog.rs", "size" : NumberLong("6104465326"), "count" : 1462085, "avgObjSize" : 4175, "storageSize" : NumberLong("6390140928"), "freeStorageSize" : 876965888, "capped" : true, "max" : 0, "maxSize" : NumberLong("5368709120"),
total 16G
tmpfs 95G 42G 53G 45% /var/data/sessions.1
=========sessionmgr10==============
{ "ns" : "local.oplog.rs", "size" : NumberLong("6111182492"), "count" : 1463674, "avgObjSize" : 4175, "storageSize" : NumberLong("5601685504"), "freeStorageSize" : 17518592, "capped" : true, "max" : 0, "maxSize" : NumberLong("5368709120"),
total 15G
tmpfs 95G 41G 54G 44% /var/data/sessions.1
Now sessionmgr09's oplog also starts growing as WiredTigerHS.wt reaches the oplog size.
=========sessionmgr09==============
{ "ns" : "local.oplog.rs", "size" : NumberLong("6942499375"), "count" : 1662602, "avgObjSize" : 4175, "storageSize" : NumberLong("6490021888"), "freeStorageSize" : 16769024, "capped" : true, "max" : 0, "maxSize" : NumberLong("5368709120"),
total 17G
tmpfs 95G 43G 52G 45% /var/data/sessions.1
=========sessionmgr10==============
{ "ns" : "local.oplog.rs", "size" : NumberLong("6948652674"), "count" : 1664076, "avgObjSize" : 4175, "storageSize" : NumberLong("6466686976"), "freeStorageSize" : 14090240, "capped" : true, "max" : 0, "maxSize" : NumberLong("5368709120"),
total 16G
tmpfs 95G 43G 53G 45% /var/data/sessions.1
Within a few minutes
=========sessionmgr09===Mon Dec 11 21:16:42 UTC 2023===========
{ "ns" : "local.oplog.rs", "size" : NumberLong("5315937817"), "count" : 1274037, "avgObjSize" : 4172, "storageSize" : NumberLong("9466707968"), "freeStorageSize" : NumberLong("4267544576"), "capped" : true, "max" : 0, "maxSize" : NumberLong("5368709120"),
total 21G
tmpfs 95G 47G 48G 50% /var/data/sessions.1
=========sessionmgr10===Mon Dec 11 21:16:43 UTC 2023===========
{ "ns" : "local.oplog.rs", "size" : NumberLong("5320907873"), "count" : 1275216, "avgObjSize" : 4172, "storageSize" : NumberLong("9433522176"), "freeStorageSize" : NumberLong("4194209792"), "capped" : true, "max" : 0, "maxSize" : NumberLong("5368709120"),
total 21G
tmpfs 95G 48G 47G 51% /var/data/sessions.1
After bringing the down members back, WiredTigerHS.wt on the secondary sessionmgr10 reduced to 4 MB. After some time, WiredTigerHS.wt on sessionmgr09 (the PRIMARY) also reduced to 4 MB.
=========sessionmgr09===Mon Dec 11 21:22:16 UTC 2023===========
{ "ns" : "local.oplog.rs", "size" : NumberLong("5366640045"), "count" : 1286026, "avgObjSize" : 4173, "storageSize" : NumberLong("9466695680"), "freeStorageSize" : NumberLong("4469882880"), "capped" : true, "max" : 0, "maxSize" : NumberLong("5368709120"),
total 21G
tmpfs 95G 47G 48G 50% /var/data/sessions.1
=========sessionmgr10===Mon Dec 11 21:22:17 UTC 2023===========
{ "ns" : "local.oplog.rs", "size" : NumberLong("5367983767"), "count" : 1286360, "avgObjSize" : 4173, "storageSize" : NumberLong("9433522176"), "freeStorageSize" : NumberLong("4348190720"), "capped" : true, "max" : 0, "maxSize" : NumberLong("5368709120"),
total 13G
tmpfs 95G 39G 56G 42% /var/data/sessions.1
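For reference, the numbers above can be collected with a simple loop along these lines (a sketch only; the port and dbPath follow the runtime parameters below, authentication options omitted):
while true; do
  date
  # On-disk history store file in the dbPath
  ls -lh /var/data/sessions.1/e/WiredTigerHS.wt
  # Oplog footprint as reported by collection stats
  mongo --port 27717 --quiet --eval 'var s = db.getSiblingDB("local").oplog.rs.stats(); printjson({ size: s.size, storageSize: s.storageSize, freeStorageSize: s.freeStorageSize, maxSize: s.maxSize })'
  # Overall filesystem usage
  df -h /var/data/sessions.1
  sleep 60
done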
Runtime parameters:
/usr/bin/mongod --ipv6 --slowms 500 --storageEngine wiredTiger --wiredTigerCacheSizeGB 12 --setParameter oplogFetcherUsesExhaust=false --setParameter wiredTigerConcurrentReadTransactions=16 --setParameter wiredTigerConcurrentWriteTransactions=16 --setParameter maxIndexBuildMemoryUsageMegabytes=500 --bind_ip_all --keyFile /root/.dbkey --port 27717 --dbpath=/var/data/sessions.1/e --replSet set01e --fork --pidfilepath /var/run/sessionmgr-27717.pid --oplogSize 5120 --logpath /var/log/mongodb-27717.log --logappend --quiet |
| Comments |
| Comment by venkataramans rama [ 07/Feb/24 ] |
|
Hi Team,
I believe this issue happens because MongoDB checks the history window against the stableTimestamp only. Remember that when a majority of the members is not available, the stableTimestamp/majority-committed time does not move forward. I am not sure whether the logic in calculateHistoryLagFromStableTimestamp should also consider the current timestamp instead of the stableTimestamp. I would appreciate your response on this long-pending issue. |
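(For what it's worth, the stalled stable point is easy to observe while the members are down. A minimal sketch, authentication omitted; lastStableRecoveryTimestamp stays flat while appliedOpTime keeps advancing:)
mongo --port 27717 --quiet --eval 'var s = rs.status(); printjson({ lastStableRecoveryTimestamp: s.lastStableRecoveryTimestamp, readConcernMajorityOpTime: s.optimes.readConcernMajorityOpTime, appliedOpTime: s.optimes.appliedOpTime })'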
| Comment by venkataramans rama [ 24/Jan/24 ] |
|
Hi Team,
Requesting your response on this, please. It has been more than a month of waiting for your initial response. |
| Comment by venkataramans rama [ 16/Jan/24 ] |
|
Hi Team, looking for your initial analysis, as this is blocking our migration to Mongo 5.0. We appreciate your reply on this. |
| Comment by venkataramans rama [ 20/Dec/23 ] |
|
Team, do you have any update on this, please? |
| Comment by venkataramans rama [ 12/Dec/23 ] |
|
The repro is very simple: set up a PSA-style replica set with 5 members (3 members also work), run the workload, stop 2 secondaries, and observe WiredTigerHS.wt and the oplog increase. A sketch of the steps is below. |
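A minimal sketch of those steps (the port and dbPath follow the runtime parameters above; the test collection name is a placeholder; authentication omitted):
# 1. With the replica set running, drive a steady write load against the primary
mongo --port 27717 --quiet --eval 'var c = db.getSiblingDB("test").repro; for (var i = 0; i < 1000000; i++) { c.insertOne({ i: i, pad: "x".repeat(4000) }); }'
# 2. Shut down 2 secondaries so that a majority of data-bearing members is lost
# 3. Watch the history store and oplog grow in the dbPath
watch -n 5 'ls -lh /var/data/sessions.1/e/WiredTigerHS.wt && du -sh /var/data/sessions.1/e'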