[SERVER-27546] Wiredtiger checkpoint is placed too often Created: 30/Dec/16  Updated: 27/Oct/23  Resolved: 06/Jan/17

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.2.11
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: 아나 하리 Assignee: Kelsey Schubert
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File PRIMARY-CheckpointRunning.png     PNG File SECONDARY-CheckpointRunning.png     File diagnostics.data.tar.gz    
Issue Links:
Duplicate
is duplicated by SERVER-28733 running a 3-member replicaset with --... Closed
Operating System: ALL
Participants:

 Description   

We have small data-set mongodb. All data is processed in memory.
Total data size is only about 10GB. User data size is only 40MB and oplog.rs is 10GB.
And wiredtiger cache size is 20GB. And all user documents are changed once in 30 seconds.
There's 50 ~ 300 queries / second of geowithin query only on primary.

The weired thing is primary's checkpoint is placed too often.
"transaction checkpoint currently running" metric is always 1 of "db.serverStatus().wiredTiger.transaction. Attached graphs are "transaction checkpoint currently running" metric's.

PRIMARY> db.serverStatus().wiredTiger.transaction
{
	"number of named snapshots created" : 0,
	"number of named snapshots dropped" : 0,
	"transaction begins" : 1466266063,
	"transaction checkpoint currently running" : 1,
	"transaction checkpoint generation" : 89630,
	"transaction checkpoint max time (msecs)" : 6884,
	"transaction checkpoint min time (msecs)" : 399,
	"transaction checkpoint most recent time (msecs)" : 2220,
	"transaction checkpoint scrub dirty target" : 0,
	"transaction checkpoint scrub time (msecs)" : 0,
	"transaction checkpoint total time (msecs)" : 163694262,
	"transaction checkpoints" : 89630,
	"transaction checkpoints skipped because database was clean" : 1,
	"transaction failures due to cache overflow" : 0,
	"transaction fsync calls for checkpoint after allocating the transaction ID" : 89629,
	"transaction fsync duration for checkpoint after allocating the transaction ID (usecs)" : 7725,
	"transaction range of IDs currently pinned" : 648,
	"transaction range of IDs currently pinned by a checkpoint" : 2295,
	"transaction range of IDs currently pinned by named snapshots" : 0,
	"transaction sync calls" : 0,
	"transactions committed" : 506988831,
	"transactions rolled back" : 958130556
}
 
SECONDARY> db.serverStatus().wiredTiger.transaction
{
	"number of named snapshots created" : 0,
	"number of named snapshots dropped" : 0,
	"transaction begins" : 786350822,
	"transaction checkpoint currently running" : 0,
	"transaction checkpoint generation" : 2574,
	"transaction checkpoint max time (msecs)" : 4007,
	"transaction checkpoint min time (msecs)" : 527,
	"transaction checkpoint most recent time (msecs)" : 3104,
	"transaction checkpoint scrub dirty target" : 0,
	"transaction checkpoint scrub time (msecs)" : 0,
	"transaction checkpoint total time (msecs)" : 8154109,
	"transaction checkpoints" : 2574,
	"transaction checkpoints skipped because database was clean" : 0,
	"transaction failures due to cache overflow" : 0,
	"transaction fsync calls for checkpoint after allocating the transaction ID" : 2574,
	"transaction fsync duration for checkpoint after allocating the transaction ID (usecs)" : 32661,
	"transaction range of IDs currently pinned" : 0,
	"transaction range of IDs currently pinned by a checkpoint" : 0,
	"transaction range of IDs currently pinned by named snapshots" : 0,
	"transaction sync calls" : 0,
	"transactions committed" : 809047207,
	"transactions rolled back" : 454837
}

And disk writes are happend every 2 seconds on Primary but not secondary.

PRIMARY
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00    24.00    0.00  417.00     0.00 18656.50    89.48     0.26    0.63    0.00    0.63   0.02   0.90
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    23.00    0.00  414.00     0.00 18783.00    90.74     0.36    0.87    0.00    0.87   0.02   0.90
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    23.00    0.00  408.00     0.00 18630.00    91.32     0.30    0.74    0.00    0.74   0.02   0.70
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    19.00    0.00  415.00     0.00 19192.50    92.49     0.31    0.75    0.00    0.75   0.02   0.80
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    28.00    0.00  399.00     0.00 18652.50    93.50     0.30    0.74    0.00    0.74   0.03   1.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    19.00    0.00  395.00     0.00 18685.00    94.61     0.30    0.75    0.00    0.75   0.02   0.70
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    27.00    0.00  399.00     0.00 18653.00    93.50     0.32    0.79    0.00    0.79   0.02   0.70
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    20.00    0.00  395.00     0.00 18951.00    95.95     0.31    0.78    0.00    0.78   0.02   0.80
sda               0.00     1.00    0.00    7.00     0.00    39.50    11.29     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    28.00    0.00  408.00     0.00 18655.50    91.45     0.32    0.79    0.00    0.79   0.02   0.80
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    19.00    0.00  414.00     0.00 18833.50    90.98     0.27    0.64    0.00    0.64   0.02   0.80
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     4.00    0.00   91.00     0.00  1468.00    32.26     0.03    0.29    0.00    0.29   0.01   0.10
sda               0.00    26.00    0.00  310.00     0.00 17557.00   113.27     0.25    0.81    0.00    0.81   0.02   0.70
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    18.00    0.00  400.00     0.00 18834.00    94.17     0.29    0.72    0.00    0.72   0.02   0.90
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    27.00    0.00  392.00     0.00 18665.00    95.23     0.32    0.82    0.00    0.82   0.02   0.80
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    18.00    0.00  397.00     0.00 18595.00    93.68     0.27    0.67    0.00    0.67   0.03   1.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    27.00    0.00  386.00     0.00 18501.00    95.86     0.30    0.77    0.00    0.77   0.02   0.80
sda               0.00     1.00    0.00    7.00     0.00    32.00     9.14     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    19.00    0.00  427.00     0.00 18889.50    88.48     0.30    0.70    0.00    0.70   0.02   0.90
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    26.00    0.00  413.00     0.00 18801.00    91.05     0.31    0.75    0.00    0.75   0.02   0.80
sda               0.00     0.00    0.00    2.00     0.00    16.00    16.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    24.00    0.00  450.00     0.00 19001.00    84.45     0.38    0.84    0.00    0.84   0.02   0.90
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
 
SECONDARY
Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    4.00     0.00    28.00    14.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     1.00    0.00    3.00     0.00    10.00     6.67     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    1.00     0.00     6.50    13.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    2.00     0.00     8.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00    23.00    0.00 2396.00     0.00 61823.00    51.61     8.10    3.38    0.00    3.38   0.01   3.20
sda               0.00     0.00    0.00    2.00     0.00     5.00     5.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    1.00     0.00     4.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00



 Comments   
Comment by Kelsey Schubert [ 11/Jan/17 ]

Hi matt.lee,

It's just multiple write threads. There is no accumulated write operation count that is considered. Please take a look at the relevant source code for more details:

https://github.com/mongodb/mongo/blob/6471618952c8727bc5b06039ed2cf861e1a36436/src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.h#L188
https://github.com/mongodb/mongo/blob/6471618952c8727bc5b06039ed2cf861e1a36436/src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp#L178
https://github.com/mongodb/mongo/blob/r3.2.11/src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp#L1330-L1335

For the workload you're describing, I would recommend considering inMemory storage engine included in MongoDB Enterprise.

Kind regards,
Thomas

Comment by 아나 하리 [ 09/Jan/17 ]

Hi Thomas.

These data is kind of volatile. Not so much valuable. so I don't want to use resources for journal log of wiredtiger.
All data of this MongoDB is expired just after 2 minutes. (So we don't need to flush these data to disk actually).

Frequent wiredtiger checkpoint is something to do with accumulated write operation count ? or just multiple write threads ?
Is there any rule(for example, checkpoint after each 1000 write operations) to do checkpoint when wt journal is disabled ?

Regards,
Matt.

Comment by Kelsey Schubert [ 06/Jan/17 ]

Hi matt.lee,

Thanks for the providing the diagnostic.data. I see that you are running your replica set with journaling disabled. In this case, it is expected that the primary node executes a checkpoint whenever there are multiple write threads as this checkpoint ensures that writes to the replica set are durable. This behavior explains the many the checkpoints you are observing.

Would you clarify why you choose to disable journaling? I'm curious about your use case.

Thank you,
Thomas

Comment by 아나 하리 [ 30/Dec/16 ]

Hi Dan.

diagnostics.data files are already attached. look "PRIMARY-...." files in "diagnostics.data.tar.gz" on Attachmems.

Comment by Daniel Pasette (Inactive) [ 30/Dec/16 ]

Can you include the diagnostic.data files for this instance?

Generated at Thu Feb 08 04:15:28 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.