[SERVER-84927] Find out how often Atlas users run the planCacheListPlans and planCacheListQueryShapes commands Created: 01/Jul/19  Updated: 12/Jan/24  Resolved: 10/Jul/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: David Storch Assignee: David Storch
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Sprint: Query 2019-07-15
Participants:

 Comments   
Comment by Asya Kamsky [ 17/Jul/19 ]

Btw that host is from QualiaLabs.

This is the host that "caused" this: https://cloud.mongodb.com/v2/59029d58df9db16305479430#metrics/host/1710b058e52184a961305f51c6f9b3d0/status

Comment by Asya Kamsky [ 17/Jul/19 ]

Please note that the pings collection looks like it hasn't been updated since early May.

Comment by Daniel Pasette (Inactive) [ 10/Jul/19 ]

Thanks Dave. You should go ahead with the deprecation notice.

Comment by David Storch [ 10/Jul/19 ]

MongoDB Enterprise DataWarehouseProd-shard-0:PRIMARY> db.raw__dw__cloud__cloud_api__last_pings.aggregate([{$match: {"lastPingData.serverStatus.metrics.commands.planCacheListQueryShapes.total": {$gt: 0}}}, {$project: {total: "$lastPingData.serverStatus.metrics.commands.planCacheListQueryShapes.total", failed: "$lastPingData.serverStatus.metrics.commands.planCacheListQueryShapes.failed", hostId: 1}}, {$group: {_id: "$hostId", totalPings: {$sum: 1}, totalCommandCount: {$sum: "$total"}, maxCommandCount: {$max: "$total"}, totalFailedCount: {$sum: "$failed"}, maxFailedCount: {$max: "$failed"}}}, {$sort: {maxCommandCount: -1}}])
{ "_id" : "1710b058e52184a961305f51c6f9b3d0", "totalPings" : 13, "totalCommandCount" : 2925338, "maxCommandCount" : 225026, "totalFailedCount" : 0, "maxFailedCount" : 0 }
{ "_id" : "12510f59d1ca642c0bca82af1ce59618", "totalPings" : 39, "totalCommandCount" : 234, "maxCommandCount" : 6, "totalFailedCount" : 0, "maxFailedCount" : 0 }
{ "_id" : "274c694d81009d6f706bfaab575b6ccb", "totalPings" : 10, "totalCommandCount" : 42, "maxCommandCount" : 5, "totalFailedCount" : 0, "maxFailedCount" : 0 }
{ "_id" : "852325692afaa897bb2e986ff6417cd7", "totalPings" : 6, "totalCommandCount" : 18, "maxCommandCount" : 3, "totalFailedCount" : 0, "maxFailedCount" : 0 }
{ "_id" : "8663ab57706a83e7dd955123bb49620a", "totalPings" : 26, "totalCommandCount" : 78, "maxCommandCount" : 3, "totalFailedCount" : 0, "maxFailedCount" : 0 }
{ "_id" : "cc1a5b79903b5df6b4a9a3e3aecd102c", "totalPings" : 2, "totalCommandCount" : 5, "maxCommandCount" : 3, "totalFailedCount" : 0, "maxFailedCount" : 0 }
{ "_id" : "ce6bac0980a4cd7d0b7b5d3111e0e781", "totalPings" : 8, "totalCommandCount" : 16, "maxCommandCount" : 2, "totalFailedCount" : 0, "maxFailedCount" : 0 }
{ "_id" : "0f2b2d86453e8cb363f1fc4afc5ff27d", "totalPings" : 19, "totalCommandCount" : 38, "maxCommandCount" : 2, "totalFailedCount" : 0, "maxFailedCount" : 0 }
{ "_id" : "c60d30f122f5f99be3eaa675c6f2fe6f", "totalPings" : 44, "totalCommandCount" : 44, "maxCommandCount" : 1, "totalFailedCount" : 0, "maxFailedCount" : 0 }
{ "_id" : "9770af1b436c9e275d15c95ae4b83db4", "totalPings" : 1, "totalCommandCount" : 1, "maxCommandCount" : 1, "totalFailedCount" : 0, "maxFailedCount" : 0 }

This shows that we know of 10 hosts in the Atlas ping data which ran planCacheListQueryShapes. Nine of them only ran the command a handful of times. Note that totalCommandCount is the sum of the count across pings, so individual invocations of the command are likely double-counted. Therefore, maxCommandCount may be a better indication of how many times the command was actually run against each host. This is likely explained by DBAs or administrators manually running the command to investigate plan cache behavior. There is one host, however, against which planCacheListQueryShapes ran more than 225,000 times. See hostId 1710b058e52184a961305f51c6f9b3d0. Is it possible, or even worthwhile, to figure out what that user is doing?

Here is the same analysis for the planCacheListPlans command:

MongoDB Enterprise DataWarehouseProd-shard-0:PRIMARY> db.raw__dw__cloud__cloud_api__last_pings.aggregate([{$match: {"lastPingData.serverStatus.metrics.commands.planCacheListPlans.total": {$gt: 0}}}, {$project: {total: "$lastPingData.serverStatus.metrics.commands.planCacheListPlans.total", failed: "$lastPingData.serverStatus.metrics.commands.planCacheListPlans.failed", hostId: 1}}, {$group: {_id: "$hostId", totalPings: {$sum: 1}, totalCommandCount: {$sum: "$total"}, maxCommandCount: {$max: "$total"}, totalFailedCount: {$sum: "$failed"}, maxFailedCount: {$max: "$failed"}}}, {$sort: {maxCommandCount: -1}}])
{ "_id" : "1710b058e52184a961305f51c6f9b3d0", "totalPings" : 13, "totalCommandCount" : 2797041, "maxCommandCount" : 215157, "totalFailedCount" : 0, "maxFailedCount" : 0 }
{ "_id" : "c60d30f122f5f99be3eaa675c6f2fe6f", "totalPings" : 9, "totalCommandCount" : 54, "maxCommandCount" : 6, "totalFailedCount" : 0, "maxFailedCount" : 0 }
{ "_id" : "274c694d81009d6f706bfaab575b6ccb", "totalPings" : 10, "totalCommandCount" : 50, "maxCommandCount" : 5, "totalFailedCount" : 0, "maxFailedCount" : 0 }
{ "_id" : "9770af1b436c9e275d15c95ae4b83db4", "totalPings" : 1, "totalCommandCount" : 2, "maxCommandCount" : 2, "totalFailedCount" : 0, "maxFailedCount" : 0 }
{ "_id" : "d2288d9aa850cd94a548a9ffa6062059", "totalPings" : 38, "totalCommandCount" : 38, "maxCommandCount" : 1, "totalFailedCount" : 0, "maxFailedCount" : 0 }

The situation is similar. The same host which runs planCacheListQueryShapes a lot also runs planCacheListPlans a lot.

We only have 13 individual pings from host "1710b058e52184a961305f51c6f9b3d0". Over the course of these 13 pings, neither planCacheListQueryShapes nor planCacheListPlans are actively being executed:

MongoDB Enterprise DataWarehouseProd-shard-0:PRIMARY> db.raw__dw__cloud__cloud_api__last_pings.aggregate([{$match: {hostId: "1710b058e52184a961305f51c6f9b3d0", $or: [{"lastPingData.serverStatus.metrics.commands.planCacheListPlans.total": {$gt: 0}}, {"lastPingData.serverStatus.metrics.commands.planCacheListQueryShapes.total": {$gt: 0}}]}}, {$project: {"lastPingData.serverStatus.metrics.commands.planCacheListPlans": 1, "lastPingData.serverStatus.metrics.commands.planCacheListQueryShapes": 1, lastPingTimestamp: 1, hostId: 1, groupId: 1}}, {$sort: {lastPingTimestamp: 1}}])
{ "_id" : ObjectId("5cb0f2659750070012a2529f"), "lastPingTimestamp" : NumberLong("1554923957000"), "lastPingData" : { "serverStatus" : { "metrics" : { "commands" : { "planCacheListPlans" : { "total" : 215157, "failed" : 0 }, "planCacheListQueryShapes" : { "total" : 225026, "failed" : 0 } } } } }, "groupId" : "59029d58df9db16305479430", "hostId" : "1710b058e52184a961305f51c6f9b3d0" }
{ "_id" : ObjectId("5cb18b14a6208c0010957325"), "lastPingTimestamp" : NumberLong("1555113983000"), "lastPingData" : { "serverStatus" : { "metrics" : { "commands" : { "planCacheListPlans" : { "total" : 215157, "failed" : 0 }, "planCacheListQueryShapes" : { "total" : 225026, "failed" : 0 } } } } }, "groupId" : "59029d58df9db16305479430", "hostId" : "1710b058e52184a961305f51c6f9b3d0" }
{ "_id" : ObjectId("5cb28f9ebcb54d000e0f29b6"), "lastPingTimestamp" : NumberLong("1555200333000"), "lastPingData" : { "serverStatus" : { "metrics" : { "commands" : { "planCacheListPlans" : { "total" : 215157, "failed" : 0 }, "planCacheListQueryShapes" : { "total" : 225026, "failed" : 0 } } } } }, "groupId" : "59029d58df9db16305479430", "hostId" : "1710b058e52184a961305f51c6f9b3d0" }
{ "_id" : ObjectId("5cb415fe431645000f2c5c8a"), "lastPingTimestamp" : NumberLong("1555286738000"), "lastPingData" : { "serverStatus" : { "metrics" : { "commands" : { "planCacheListPlans" : { "total" : 215157, "failed" : 0 }, "planCacheListQueryShapes" : { "total" : 225026, "failed" : 0 } } } } }, "groupId" : "59029d58df9db16305479430", "hostId" : "1710b058e52184a961305f51c6f9b3d0" }
{ "_id" : ObjectId("5cb5e2140d6e85000ffe4660"), "lastPingTimestamp" : NumberLong("1555372844000"), "lastPingData" : { "serverStatus" : { "metrics" : { "commands" : { "planCacheListPlans" : { "total" : 215157, "failed" : 0 }, "planCacheListQueryShapes" : { "total" : 225026, "failed" : 0 } } } } }, "groupId" : "59029d58df9db16305479430", "hostId" : "1710b058e52184a961305f51c6f9b3d0" }
{ "_id" : ObjectId("5cb7021168fa2a001378f751"), "lastPingTimestamp" : NumberLong("1555459359000"), "lastPingData" : { "serverStatus" : { "metrics" : { "commands" : { "planCacheListPlans" : { "total" : 215157, "failed" : 0 }, "planCacheListQueryShapes" : { "total" : 225026, "failed" : 0 } } } } }, "groupId" : "59029d58df9db16305479430", "hostId" : "1710b058e52184a961305f51c6f9b3d0" }
{ "_id" : ObjectId("5cb87f8db0249e0010179367"), "lastPingTimestamp" : NumberLong("1555545819000"), "lastPingData" : { "serverStatus" : { "metrics" : { "commands" : { "planCacheListPlans" : { "total" : 215157, "failed" : 0 }, "planCacheListQueryShapes" : { "total" : 225026, "failed" : 0 } } } } }, "groupId" : "59029d58df9db16305479430", "hostId" : "1710b058e52184a961305f51c6f9b3d0" }
{ "_id" : ObjectId("5cb946e3793e61000fe5c429"), "lastPingTimestamp" : NumberLong("1555632280000"), "lastPingData" : { "serverStatus" : { "metrics" : { "commands" : { "planCacheListPlans" : { "total" : 215157, "failed" : 0 }, "planCacheListQueryShapes" : { "total" : 225026, "failed" : 0 } } } } }, "groupId" : "59029d58df9db16305479430", "hostId" : "1710b058e52184a961305f51c6f9b3d0" }
{ "_id" : ObjectId("5cbaccb4caaae30010ce6ddf"), "lastPingTimestamp" : NumberLong("1555718631000"), "lastPingData" : { "serverStatus" : { "metrics" : { "commands" : { "planCacheListPlans" : { "total" : 215157, "failed" : 0 }, "planCacheListQueryShapes" : { "total" : 225026, "failed" : 0 } } } } }, "groupId" : "59029d58df9db16305479430", "hostId" : "1710b058e52184a961305f51c6f9b3d0" }
{ "_id" : ObjectId("5cbc46d4c4de6a0012fce2b0"), "lastPingTimestamp" : NumberLong("1555804981000"), "lastPingData" : { "serverStatus" : { "metrics" : { "commands" : { "planCacheListPlans" : { "total" : 215157, "failed" : 0 }, "planCacheListQueryShapes" : { "total" : 225026, "failed" : 0 } } } } }, "groupId" : "59029d58df9db16305479430", "hostId" : "1710b058e52184a961305f51c6f9b3d0" }
{ "_id" : ObjectId("5cbd37c6bc8b5200104ea74e"), "lastPingTimestamp" : NumberLong("1555891441000"), "lastPingData" : { "serverStatus" : { "metrics" : { "commands" : { "planCacheListPlans" : { "total" : 215157, "failed" : 0 }, "planCacheListQueryShapes" : { "total" : 225026, "failed" : 0 } } } } }, "groupId" : "59029d58df9db16305479430", "hostId" : "1710b058e52184a961305f51c6f9b3d0" }
{ "_id" : ObjectId("5cbf327cce52da001378bcaf"), "lastPingTimestamp" : NumberLong("1555978287000"), "lastPingData" : { "serverStatus" : { "metrics" : { "commands" : { "planCacheListPlans" : { "total" : 215157, "failed" : 0 }, "planCacheListQueryShapes" : { "total" : 225026, "failed" : 0 } } } } }, "groupId" : "59029d58df9db16305479430", "hostId" : "1710b058e52184a961305f51c6f9b3d0" }
{ "_id" : ObjectId("5cbfbade9a582b000f7ff3f8"), "lastPingTimestamp" : NumberLong("1556064141000"), "lastPingData" : { "serverStatus" : { "metrics" : { "commands" : { "planCacheListPlans" : { "total" : 215157, "failed" : 0 }, "planCacheListQueryShapes" : { "total" : 225026, "failed" : 0 } } } } }, "groupId" : "59029d58df9db16305479430", "hostId" : "1710b058e52184a961305f51c6f9b3d0" }

asya pasette pawel.terlecki I believe this information is consistent with our plans to deprecate these two commands. I will proceed with putting the deprecation notice into the master branch and into 4.2, since the 4.2 code freeze for GA is looming! Please let me know if there is any further diligence you want me to do here. In the meantime, I will mark this ticket as "in review".

Comment by David Storch [ 02/Jul/19 ]

I was able to gain access to the data warehouse system through MANA, but I'm waiting on a request for a new index build. Marking as blocked on DW-1522.

Generated at Thu Feb 08 06:56:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.