[SERVER-18179] "top" reports inaccurate counts for multi-document operations Created: 22/Apr/15 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Diagnostics |
| Affects Version/s: | 3.0.2 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Adinoyi Omuya | Assignee: | Backlog - Query Execution |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | query-44-grooming | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Assigned Teams: |
Query Execution
|
| Operating System: | ALL |
| Steps To Reproduce: | Both mongorestore and mongod are version 3.0.2 and the restore was run against a standalone. Here's some data collected during this period: mongorestore - Ran:
mongostat - Ran:
top - Ran
|
| Participants: |
| Case: | (copied to CRM) |
| Description |
|
For example, while running a bulk insert operation with several documents, simultaneously running db.runCommand("top") on the admin database indicates only 1 insert is occurring - even though a number of documents are being inserted. Monitoring applications that rely on this count are therefore unable to report accurate counts when clients perform multi-document operations. |
| Comments |
| Comment by Ian Whalen (Inactive) [ 09/Mar/18 ] |

bernard.gorman to resolve this if fixed via outstanding work. |
| Comment by Asya Kamsky [ 03/Mar/18 ] |
|
Note that my previous statement:
is not accurate in a sharded cluster. Top (correctly, it seems) only counts write commands initiated by the client, i.e. it does not count migration inserts/deletes. The serverStatus command seems to count some system operations as writes, which causes the top and serverStatus counts for write commands to diverge in sharded clusters. However, there are scenarios where serverStatus is inconsistent and counts the documents inserted by a single insert as multiple inserts, so it seems more likely to me that top is consistent and serverStatus is not. It appears that mongostat benefits from this by showing documents inserted based on serverStatus opcounters, but technically it seems to be relying on a mistake. The fix should probably be to make mongostat use more correct/consistent metrics (and then serverStatus can maybe be fixed, though I'm guessing other things rely on it, so the "fix" would have to avoid breaking backwards compatibility). In fact, it looks like db.serverStatus().metrics.commands.insert is what's consistent with top, which I think proves that the inconsistency is mainly in the way mongostat uses serverStatus data. |
| Comment by Asya Kamsky [ 01/Mar/18 ] |
|
I did a bit of research, and the top command seems consistent: it always increments the counter once each time a command is run. So running an update command increments the counter once, whether it changed 0 documents or 100. An insert with multiple documents also increments the insert command counter once. This seems consistent, if maybe not what someone wants when they are comparing it to mongostat or to some counters from db.serverStatus. But it's the same as serverStatus opcounters, just not the same as serverStatus document counts. |
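The per-command versus per-document distinction described in this comment can be illustrated with a toy model. This is only a sketch: `recordInsert`, `commandCount`, and `documentCount` are hypothetical names made up for the illustration, and the counters are plain objects rather than anything taken from the server's implementation.

```javascript
// Toy model only: plain counters, not MongoDB internals.
// commandCount mimics a per-command counter (the behavior described for top);
// documentCount mimics a per-document counter.
function recordInsert(counters, docs) {
  counters.commandCount += 1;            // one increment per insert command
  counters.documentCount += docs.length; // one increment per inserted document
  return counters;
}

const counters = { commandCount: 0, documentCount: 0 };
recordInsert(counters, [{ a: 1 }, { a: 2 }, { a: 3 }]); // one bulk insert of 3 documents
recordInsert(counters, [{ a: 4 }]);                      // one single-document insert
console.log(counters); // { commandCount: 2, documentCount: 4 }
```

Under this model, a monitoring tool that reads only the per-command counter sees 2 inserts, while 4 documents were actually written.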
| Comment by Daniel Pasette (Inactive) [ 20/Mar/16 ] |

The integration team is already considering related issues. Adding to their backlog. |
| Comment by Andy Schwerin [ 04/Jun/15 ] |

A lot of the "top" command behavior is surprising. Some of it has probably not worked as originally intended since 2.2. I know that the lock information overcounts time spent write locked. adinoyi.omuya@10gen.com, let's sit down and talk about what the intended meaning of mongotop's output is, and adjust this ticket to account for all of the needed changes to the top command. |
| Comment by Adinoyi Omuya [ 22/Apr/15 ] |
|
I used both - it was with the wire protocol - specifically OP_INSERT - that I ran into this. You can easily reproduce this using mongorestore v3.0.2. The counts are accurate using the shell's bulk API though:
Before running:
|
| Comment by Scott Hernandez (Inactive) [ 22/Apr/15 ] |

Are you using write commands or the wire protocol? I believe you are seeing a by-product of how we process write commands with new internal operations for each item. |
| Comment by Adinoyi Omuya [ 22/Apr/15 ] |

When I run the serverStatus command, I get the actual count I expect - regardless of whether I use bulk inserts - which can be seen from the mongostat output I included (mongostat leverages the serverStatus command). This suggests that we actually keep track of the number of documents affected. So I'm unclear on why top behaves differently. |
| Comment by Scott Hernandez (Inactive) [ 22/Apr/15 ] |

I believe it monitors client-issued operations (a single multi-document insert command), not the documents affected, just like the server status opcounters. Anyone who wants to monitor document-based counters can use those server status metrics. |
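To compare the two kinds of counters around a workload, one option is to snapshot both commands before and after and then diff them. The helper below is a minimal sketch of that diff. `insertDeltas` and the `{ top, status }` snapshot wrapper are hypothetical names, the sample values are invented purely to exercise the function (they are not measurements from this ticket), and the code assumes the counter values are plain numbers. It reads `totals[<namespace>].insert.count` from the top result and `metrics.document.inserted` from the serverStatus result.

```javascript
// Sketch: how far did each insert counter move between two snapshots?
// Each snapshot is a hypothetical wrapper { top, status } holding the result
// documents of the "top" and "serverStatus" commands.
function insertDeltas(before, after, ns) {
  // Assumption: a namespace missing from top's totals is treated as zero.
  const topInserts = (snap) => {
    const entry = snap.top.totals[ns];
    return entry ? entry.insert.count : 0;
  };
  const docsInserted = (snap) => snap.status.metrics.document.inserted;
  return {
    topInsertCount: topInserts(after) - topInserts(before),
    documentsInserted: docsInserted(after) - docsInserted(before),
  };
}

// Invented numbers that only exercise the function; they are not measurements.
const before = {
  top: { totals: {} },
  status: { metrics: { document: { inserted: 0 } } },
};
const after = {
  top: { totals: { "test.coll": { insert: { time: 0, count: 1 } } } },
  status: { metrics: { document: { inserted: 100 } } },
};
const deltas = insertDeltas(before, after, "test.coll");
console.log(deltas); // { topInsertCount: 1, documentsInserted: 100 }
```

In a shell session the snapshots could be gathered with db.adminCommand("top") and db.serverStatus(). Note that metrics.document.inserted is server-wide, so the comparison is only meaningful when no other clients are writing.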