[SERVER-7549] document level stats Created: 05/Nov/12 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Admin, Storage |
| Affects Version/s: | 2.3.0 |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Minor - P4 |
| Reporter: | Matt Campbell | Assignee: | Backlog - Query Execution |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | document, stats | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: | |
| Assigned Teams: | Query Execution |
| Participants: | |
| Description |
|
Implement document-level stats similar to those already available at the db and collection levels. Currently there is no efficient way to obtain stats such as the size of a document without sending the document down the wire to the client and BSON-encoding it.

I suggest storing document stats as metadata beside each document in a collection, but returning that stats data only when requested, as shown in the following examples.

Return a summary (an aggregation of stats): db.col.findOne({}).stats();
Return documents with their stats embedded, using a flag on the find() operation: db.col.find({}, {stats:true});

As the examples show, this would be best implemented on the server cursor. I would suggest storing the stats metadata beside the documents on disk, as opposed to in a separate hash table or other data structure. This keeps retrieval of both documents and stats efficient and flexible, and keeps writes fast. |
| Comments |
| Comment by Eric Milkie [ 19/Feb/19 ] |
|
The aggregation pipeline could provide this sort of information with new operators. |
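As a rough illustration of this direction: server releases from 4.4 onward include a `$bsonSize` aggregation operator, which lets a pipeline total document sizes on the server without shipping the documents to the client. The sketch below only builds such a pipeline. The helper name `buildDocStatsPipeline` and the `tenantId` field are hypothetical, and this covers only data size, not the on-disk storageSize discussed elsewhere in this ticket.

```javascript
// Hypothetical helper (not part of any driver or shell API): builds a pipeline
// that counts matching documents and totals their BSON sizes on the server.
function buildDocStatsPipeline(filter) {
  return [
    { $match: filter },
    {
      $group: {
        _id: null,                                   // a single summary row
        count: { $sum: 1 },                          // number of matching documents
        dataSize: { $sum: { $bsonSize: "$$ROOT" } }  // total BSON bytes ($bsonSize needs 4.4+)
      }
    }
  ];
}

// Usage in the shell, assuming an illustrative `tenantId` field:
// db.col.aggregate(buildDocStatsPipeline({ tenantId: 42 }))
```

Building the pipeline as plain data keeps the sketch independent of any particular driver; the same array can be passed to aggregate() from the shell or from application code.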
| Comment by Matt Campbell [ 08/Nov/12 ] |
|
For both single and aggregated doc stats:
ns (the collection in which the document is stored; useful if you are passing objects around a system without being bound to a context, or if you have a wrapper)

For aggregated doc stats:
count (number of docs in the cursor / stats aggregate)

Possible ideas (not considered core):
count of keys per doc (could be top level only, or drill down into embedded docs)

RATIONALE:
ns - allows a document to travel through an application knowing its 'home' and having an identity
dataSize - useful for clients that are bandwidth-aware and want to know the size of a set of documents before choosing to pull them down the wire (i.e. think mobile or other bandwidth-constrained or resource-constrained devices). This would allow them to decide how much data to pull down.
storageSize - in multi-tenant environments this would allow us to quickly report the physical disk space used by a set of documents belonging to a client (contained in a single shared collection). E.g. for a multi-tenant collection of products, we would be able to quickly report the disk usage for that type of object to each user in a dashboard.
count - simple: the number of docs in the cursor |
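To make the requested shape concrete, here is a minimal client-side sketch of how per-document stats might roll up into the aggregated form. The field names ns, dataSize, storageSize and count come from the comment above; the function `summarizeDocStats`, the input shape and the sample values are hypothetical, and nothing here is an existing server API.

```javascript
// Hypothetical roll-up of per-document stats into one summary.
// Each input is assumed to look like { ns, dataSize, storageSize }.
function summarizeDocStats(perDocStats) {
  const summary = { ns: null, count: 0, dataSize: 0, storageSize: 0 };
  for (const s of perDocStats) {
    // Report ns only while every document comes from the same collection.
    if (summary.count === 0) summary.ns = s.ns;
    else if (summary.ns !== s.ns) summary.ns = null;
    summary.count += 1;
    summary.dataSize += s.dataSize;
    summary.storageSize += s.storageSize;
  }
  return summary;
}

// Example with made-up numbers for two documents in one collection.
const example = summarizeDocStats([
  { ns: "shop.products", dataSize: 120, storageSize: 128 },
  { ns: "shop.products", dataSize: 80, storageSize: 96 }
]);
```

In the multi-tenant case described above, a dashboard would feed in only the stats for one tenant's documents and display the resulting totals.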
| Comment by Eliot Horowitz (Inactive) [ 08/Nov/12 ] |
|
Besides size, what stats are you looking for? |