[SERVER-27693] dbcommands to access ftdc archive files Created: 16/Jan/17  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Diagnostics
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor - P4
Reporter: Attila Tozser Assignee: Backlog - Query Optimization
Resolution: Unresolved Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to COMPASS-1661 Build a Compass plugin to view FTDC (... Closed
Assigned Teams:
Query Optimization
Backwards Compatibility: Fully Compatible
Participants:

 Description   

Hi,

The diagnostic.data directory contains data files full of useful information, so it would be nice to have a way built into the server to read them.

I wrote a simple interface utilizing two commands:

getDiagnosticDataFiles

set1:PRIMARY> db.adminCommand({"getDiagnosticDataFiles":1})
{
	"data" : [
		"metrics.2017-01-14T18-40-10Z-00000",
		"metrics.2017-01-15T13-52-41Z-00000",
		"metrics.interim"
	],
	"ok" : 1
}
set1:PRIMARY> 

To list the available files and

getDiagnosticDataFromFile

set1:PRIMARY> db.adminCommand({"getDiagnosticDataFromFile":1,
limit:0,
skip:0,
showOutput:false,
startDate: ISODate("2017-01-11T10:02:00.024Z"),
endDate: ISODate("2017-01-11T11:02:29.024Z"), 
filename:"metrics.2017-01-11T09-02-29Z-00000"})
{
	"numDocumentsRead" : 54151,
	"numDocumentsMatched" : 3629,
	"data" : [ ],
	"startDateFilter" : ISODate("2017-01-11T10:02:00.024Z"),
	"endDateFilter" : ISODate("2017-01-11T11:02:29.024Z"),
	"skip" : NumberLong(0),
	"limit" : NumberLong(0),
	"ok" : 1
}

to read the content of a file with the parameters:

  • filename: the name of the archive file to read
  • skip, limit: the usual skip and limit; skip defaults to 0, limit defaults to 100
  • showOutput: set to false to turn off data output generation, for example to check the file size in records (see the limitation described later)
  • startDate: start date of data output generation; default: January 1, 1970 00:00:00
  • endDate: end date of data output generation; default: DATENOW
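
As an illustration of the parameter list above, here is a minimal sketch of a client-side helper that fills in the documented defaults before calling the command. The helper name buildDiagnosticDataCommand is hypothetical and not part of the pull request; showOutput is only sent when the caller sets it, because its default is not stated above.

```javascript
// Hypothetical helper (not part of the proposed patch): builds the command
// document for getDiagnosticDataFromFile using the defaults listed above.
function buildDiagnosticDataCommand(filename, options) {
    const opts = options || {};
    if (typeof filename !== "string" || filename.length === 0) {
        throw new Error("filename is required");
    }
    const cmd = {
        getDiagnosticDataFromFile: 1,
        filename: filename,
        // skip defaults to 0, limit defaults to 100.
        skip: opts.skip !== undefined ? opts.skip : 0,
        limit: opts.limit !== undefined ? opts.limit : 100,
        // startDate defaults to the epoch, endDate to the current time.
        startDate: opts.startDate !== undefined ? opts.startDate : new Date(0),
        endDate: opts.endDate !== undefined ? opts.endDate : new Date(),
    };
    // Only pass showOutput through when explicitly requested (assumption:
    // the server applies its own default otherwise).
    if (opts.showOutput !== undefined) {
        cmd.showOutput = opts.showOutput;
    }
    return cmd;
}
```

In the mongo shell this could be used as db.adminCommand(buildDiagnosticDataCommand("metrics.interim", {limit: 10})).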

I will create a pull request shortly and would appreciate your suggestions on what to change in the implementation. I reused the structure of the getDiagnosticData command, together with the ftdc_test implementation, to list the directory content and to parse the FTDC archive file.

Output generation is subject to the generic 64MB limit, which corresponds to roughly 600-700 documents (FTDC entries).
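
One way a client could stay under that limit is to read a file in pages using skip and limit. The following is only a sketch under assumptions: the helper name readDiagnosticDataInPages is hypothetical, it assumes skip and limit apply to the matched documents, and it assumes the reply's data array holds at most limit documents. The runCommand argument is any function that sends a command document and returns the reply, for example (cmd) => db.adminCommand(cmd) in the mongo shell.

```javascript
// Hypothetical helper: reads the documents of one FTDC archive page by page
// so that no single reply has to carry the whole file.
function readDiagnosticDataInPages(runCommand, filename, pageSize, onPage) {
    if (!Number.isInteger(pageSize) || pageSize <= 0) {
        throw new Error("pageSize must be a positive integer");
    }
    let skip = 0;
    let total = 0;
    for (;;) {
        const reply = runCommand({
            getDiagnosticDataFromFile: 1,
            filename: filename,
            skip: skip,
            limit: pageSize,
        });
        if (!reply || reply.ok !== 1) {
            throw new Error("getDiagnosticDataFromFile failed at skip=" + skip);
        }
        const docs = reply.data || [];
        if (docs.length > 0) {
            onPage(docs);
        }
        total += docs.length;
        // A short (or empty) page means there is nothing left to read.
        if (docs.length < pageSize) {
            return total;
        }
        skip += docs.length;
    }
}
```

A page size well below the 600-700 documents mentioned above, such as 100, leaves headroom for larger entries.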

The interface is also compatible with the 3.2 version of the diagnostic files.

Best,
Attila



 Comments   
Comment by Kelsey Schubert [ 08/Jan/18 ]

Hi atozser,

Thank you for the pull request, and sorry for the delay while we considered how best to fulfill this use case. We have decided to provide FTDC data as an aggregation stage rather than as an additional system command, both to avoid the size limit on returned data and to allow additional filtering via the familiar $match syntax. Once we have finalized the design of the exact syntax and options, we will post them here.

Kind regards,
Kelsey

Comment by Attila Tozser [ 07/Apr/17 ]

Hi,

Thanks for your time to look into this request.

The use case is simply the possibility to read the diagnostic data, which is there but not accessible. I think in most cases a request for the last 60-120 seconds would be enough, and even with the current implementation that would be sufficiently fast and harmless (you could add a note in the docs that this is not fast, and/or make it an optional experimental feature). A standalone tool would also be good, but it would offer the same functionality. (The ultimate solution would be an efficient time-series storage engine.) What would be the ideal solution, in your opinion, to make the diagnostic data readable?

Best,
Attila

Comment by Mark Benvenuto [ 07/Apr/17 ]

Since there have not been any updates to this ticket in a month, I am closing it. Feel free to reopen this ticket.

Comment by Mark Benvenuto [ 06/Mar/17 ]

atozser, what are the use cases you are trying to solve by adding commands to query all the FTDC data? Would the getDiagnosticData() command, which was added in 3.4 and returns the most recent FTDC record, be sufficient? Would a standalone tool to read the data better meet your needs?

The FTDCFileReader was never optimized for reading large FTDC files. For instance, it cannot seek into the file based on date, and it is designed to return all the data as BSON, which is less efficient than keeping it as an array of numbers.

Comment by Ramon Fernandez Marina [ 23/Feb/17 ]

Apologies for the radio silence, atozser. We're discussing internally whether this is something we want to include as part of the server. There could be other options, for example the ability to use the diagnostic data as an aggregation source. Mark (current assignee) will look at your pull request and get back to you.

Regards,
Ramón.

Comment by Attila Tozser [ 22/Feb/17 ]

Hi Ramon,

Do you need some updates from me?

Best,
Attila

Comment by Ramon Fernandez Marina [ 16/Jan/17 ]

Thanks atozser, I'm sending this to the Platform team for evaluation.

Generated at Thu Feb 08 04:15:55 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.