[SERVER-8436] Aggregation pipeline cannot process system.profile doc fields with $ prefixes Created: 01/Feb/13 Updated: 06/Dec/22 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Aggregation Framework |
| Affects Version/s: | 2.3.2 |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Minor - P4 |
| Reporter: | Jeremy Mikola | Assignee: | Backlog - Query Optimization |
| Resolution: | Unresolved | Votes: | 3 |
| Labels: | query-44-grooming, usability | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||
| Assigned Teams: |
Query Optimization
|
||||||||||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||||||||||||||||||
| Description |
|
Due to strict FieldPath validation, aggregation pipelines cannot process fields that start with a $ character (commonly found in the system.profile collection). Steps to reproduce (copied here so code formatting works):
Exception:
|
| Comments |
| Comment by Asya Kamsky [ 04/Aug/17 ] |
|
Starting with 3.4.4 it's possible to work around this limitation in aggregation language with new $objectToArray and $arrayToObject expressions. There are examples of such syntax in this comment and also this comment |
| Comment by Craig Wilson [ 15/Jul/13 ] |
|
That's very interesting. It seems like the crux of the problem here is that you are doing an upsert and the server is adding in fields meant to constrain (MLSStatus, ListPrice, _keywords) because if thinks they are supposed to be in the document. You need to be providing default values for things like MLSStatus in the update document in order for them to be written appropriately. For instance, what is the ListPrice for the document if it ends up getting inserted? What is the MLSStatus if the document is inserted? In server 2.4, we added the $setOnInsert update operator that allows you to provide default values for these. Unfortunately, server 2.3.2 doesn't have this support. My suggestion to you is to not do an upsert here and instead get back 0 documents. When this happens, you'll need to do an insert manually and then proceed on as you were before. |
| Comment by John Cole [ 15/Jul/13 ] |
|
Here is the resulting object from the database: { , , , , , , }, |
| Comment by Craig Wilson [ 15/Jul/13 ] |
|
This is definitely a server issue in regards to the system.profile collection. To query this, you likely need to read out of the profile collection, sanitize the dollars signs to something else and then save them to another collection. MongoDefaults is a static class where you can do some overriding. You can also change some of the settings at the client, database, or collection level by using MongoClientSettings, MongoDatabaseSettings, or MongoCollectionSettings respectively. |
| Comment by John Cole [ 15/Jul/13 ] |
|
How can we override the defaults, that might come in handy. We aren't doing anything special, just using FindAndModify. Here is the code that performs the upsert: return Collection.FindAndModify(query, sortBy, update, returnNew, upsert); Collection is a private property: private MongoCollection<T> Collection } query = { "query" : { "sys" : "gcar", "MLSStatus" : { "$in" : ["Active"] }, "hide" : { "$nin" : [1] }, "OfficeExclusive" : { "$nin" : [1] }, "ListPrice" : { "$gte" : 172000, "$lte" : 182900 }, "_keywords" : { "$all" : ["foobar2"] }, "hideaddress" : { "$ne" : 1 }} } and update = { "$set" : { "form" : { "ListPriceMinVal" : 172000, "ListPriceMaxVal" : 182900, "BathsTotalMinVal" : 0, "BedsTotalMinVal" : 0, "Location" : "foobar2" }, "hits" : 1, "ts" : ISODate("2013-07-15T15:05:38.211Z") } } upsert = true sortBy = null _dburl in development is mongodb://localhost/sprint1309?w=1 T for this object is: public class SearchLog public BsonDocument query { get; set; } public QuickSearch form { get; set; }public DateTime ts { get; set; } public int hits { get; set; }} c# driver 1.8.0 As I mentioned, this works very well for us and we like it, we just can't use an agg pipeline on it. And while it doesn't address the OP about using agg pipelines against system.profile (which we patterned our query storage collection after), it would be nice to have a suggestion to how we should store query documents in a document. If it's a bug, please don't "fix" it quickly. We will need time to find another way to store queries. |
| Comment by Craig Wilson [ 15/Jul/13 ] |
|
John, Thanks, |
| Comment by John Cole [ 11/Jul/13 ] |
|
Dan, We understand that having field names starting with '$' is not supposed to be allowed, but because it currently is possible with 2.4.3 and the C# 1.8.0 driver, we have been doing this, and everything except aggregation framework has worked fine. Even matching against fields starting with '$' currently works. Because of the problems brought up with the OP, we are looking at what it will take to escape our saved queries so fields do not start with a '$', and we expect this to address our issue running a aggregation pipeline against collections with field names starting with '$'. However, this will not address the OP issue, which we have also run into, of running a agg pipeline against the system.profile collection. To address this, allowing the $Project operator to work on fields starting with '$' would be a nice way of allowing agg pipelines to work with the system.profile collection. It would also work if the system.profile collection escaped the fields staring with '$', as we are going to have to do on our collections in the future. J |
| Comment by Daniel Pasette (Inactive) [ 11/Jul/13 ] |
|
Hi jcole@solidearth.com, from your first comment I understood you were saying that you are inserting fields starting with '$' directly, but it appears you are just repeating the description of the issue as reported by jmikola@gmail.com that you cannot run aggregations on the profiling data. This constraint cannot be relaxed as the '$' is reserved for operators. For the moment the only workaround I know of is to pull the data out and transform it client side. |
| Comment by John Woakes [ 10/Jul/13 ] |
|
The problem is that the profile data Mongo stores in the system.profile collection does have fields with names that start with $ because they are storing commands that have been executed and some of them start with a $. For example $exists looks like "query" : { "somefield" : { "$exists" : true }} in the profile collection. The aggregation pipeline cannot process this entry. |
| Comment by John Cole [ 10/Jul/13 ] |
|
Were using the C# 1.8.0 driver. We are storing search queries for our saved search and collections features, and mimicking the system.profile seemed like a good idea as it allowed us to quickly save and recover a query. What is 10gen's suggested method for storing queries in a document? |
| Comment by Daniel Pasette (Inactive) [ 10/Jul/13 ] |
|
jcole@solidearth.com, you shouldn't normally be able to store a field name starting with a '$' from most drivers. |
| Comment by John Cole [ 09/Jul/13 ] |
|
With 2.4.3 you can save a subfield starting with a $. For example: { } However you cannot process this in the aggregation framework. The related issue was fixed in 2.2.0-rc1 so this seems to be valid currently. We are storing our query documents and would like to run statistics on them, but because the Field Path check is failing. As a minimum, it would be nice if the $project task allowed you to rename a column so it could be used by the rest of the pipeline. |