[CSHARP-984] The Queryable translation for Contains converts to a non working regular expression. Created: 03/Jun/14  Updated: 05/Apr/16  Resolved: 20/Jun/14

Status: Closed
Project: C# Driver
Component/s: API
Affects Version/s: 1.9
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Paul Reed Assignee: Robert Stam
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

windows 2013



 Description   

When using this form of C# code (simplified):

>> var collection = MongoDataConnection.GetCollection<T>(true);
>> IQueryable<T> qyb = collection.AsQueryable();
>> var cursor = qyb.Where( e=>e.Name.Contains(astring) ).AsQueryable();

The resulting mongo command is converted to:

>> collection.Find({Name:/astring/s})

which fails with "Invalid flags supplied to RegExp constructor 's'".
This was also consuming 99% CPU on our servers. This problem has only just materialised.
When I run this in the shell:

>> collection.Find({Name:/astring/s})

I still get the error. If I convert this to

>> collection.Find({Name:{$regex:"astring",$options:"s"}})

it works fine.

This is a HUGE issue for us.



 Comments   
Comment by Robert Stam [ 20/Jun/14 ]

We think everything is working as designed. Please feel free to reopen and we can continue the discussion if you would like or if new information comes to light. Thanks!

Comment by Paul Reed [ 13/Jun/14 ]

OK, I'll need to double-check. Let me run some tests again on Monday.
It may be a red herring.
Thanks

Comment by Paul Reed [ 13/Jun/14 ]

The table in question is only 18,000 docs and the server was stuck at high CPU for many minutes, so something was definitely amiss. High CPU was not seen anywhere else, and the failure was returned as an error immediately.

Comment by Robert Stam [ 13/Jun/14 ]

It has also been pointed out to me that this kind of regular expression query is always going to be CPU intensive because it can't use an index and must therefore do a full collection scan, and regular expression matching in general is CPU intensive. So perhaps the 99% CPU you were seeing is totally normal, given the query.
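
To make the indexing point concrete: an unanchored pattern such as the one generated for Contains can match anywhere in the string, so every document has to be tested. By contrast, the $regex documentation describes a case-sensitive pattern that starts with ^ followed by literal characters as a prefix expression, which the server can answer from an index range. The sketch below uses hypothetical helpers (escapeRegex, containsQuery, startsWithQuery), not driver code, to build both kinds of query document. The escaping step is an assumption about what a caller would want so that characters like "." are matched literally.

```javascript
// Hypothetical helpers for illustration; not part of any MongoDB driver.

// Escape regex metacharacters so the input text is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Unanchored pattern: may match anywhere in the value,
// so the server has to test every candidate document.
function containsQuery(field, text) {
  var query = {};
  query[field] = { $regex: escapeRegex(text), $options: "s" };
  return query;
}

// Anchored, case-sensitive prefix pattern: a candidate for an index range scan.
function startsWithQuery(field, text) {
  var query = {};
  query[field] = { $regex: "^" + escapeRegex(text) };
  return query;
}

var contains = containsQuery("Name", "acme");
var startsWith = startsWithQuery("Name", "acme");
```

If the search can be restricted to prefixes, the second form is the one worth trying against an indexed field. A true substring search stays a scan either way.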

Comment by Robert Stam [ 13/Jun/14 ]

I am unable to reproduce the error you are getting.

Here is the code I was using:

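using System;
using System.Linq;
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.Linq;

public class C
{
    public ObjectId Id { get; set; }
    public string s { get; set; }
}
 
public static void Main(string[] args)
{
    // Connect to a local server using the 1.x driver API.
    var client = new MongoClient("mongodb://localhost");
    var server = client.GetServer();
    var database = server.GetDatabase("test");
    var collection = database.GetCollection<C>("test");
 
    // Contains is translated to a regular expression query.
    var queryable = collection.AsQueryable().Where(c => c.s.Contains("acme"));

    // Print the translated query before running it.
    Console.WriteLine("query: {0}", ((MongoQueryable<C>)queryable).GetMongoQuery());

    // Enumerating the queryable executes the query on the server.
    foreach (var document in queryable)
    {
        Console.WriteLine(document.ToJson());
    }
    Console.ReadLine();
}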

And here is the output of running that code:

query: { "s" : /acme/s }
{ "_id" : ObjectId("539b303c9d80a74025e3a534"), "s" : "acme corp" }
{ "_id" : ObjectId("539b30469d80a74025e3a535"), "s" : "acmeblahcorp" }

The server is returning the expected documents and no error occurred.

I can of course reproduce the error in the shell, but that is just because JavaScript regular expressions don't support the "s" option:

> db.test.find({ s : /acme.*/s })
2014-06-13T14:42:28.079-0400 SyntaxError: Invalid flags supplied to RegExp constructor 's'
>
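
The shell error above is a SyntaxError raised while the JavaScript engine parses the regex literal, so the query is never sent. The operator form carries the pattern and options as plain strings, which means no JavaScript regex is compiled on the client. A minimal sketch, assuming a hypothetical helper toRegexQuery (not part of the shell or any driver):

```javascript
// Hypothetical helper for illustration; not part of the mongo shell or any driver.
// Builds the operator form of a regex query from plain strings.
function toRegexQuery(field, pattern, options) {
  var query = {};
  query[field] = { $regex: pattern, $options: options };
  return query;
}

// Same intent as { s : /acme.*/s }, but without a regex literal.
var query = toRegexQuery("s", "acme.*", "s");
```

The resulting document has the same shape as the { field : { $regex : "pattern", $options : "s" } } form from the server documentation.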

Comment by Paul Reed [ 13/Jun/14 ]

When looking at the query hitting the server, it appears to hang. It does contain the invalid query construct with the regex, although I'm not 100% sure of all that while on my phone. I will try to reproduce the issue in some test code on Monday to send on.

I have now worked around the issue, however. But certainly no data was returned before, only an error.

Comment by Robert Stam [ 13/Jun/14 ]

Reading the documentation linked to above more carefully, it is now unclear to me whether the alternate form is only required in the shell (to sidestep the limitation that JavaScript's native Regex implementation doesn't support the "s" flag), or whether the server requires this form as well.

It seems like as long as the driver has a way to construct a proper BSON regular expression value (and the C# driver does), then it should work at the server as well.

Are you actually getting any errors from the server when running this query? Or is the server returning the correct results, just that it's using 99% CPU to do so?

Comment by Robert Stam [ 13/Jun/14 ]

The server documentation for queries with regular expressions is here:

http://docs.mongodb.org/manual/reference/operator/query/regex/

According to this documentation we need to be using:

{ field : { $regex : "pattern", $options : "s" } }

instead of

{ field : /pattern/s }

Comment by Paul Reed [ 03/Jun/14 ]

Well, we have had the system up and running for a few months now, and this method forms part of a search which again has been functional for months. So either people just haven't been using it (I am sure they have) or I am being stupid somehow.

I am pretty sure that I have not had high CPU stuck before either.

I have built a workaround, but it does bloat the code somewhat. I understand that the regex problem is known within Mongo; I would hope that the C# driver could be adapted to work around it.

Kind Regards,

Paul Reed

Comment by Craig Wilson [ 03/Jun/14 ]

Sorry you are having some trouble. We see in the docs that this is documented, so we'll definitely get this fixed. You mentioned that "this problem has only just materialised." Would you mind expanding on this? What changed that is causing this to fail now when it wasn't failing before?

For now, I can provide you a workaround. You can build your query using the query builders and then inject it in, or simply use the query document with Find. It would look like the following:

var query = new QueryDocument
{
  { "Name", new BsonDocument 
    {
      { "$regex", "astring" },
      { "$options", "s" }
    }
  }
};
 
qyb.Where(e => query.Inject());
 
// or
 
collection.Find(query);

Generated at Wed Feb 07 21:38:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.