[SERVER-15561] I want to apply notablescan to per DB or per COLLECTION on production. Created: 08/Oct/14  Updated: 07/Apr/23

Status: Backlog
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: None

Type: New Feature
Priority: Major - P3
Reporter: Hiroaki
Assignee: Backlog - Query Optimization
Resolution: Unresolved
Votes: 10
Labels: asya
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to SERVER-1143 Allow --notablescan to be specified p... Backlog
is related to SERVER-34127 Add user role action that will allow ... Closed
Assigned Teams:
Query Optimization

 Description   

WISH

I want to use notablescan on a production DB.
I want to apply notablescan per DB or per collection.

REASON

We can easily kill our mongod by sending a query on a non-indexed field to a collection of several hundred GB or more.
To make matters worse, we get the same result by specifying a non-existent field because of a simple typo.

The notablescan option can prevent these catastrophic incidents, especially on a production DB.

ADDITIONAL

However, it currently looks like the following sentence will be added to mongo-docs:

+ Don't run production :program:`mongod` instances with
+ :parameter:`notablescan` because preventing table scans can potentially
+ affect queries in all databases, including administrative queries.

https://github.com/mongodb/docs/commit/43a37686f53102e639a33d404e9f73f47d1729a6#diff-ee73e0a6a2ede9af5743e69b8fad4f80R128

I think this is the wrong policy for keeping our MongoDB systems safe.
On the contrary, I would like the notablescan option to become applicable per DB or per collection.
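
To illustrate the request, here is a minimal client-side sketch of a per-DB or per-collection policy. It is not an existing server feature. It assumes the explain output shape used by MongoDB 3.0 and later, where the plan is under `queryPlanner.winningPlan`, child stages are under `inputStage` or `inputStages`, and a table scan appears as a `COLLSCAN` stage. The names `noTableScanPolicy`, `planHasCollScan`, and `isScanForbidden`, and the namespaces in the policy, are hypothetical.

```javascript
// Hypothetical policy: namespaces ("db" or "db.collection") where table scans are forbidden.
const noTableScanPolicy = new Set(["shop", "logs.events"]);

// Walk a winningPlan tree and report whether any stage is a collection scan.
function planHasCollScan(stage) {
  if (!stage) return false;
  if (stage.stage === "COLLSCAN") return true;
  const children = [];
  if (stage.inputStage) children.push(stage.inputStage);
  if (Array.isArray(stage.inputStages)) children.push(...stage.inputStages);
  return children.some((child) => planHasCollScan(child));
}

// Decide whether a query must be rejected under the policy.
function isScanForbidden(dbName, collName, explainOutput, policy = noTableScanPolicy) {
  const covered = policy.has(dbName) || policy.has(`${dbName}.${collName}`);
  return covered && planHasCollScan(explainOutput.queryPlanner.winningPlan);
}

// Hand-written explain fragments standing in for real server output.
const typoPlan = { queryPlanner: { winningPlan: { stage: "COLLSCAN" } } };
const indexedPlan = {
  queryPlanner: { winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN" } } },
};

console.log(isScanForbidden("shop", "orders", typoPlan));    // true
console.log(isScanForbidden("shop", "orders", indexedPlan)); // false
console.log(isScanForbidden("test", "scratch", typoPlan));   // false
```

A check like this costs an extra explain round trip per query, which is one reason a server-side per-DB or per-collection setting would be preferable.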



 Comments   
Comment by Asya Kamsky [ 19/Jul/18 ]

Most related tickets talk about "typos" - generally that only happens in the shell during interactive sessions.  I opened SERVER-36196 to make it possible to determine in the shell whether it's being run in interactive mode - that would allow us to alter the shell helpers to either disallow or warn on attempts to query a large collection without an index.
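
A rough sketch of the decision such a shell helper could make is shown here. The function name, the size threshold, the `strict` switch, and the inputs (whether the session is interactive, the collection size in bytes, and whether the plan uses an index) are all assumptions for illustration, not the shell's actual behaviour.

```javascript
// Hypothetical threshold above which an unindexed query is considered risky.
const LARGE_COLLECTION_BYTES = 10 * 1024 ** 3; // 10 GiB, chosen arbitrarily

// Returns "allow", "warn", or "reject" for a query about to be sent.
// Non-interactive scripts are left alone, since typos are assumed to
// happen mainly in interactive sessions.
function unindexedQueryAction({ interactive, collectionBytes, usesIndex, strict = false }) {
  if (!interactive || usesIndex || collectionBytes < LARGE_COLLECTION_BYTES) {
    return "allow";
  }
  return strict ? "reject" : "warn";
}

const hugeCollection = 500 * 1024 ** 3; // roughly 500 GiB
console.log(unindexedQueryAction({ interactive: true, collectionBytes: hugeCollection, usesIndex: false }));
console.log(unindexedQueryAction({ interactive: true, collectionBytes: hugeCollection, usesIndex: false, strict: true }));
```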

 

Comment by Glen Miner [ 11/Jun/15 ]

As I mentioned in SERVER-1143 I think what would be best would be "notablescan by default" but you can opt out like rs.slaveOk() when you need to do something dangerous. This would prevent a large number of fat-finger fires.

Comment by Ramon Fernandez Marina [ 14/Oct/14 ]

Understood, thanks crumbjp. We're keeping this ticket open to consider your request for a future version of MongoDB.

Comment by Hiroaki [ 10/Oct/14 ]

I don't want to investigate why mongod was killed.

This issue is obviously not caused by a bug.
It is simply caused by (unintentionally) heavy load.

This is not specific to MongoDB; it holds for any system under high load.
Operating this kind of system is very tense.
So I want to reduce the risk posed by a simple typo.

Comment by Ramon Fernandez Marina [ 09/Oct/14 ]

Hi crumbjp,

without full logs my only guess is that mongod could be being killed by the OOM killer from the OS, but we'd like to make sure there are no other bugs lurking that may be causing this issue. If you could upload full logs from the mongod process from startup until one of your queries kills it that would help us investigate the issue. If it's easy for you to reproduce the scenario you describe, please increase the logLevel to 1 and upload the resulting logs from startup of the mongod process.

Additionally, we'd like to consider your suggestion so we're either going to keep this ticket open, or merge it with SERVER-1143 which suggests a similar enhancement.

Please let us know if you can upload the requested logs.

Thanks,
Ramón.

Comment by Hiroaki [ 08/Oct/14 ]

OS: Linux (CentOS, Ubuntu)
storage: SSD
memory: more than 60 GB
Data size: more than 500GB ~ ?TB

If I issue a full-scan query, the operation consumes a large amount of resources, evicts the hot data, and causes heavy thrashing.
As a result, all queries slow down, mongod becomes unresponsive, and it finally dies.
This whole sequence of events plays out within a few minutes.

I know that this is a harsh environment.
But I believe MongoDB is a product intended for exactly this kind of environment to begin with.

maxTimeMS is a good feature.
I already use it when I FEEL THE QUERY IS DANGEROUS.
But I don't feel like using it on all of my queries. (And it is difficult to predict how long a query will take before issuing it.)
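
One way to avoid deciding query by query is to attach a default time limit to every query and override it only where needed. The sketch below only builds a `find` command document carrying a `maxTimeMS` field, for server versions that support the `find` command. The helper name `buildFindCommand`, the 5-second default, and the sample collection and filters are assumptions; actually sending the command to the server is left out.

```javascript
// Hypothetical default limit applied when the caller does not choose one.
const DEFAULT_MAX_TIME_MS = 5000;

// Build a find command document that always carries a maxTimeMS limit.
function buildFindCommand(collection, filter, options = {}) {
  const { maxTimeMS = DEFAULT_MAX_TIME_MS, ...rest } = options;
  return { find: collection, filter, ...rest, maxTimeMS };
}

// Uses the default limit.
const quickCmd = buildFindCommand("orders", { userId: 42 });
// Overrides the limit for a query known to be slow.
const slowCmd = buildFindCommand("orders", { status: "open" }, { maxTimeMS: 60000, limit: 10 });

console.log(quickCmd.maxTimeMS); // 5000
console.log(slowCmd.maxTimeMS);  // 60000
```

A default limit bounds how long a mistaken query can run, but it does not stop the scan from starting, so it complements rather than replaces a notablescan-style guard.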

Comment by Ramon Fernandez Marina [ 08/Oct/14 ]

crumbjp, a tablescan should not kill mongod. If this happens to you it would be great if you could provide additional information to investigate if there's a bug somewhere. In particular:

  • What platform are you using? OS name and version, virtualization platform if any, memory size...
  • What kind of MongoDB deployment do you have? Are you using sharding? Replication?
  • Are you able to upload logs from the mongod node that becomes unavailable during a tablescan?
  • What does your schema look like? What type of queries make mongod unavailable?

In the meantime you may want to investigate the use of maxTimeMS and/or maxScan in your queries, which may help you work around this issue.

Looking forward to hearing more details so we can get to the bottom of this.

Regards,
Ramón.

Generated at Thu Feb 08 03:38:22 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.