[SERVER-20364] Cursor is not closed when querying system.profile collection with clusterMonitor role Created: 10/Sep/15  Updated: 20/Apr/16  Resolved: 16/Sep/15

Status: Closed
Project: Core Server
Component/s: Querying, Security
Affects Version/s: None
Fix Version/s: 3.0.7, 3.1.9

Type: Bug Priority: Major - P3
Reporter: John Morales Assignee: Andreas Nilsson
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-9609 Ensure users can only call getMore on... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Completed:
Steps To Reproduce:

(It's not exactly clear which of these steps, if any, are optional; these are just the final steps I took. Reproduces 100% of the time.)
1.) Start a new replica set with auth.
2.) Add two users:
2a.) "admin" with roles: userAdminAnyDatabase, readWriteAnyDatabase, clusterAdmin, dbAdminAnyDatabase, clusterManager
2b.) "clustermonitor" with role "clusterMonitor" only.
3.) Enable the profiler at level 2 on the "test" database.
4.) As the "admin" user, perform some queries to populate the "system.profile" collection. (I had a few hundred entries.)
5.) Run the Go or Java reproducer (a minimal Java sketch follows below).
(I tried using just the shell, but I wasn't clear on how to perform the explicit killCursor call from the shell. If possible, I imagine that would work, too.)

Note that if you instead use the "admin" user, the cursor is closed.
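
For reference, a minimal sketch of a Java reproducer (hedged: the host, port, password, and database name are placeholders for your deployment; assumes the 3.x Java driver API):

import com.mongodb.MongoClient;
import com.mongodb.MongoClientURI;
import com.mongodb.client.MongoCursor;
import org.bson.Document;

public class ProfileCursorRepro {
    public static void main(String[] args) {
        // Placeholder host/port/password; point this at the replica set from step 1.
        MongoClient client = new MongoClient(new MongoClientURI(
                "mongodb://clustermonitor:<password>@localhost:27017/?authSource=admin"));
        try {
            MongoCursor<Document> cursor = client.getDatabase("test")
                    .getCollection("system.profile")
                    .find()
                    .sort(new Document("$natural", -1))
                    .limit(20)  // positive limit: the server keeps the cursor open
                    .iterator();
            while (cursor.hasNext()) {
                cursor.next();
            }
            // close() makes the driver send an explicit "kill cursors" request;
            // with the clusterMonitor-only user the server logs
            // "killcursors: found 0 of 1" and the cursor lingers until the
            // 10 minute timeout.
            cursor.close();
        } finally {
            client.close();
        }
    }
}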

Sprint: Security 9 (09/18/15)
Participants:

Description

When querying the "system.profile" collection as a user that has only the clusterMonitor role, closing the cursor appears to fail, "leaking" the cursor until the 10-minute timeout. Querying as a user with higher privileges, however, closes the cursor successfully.

Repro steps are above. I tried with both the mgo and Java drivers just to double-check.

Log level 2 output from a reproducer:

2015-09-10T18:31:50.300-0400 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:64007 #2199 (4 connections now open)
2015-09-10T18:31:50.307-0400 D COMMAND  [conn2199] run command admin.$cmd { buildinfo: 1 }
2015-09-10T18:31:50.308-0400 I COMMAND  [conn2199] command admin.$cmd command: buildInfo { buildinfo: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:977 locks:{} 0ms
2015-09-10T18:31:50.316-0400 D COMMAND  [conn2199] run command admin.$cmd { ismaster: 1 }
2015-09-10T18:31:50.316-0400 I COMMAND  [conn2199] command admin.$cmd command: isMaster { ismaster: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:465 locks:{} 0ms
2015-09-10T18:31:50.344-0400 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:64009 #2200 (5 connections now open)
2015-09-10T18:31:50.344-0400 D COMMAND  [conn2200] run command admin.$cmd { buildinfo: 1 }
2015-09-10T18:31:50.344-0400 I COMMAND  [conn2200] command admin.$cmd command: buildInfo { buildinfo: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:977 locks:{} 0ms
2015-09-10T18:31:50.357-0400 D COMMAND  [conn2200] run command admin.$cmd { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: BinData(0, 6E2C2C6E3D636C75737465726D6F6E69746F722C723D585E292A4C7B2164637D744B4C2965796D556548265C4677) }
2015-09-10T18:31:50.358-0400 D QUERY    [conn2200] Relevant index 0 is kp: { user: 1, db: 1 } io: { v: 1, unique: true, key: { user: 1, db: 1 }, name: "user_1_db_1", ns: "admin.system.users" }
2015-09-10T18:31:50.358-0400 D QUERY    [conn2200] Only one plan is available; it will be run but will not be cached. query: { user: "clustermonitor", db: "admin" } sort: {} projection: {} skip: 0 limit: 0, planSummary: IXSCAN { user: 1, db: 1 }
2015-09-10T18:31:50.358-0400 D STORAGE  [conn2200] WT begin_transaction
2015-09-10T18:31:50.358-0400 D STORAGE  [conn2200] WT rollback_transaction
2015-09-10T18:31:50.362-0400 I COMMAND  [conn2200] command admin.$cmd command: saslStart { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: BinData(0, 6E2C2C6E3D636C75737465726D6F6E69746F722C723D585E292A4C7B2164637D744B4C2965796D556548265C4677) } keyUpdates:0 writeConflicts:0 numYields:0 reslen:171 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } 4ms
2015-09-10T18:31:50.362-0400 D STORAGE  [conn2200] WT begin_transaction
2015-09-10T18:31:50.363-0400 D STORAGE  [conn2200] WiredTigerRecoveryUnit::beingReleased
2015-09-10T18:31:50.363-0400 D STORAGE  [conn2200] WiredTigerRecoveryUnit::broughtBack
2015-09-10T18:31:50.363-0400 D STORAGE  [conn2200] WT begin_transaction
2015-09-10T18:31:50.363-0400 D STORAGE  [conn2200] WT commit_transaction
2015-09-10T18:31:50.363-0400 D STORAGE  [conn2200] WiredTigerRecoveryUnit::beingReleased
2015-09-10T18:31:50.363-0400 D STORAGE  [conn2200] WiredTigerRecoveryUnit::broughtBack
2015-09-10T18:31:50.363-0400 D STORAGE  [conn2200] WT commit_transaction
2015-09-10T18:31:50.582-0400 D COMMAND  [conn2200] run command admin.$cmd { saslContinue: 1, conversationId: 1, payload: BinData(0, 633D626977732C723D585E292A4C7B2164637D744B4C2965796D556548265C46772B5264594D65677343656D6871664D6841425237646A2F46556573484735505A2C703D576B...) }
2015-09-10T18:31:50.582-0400 I COMMAND  [conn2200] command admin.$cmd command: saslContinue { saslContinue: 1, conversationId: 1, payload: BinData(0, 633D626977732C723D585E292A4C7B2164637D744B4C2965796D556548265C46772B5264594D65677343656D6871664D6841425237646A2F46556573484735505A2C703D576B...) } keyUpdates:0 writeConflicts:0 numYields:0 reslen:108 locks:{} 0ms
2015-09-10T18:31:50.583-0400 D COMMAND  [conn2200] run command admin.$cmd { saslContinue: 1, conversationId: 1, payload: BinData(0, 763D4E373437683752596F3274504D4C49333047325862626D4E6D4D673D) }
2015-09-10T18:31:50.583-0400 D QUERY    [conn2200] Relevant index 0 is kp: { user: 1, db: 1 } io: { v: 1, unique: true, key: { user: 1, db: 1 }, name: "user_1_db_1", ns: "admin.system.users" }
2015-09-10T18:31:50.583-0400 D QUERY    [conn2200] Only one plan is available; it will be run but will not be cached. query: { user: "clustermonitor", db: "admin" } sort: {} projection: {} skip: 0 limit: 0, planSummary: IXSCAN { user: 1, db: 1 }
2015-09-10T18:31:50.583-0400 D STORAGE  [conn2200] WT begin_transaction
2015-09-10T18:31:50.583-0400 D STORAGE  [conn2200] WT rollback_transaction
2015-09-10T18:31:50.583-0400 I ACCESS   [conn2200] Successfully authenticated as principal clustermonitor on admin
2015-09-10T18:31:50.584-0400 I COMMAND  [conn2200] command admin.$cmd command: saslContinue { saslContinue: 1, conversationId: 1, payload: BinData(0, 763D4E373437683752596F3274504D4C49333047325862626D4E6D4D673D) } keyUpdates:0 writeConflicts:0 numYields:0 reslen:78 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } 0ms
2015-09-10T18:31:50.584-0400 D STORAGE  [conn2200] WT begin_transaction
2015-09-10T18:31:50.584-0400 D STORAGE  [conn2200] WiredTigerRecoveryUnit::beingReleased
2015-09-10T18:31:50.584-0400 D STORAGE  [conn2200] WiredTigerRecoveryUnit::broughtBack
2015-09-10T18:31:50.584-0400 D STORAGE  [conn2200] WT begin_transaction
2015-09-10T18:31:50.584-0400 D STORAGE  [conn2200] WT commit_transaction
2015-09-10T18:31:50.584-0400 D STORAGE  [conn2200] WiredTigerRecoveryUnit::beingReleased
2015-09-10T18:31:50.584-0400 D STORAGE  [conn2200] WiredTigerRecoveryUnit::broughtBack
2015-09-10T18:31:50.584-0400 D STORAGE  [conn2200] WT commit_transaction
2015-09-10T18:31:50.584-0400 D QUERY    [conn2200] Running query: query: {} sort: { $natural: -1 } projection: {} skip: 0 limit: 20
2015-09-10T18:31:50.584-0400 D QUERY    [conn2200] Only one plan is available; it will be run but will not be cached. query: {} sort: { $natural: -1 } projection: {} skip: 0 limit: 20, planSummary: COLLSCAN
2015-09-10T18:31:50.584-0400 D STORAGE  [conn2200] WT begin_transaction
2015-09-10T18:31:50.584-0400 D STORAGE  [conn2200] WT rollback_transaction
2015-09-10T18:31:50.584-0400 D STORAGE  [conn2200] WiredTigerRecoveryUnit::beingReleased
2015-09-10T18:31:50.584-0400 D STORAGE  [conn2200] WiredTigerRecoveryUnit::broughtBack
2015-09-10T18:31:50.584-0400 I QUERY    [conn2200] query admin.system.profile query: { $query: {}, $orderby: { $natural: -1 } } planSummary: COLLSCAN cursorid:39651685655 ntoreturn:20 ntoskip:0 nscanned:0 nscannedObjects:20 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:20 reslen:11243 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } 0ms
2015-09-10T18:31:50.585-0400 D STORAGE  [conn2200] WT begin_transaction
2015-09-10T18:31:50.585-0400 D STORAGE  [conn2200] WiredTigerRecoveryUnit::beingReleased
2015-09-10T18:31:50.585-0400 D STORAGE  [conn2200] WiredTigerRecoveryUnit::broughtBack
2015-09-10T18:31:50.585-0400 D STORAGE  [conn2200] WT begin_transaction
2015-09-10T18:31:50.585-0400 D STORAGE  [conn2200] WT commit_transaction
2015-09-10T18:31:50.585-0400 D STORAGE  [conn2200] WiredTigerRecoveryUnit::beingReleased
2015-09-10T18:31:50.585-0400 D STORAGE  [conn2200] WiredTigerRecoveryUnit::broughtBack
2015-09-10T18:31:50.585-0400 D STORAGE  [conn2200] WT commit_transaction
2015-09-10T18:31:50.597-0400 I COMMAND  [conn2200] killcursors: found 0 of 1
2015-09-10T18:31:50.598-0400 I QUERY    [conn2200] killcursors  keyUpdates:0 writeConflicts:0 numYields:0 locks:{} 0ms
2015-09-10T18:31:50.600-0400 D NETWORK  [conn2199] SocketException: remote: 127.0.0.1:64007 error: 9001 socket exception [CLOSED] server [127.0.0.1:64007] 
2015-09-10T18:31:50.600-0400 D NETWORK  [conn2200] SocketException: remote: 127.0.0.1:64009 error: 9001 socket exception [CLOSED] server [127.0.0.1:64009] 
2015-09-10T18:31:50.600-0400 I NETWORK  [conn2199] end connection 127.0.0.1:64007 (4 connections now open)
2015-09-10T18:31:50.601-0400 I NETWORK  [conn2200] end connection 127.0.0.1:64009 (4 connections now open)

Environment
OS X 10.8.4
MongoDB 3.0.4, WiredTiger storage engine
Replica Set with auth, authSchemaVersion 5



Comments
Comment by Andreas Nilsson [ 20/Apr/16 ]

Should be reverted when SERVER-9609 is resolved

Comment by Githook User [ 16/Sep/15 ]

Author: Andreas Nilsson <andreas.nilsson@10gen.com> (agralius)

Message: SERVER-20364 Let find privilege -> killCursor privilege
Branch: v3.0
https://github.com/mongodb/mongo/commit/5cb8c7d9810f15ba6926afe9d16d81eeb25724e5

Comment by Githook User [ 16/Sep/15 ]

Author: Andreas Nilsson <andreas.nilsson@10gen.com> (agralius)

Message: SERVER-20364 Let find privilege -> killCursor privilege
Branch: master
https://github.com/mongodb/mongo/commit/90d9a09c183b5dc6a83dfc30634e6807b393f457

Comment by J Rassi [ 11/Sep/15 ]

True, you'd get a truncated result set. I suspect that it doesn't make a difference for John's use case, though.

Comment by Kevin Pulo [ 11/Sep/15 ]

The negative limit workaround won't work if you need to fetch more than 1 batch, though.

Comment by John Morales [ 11/Sep/15 ]

Confirmed with both the Go and Java reproducers that a negative limit works around the issue, thanks jason.rassi.
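
For reference, the workaround amounts to a one-line change in the Java sketch under Steps To Reproduce above (hedged: with a negative limit the server returns a single batch and closes the cursor itself, so the driver never needs to send a "kill cursors" request):

MongoCursor<Document> cursor = client.getDatabase("test")
        .getCollection("system.profile")
        .find()
        .sort(new Document("$natural", -1))
        .limit(-20)  // negative limit: the server closes the cursor after one batch
        .iterator();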

Comment by J Rassi [ 11/Sep/15 ]

Generally, cursors that are exhausted should be closed automatically by the server, without the need for an explicit "kill cursors" request from the driver. However, for queries that specify a positive limit, the cursor will remain open even after the cursor.hasNext() call returns false, because the server does not know whether to interpret the positive ntoreturn value as a batchSize or as a limit (edit: this is fixed in 3.2 with the introduction of the find command, which uses separate "batchSize" and "limit" options instead of ntoreturn). Thus, in this case, the driver will follow up with a "kill cursors" request, which should kill the cursor.

That aside, by inspecting the server source code I've confirmed your suspicion: users with the clusterMonitor role are allowed to create cursors on system.profile but are not allowed to kill them. I believe this bug has existed since the clusterMonitor role was first introduced.

andreas.nilsson, can you triage? I'd suggest either adding ActionType::killCursors on "system.profile" to the out-argument in addClusterMonitorPrivileges() (and auditing the rest of role_graph_builtin_roles.cpp to fix all of the other places in this file that have the same problem), or getting rid of ActionType::killCursors entirely and changing killCursors/OP_KILL_CURSORS to require ActionType::find instead.

John: as a workaround, can you see if the problem still manifests if you change the limit to a negative limit? I suspect it won't, as the server should close the cursor automatically in this case.
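
One way to observe whether the cursor was actually released is to watch the server's open-cursor count (a sketch; assumes the metrics.cursor section of serverStatus as it exists in 3.0, and that the clusterMonitor user is permitted to run serverStatus, which it is):

// Continuing from the authenticated client in the Java sketch above.
Document status = client.getDatabase("admin")
        .runCommand(new Document("serverStatus", 1));
Document open = (Document) ((Document) ((Document) status.get("metrics"))
        .get("cursor")).get("open");
// With a positive limit this count stays elevated after cursor.close();
// with a negative limit it drops back once the batch is consumed.
System.out.println("open cursors: " + open.get("total"));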
