[SERVER-34517] getMore in session while running with TLS fails Created: 17/Apr/18  Updated: 29/Oct/23  Resolved: 01/Jun/18

Status: Closed
Project: Core Server
Component/s: Querying, Security, Sharding
Affects Version/s: 4.0.0-rc0
Fix Version/s: 4.0.0-rc5, 4.1.1

Type: Bug Priority: Major - P3
Reporter: Charlie Swanson Assignee: Misha Tyulenev
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File repro.js    
Issue Links:
Backports
Depends
is depended on by JAVA-2834 Re-enable tests after SERVER-34517 is... Closed
Duplicate
is duplicated by SERVER-35276 "Cannot run getMore" on sharded clust... Closed
Related
related to SERVER-35323 sessionId matching ignores userId par... Closed
is related to JAVA-2831 If batch size is 0 for listCollection... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0
Steps To Reproduce:
  1. Download attached repro.js.
  2. Run the following

    python buildscripts/resmoke.py --suites=ssl repro.js
    

The test does not seem to reproduce if the sharded cluster is not set up with SSL/TLS. It also seems to affect only listIndexes cursors; it does not affect find cursors or even aggregate cursors, which are also globally managed.

The test is not fixed by applying the patch for SERVER-34204.

Sprint: Sharding 2018-06-04
Participants:

 Description   

When executing a query as part of a session with TLS enabled, the following error is encountered:

[js_test:repro] 2018-04-17T11:08:01.187-0400 2018-04-17T11:08:01.187-0400 E QUERY    [js] Error: getMore command failed: {
[js_test:repro] 2018-04-17T11:08:01.187-0400 	"ok" : 0,
[js_test:repro] 2018-04-17T11:08:01.187-0400 	"errmsg" : "Cannot run getMore on cursor 4981479160386133211, which was created in session 5da04883-17b4-4bbc-94f4-0f60a813cb17 - u4nTF1+wmByGgmwndZCCo3FgRx9gUEtGEkFRhsYwq3A=, in session 5da04883-17b4-4bbc-94f4-0f60a813cb17 - O0CMtIVItQN4IsEOsJdrPL8s7jv5xwh5a/A5Qfvs2A8=",
[js_test:repro] 2018-04-17T11:08:01.187-0400 	"code" : 50738,
[js_test:repro] 2018-04-17T11:08:01.187-0400 	"codeName" : "Location50738",
[js_test:repro] 2018-04-17T11:08:01.187-0400 	"$clusterTime" : {
[js_test:repro] 2018-04-17T11:08:01.187-0400 		"clusterTime" : Timestamp(1523977681, 8),
[js_test:repro] 2018-04-17T11:08:01.187-0400 		"signature" : {
[js_test:repro] 2018-04-17T11:08:01.188-0400 			"hash" : BinData(0,"5Sn0WMbpAp0oecdpFLyUH8/dhZg="),
[js_test:repro] 2018-04-17T11:08:01.188-0400 			"keyId" : NumberLong("6545434278254084125")
[js_test:repro] 2018-04-17T11:08:01.188-0400 		}
[js_test:repro] 2018-04-17T11:08:01.188-0400 	},
[js_test:repro] 2018-04-17T11:08:01.188-0400 	"operationTime" : Timestamp(1523977681, 8)
[js_test:repro] 2018-04-17T11:08:01.188-0400 } :



 Comments   
Comment by Githook User [ 07/Jun/18 ]

Author:

{'username': 'mikety', 'name': 'Misha Tyulenev', 'email': 'misha@mongodb.com'}

Message: SERVER-34517 do not check for userId in cursors within sessions during getMore in sharded cluster

(cherry picked from commit 5eb20d1ed6d5fd852b2192450dadbae0eec33278)
Branch: v4.0
https://github.com/mongodb/mongo/commit/6ac3162f47fd9e114c0a43f3f92ba1b7aa468bcd

Comment by Githook User [ 01/Jun/18 ]

Author:

{'username': 'mikety', 'name': 'Misha Tyulenev', 'email': 'misha@mongodb.com'}

Message: SERVER-34517 do not check for userId in cursors within sessions during getMore in sharded cluster
Branch: master
https://github.com/mongodb/mongo/commit/5eb20d1ed6d5fd852b2192450dadbae0eec33278
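The idea in the commit message can be illustrated with a small model. This is only a sketch under stated assumptions, not the server's actual code: `SessionId`, `same_session`, and the `check_uid` flag are hypothetical names, and the values are copied from the error message in this ticket.

```python
from typing import NamedTuple


class SessionId(NamedTuple):
    """Hypothetical model of a session id: a UUID plus a user digest."""
    id: str   # UUID part, printed before the "-" in the errmsg
    uid: str  # digest part, printed after the "-" in the errmsg


def same_session(cursor_lsid, request_lsid, check_uid=True):
    """Return True if the getMore may run on the cursor.

    With check_uid=False only the UUID part is compared, which models
    "do not check for userId" from the commit message.
    """
    if cursor_lsid.id != request_lsid.id:
        return False
    if check_uid:
        return cursor_lsid.uid == request_lsid.uid
    return True


# Values taken from the errmsg reported in this ticket.
created = SessionId("b8246022-983b-4daf-a9f0-67eabec8775d",
                    "u4nTF1+wmByGgmwndZCCo3FgRx9gUEtGEkFRhsYwq3A=")
incoming = SessionId("b8246022-983b-4daf-a9f0-67eabec8775d",
                     "O0CMtIVItQN4IsEOsJdrPL8s7jv5xwh5a/A5Qfvs2A8=")

strict = same_session(created, incoming)                    # full comparison
relaxed = same_session(created, incoming, check_uid=False)  # UUID only
print(strict, relaxed)
```

In this model the strict comparison rejects the getMore while the UUID-only comparison accepts it, which matches the symptom reported in the ticket.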

Comment by Misha Tyulenev [ 11/May/18 ]

Discussed with Jeff: based on the current driver tests, it only affects the listIndexes and listCollections commands. Will take a look.

Comment by Kaloian Manassiev [ 11/May/18 ]

Given Charlie's comment above, where the SHA256 signature doesn't match, passing this on to the Platforms team.

Comment by Charlie Swanson [ 17/Apr/18 ]

This also appears to affect the listCollections command.

Comment by Charlie Swanson [ 17/Apr/18 ]

Actually, there is something suspicious in the log output. It looks like mongos actually did send the getMore with the lsid attached:

[js_test:repro] 2018-04-17T11:15:14.984-0400 s30025| 2018-04-17T11:15:14.983-0400 I COMMAND  [conn4] command test.$cmd.listIndexes.foo appName: "MongoDB Shell" command: getMore { getMore: 8531561727155475579, collection: "$cmd.listIndexes.foo", lsid: { id: UUID("b8246022-983b-4daf-a9f0-67eabec8775d") }, $clusterTime: { clusterTime: Timestamp(1523978114, 34), signature: { hash: BinData(0, 5B809729D1CF61582707A1B52B56431AB3E10A0D), keyId: 6545436142269890589 } }, $db: "test" } originatingCommand: { listIndexes: "foo", cursor: { batchSize: 0.0 }, lsid: { id: UUID("b8246022-983b-4daf-a9f0-67eabec8775d") }, $clusterTime: { clusterTime: Timestamp(1523978114, 34), signature: { hash: BinData(0, 5B809729D1CF61582707A1B52B56431AB3E10A0D), keyId: 6545436142269890589 } }, $db: "test" } nShards:1 cursorid:8531561727155475579 numYields:0 reslen:472 protocol:op_msg 6ms

But it looks like mongod agrees on the session id but not on the SHA256 block? The two pieces printed below (separated by the "-") appear to correspond to the UUID and the SHA256Block, respectively, assuming we're using this StringBuilder operator<<(StringBuilder&, LogicalSessionId&).

[js_test:repro] 2018-04-17T11:15:14.984-0400 2018-04-17T11:15:14.984-0400 E QUERY    [js] Error: getMore command failed: {
[js_test:repro] 2018-04-17T11:15:14.985-0400 	"ok" : 0,
[js_test:repro] 2018-04-17T11:15:14.985-0400 	"errmsg" : "Cannot run getMore on cursor 6939087395459355561, which was created in session b8246022-983b-4daf-a9f0-67eabec8775d - u4nTF1+wmByGgmwndZCCo3FgRx9gUEtGEkFRhsYwq3A=, in session b8246022-983b-4daf-a9f0-67eabec8775d - O0CMtIVItQN4IsEOsJdrPL8s7jv5xwh5a/A5Qfvs2A8=",
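The mismatch in the errmsg above can be checked mechanically. The following is a minimal sketch, assuming the "UUID - digest" layout described in this comment; `SESSION_RE` and `compare_sessions` are hypothetical helper names, not part of any MongoDB tooling.

```python
import re

# Hypothetical pattern for the two "UUID - digest" session descriptions
# in the errmsg format shown in the log above.
SESSION_RE = re.compile(
    r"created in session (?P<c_id>[0-9a-f-]{36}) - (?P<c_uid>[A-Za-z0-9+/=]+), "
    r"in session (?P<r_id>[0-9a-f-]{36}) - (?P<r_uid>[A-Za-z0-9+/=]+)"
)


def compare_sessions(errmsg):
    """Report whether the UUID and digest parts of the two sessions match."""
    m = SESSION_RE.search(errmsg)
    if m is None:
        raise ValueError("errmsg does not match the expected format")
    return {
        "same_uuid": m["c_id"] == m["r_id"],
        "same_digest": m["c_uid"] == m["r_uid"],
    }


# errmsg copied from the log output above.
errmsg = (
    "Cannot run getMore on cursor 6939087395459355561, which was created in "
    "session b8246022-983b-4daf-a9f0-67eabec8775d - "
    "u4nTF1+wmByGgmwndZCCo3FgRx9gUEtGEkFRhsYwq3A=, in session "
    "b8246022-983b-4daf-a9f0-67eabec8775d - "
    "O0CMtIVItQN4IsEOsJdrPL8s7jv5xwh5a/A5Qfvs2A8="
)
result = compare_sessions(errmsg)
print(result)  # {'same_uuid': True, 'same_digest': False}
```

The UUID parts are identical and only the digest parts differ, which is the observation made in this comment.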

I'm sending this over to sharding for investigation.

Generated at Thu Feb 08 04:36:56 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.