[SERVER-53228] 4.4.2 Sharded clusters using nearest has difference explain plan from primary and secondary Created: 04/Dec/20  Updated: 15/Mar/21  Resolved: 15/Mar/21

Status: Closed
Project: Core Server
Component/s: Diagnostics
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: jing xu Assignee: Eric Sedor
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Operating System: ALL
Participants:

 Description   

hi:

  i have mongodb 4.4.2 Sharded clusters including one shard.i test nearest.when sql direct to primary using IXSCAN,secondary using COUNT_SCAN.the following is my test:

mongos> db.getMongo().setReadPref('nearest')
mongos> db.POCCOLL.count({fld3:"invidunt ut labore et dolore"})
2695802
mongos>

 

from primary logs:

 
{"t":\{"$date":"2020-12-04T17:15:07.841+08:00"},"s":"I", "c":"COMMAND", "id":51803, "ctx":"conn1969","msg":"Slow query","attr":\{"type":"command","ns":"POCDB.POCCOLL","appName":"MongoDB Shell","command":{"count":"POCCOLL","query":{"fld3":"invidunt ut labore et dolore"},"readConcern":\{"provenance":"implicitDefault"},"shardVersion":[\{"$timestamp":{"t":1,"i":2894}},\{"$oid":"5fc737fe7209449001f7369d"}],"clientOperationKey":\{"$uuid":"f0fa77d1-9ebb-40fa-bad5-fbc18251052a"},"lsid":\{"id":{"$uuid":"3acac26a-0f29-414a-a339-b8b43cc66e69"},"uid":\{"$binary":{"base64":"O0CMtIVItQN4IsEOsJdrPL8s7jv5xwh5a/A5Qfvs2A8=","subType":"0"}}},"maxTimeMSOpOnly":120,"$readPreference":\{"mode":"nearest","hedge":{"enabled":true}},"$clusterTime":\{"clusterTime":{"$timestamp":{"t":1607073306,"i":1}},"signature":\{"hash":{"$binary":{"base64":"XszqboCQ7DhMofAB6xOzk39BdJk=","subType":"0"}},"keyId":6901213159848149013}},"$audit":\{"$impersonatedUsers":[{"user":"admin","db":"admin"}],"$impersonatedRoles":[\{"role":"root","db":"admin"}]},"$client":\{"application":{"name":"MongoDB Shell"},"driver":\{"name":"MongoDB Internal Client","version":"4.4.2"},"os":\{"type":"Linux","name":"CentOS Linux release 7.4.1708 (Core) ","architecture":"x86_64","version":"Kernel 3.10.0-693.21.1.el7.x86_64"},"mongos":\{"host":"vmt20130:31051","client":"10.130.9.150:60544","version":"4.4.2"}},"$configServerState":\{"opTime":{"ts":{"$timestamp":{"t":1607073306,"i":1}},"t":3}},"$db":"POCDB"},"planSummary":"IXSCAN \{ fld3: 1 }","numYields":7,"queryHash":"97EEEEB2","planCacheKey":"EC620B13","ok":0,"errMsg":"Exec error resulting in state FAILURE :: caused by :: operation exceeded time limit","errName":"MaxTimeMSExpired","errCode":50,"reslen":450,"locks":\{"ReplicationStateTransition":{"acquireCount":{"w":8}},"Global":\{"acquireCount":{"r":8}},"Database":\{"acquireCount":{"r":8}},"Collection":\{"acquireCount":{"r":8}},"Mutex":\{"acquireCount":{"r":3}}},"readConcern":\{"provenance":"implicitDefault"},"protocol":"op_msg","durationMillis":148}}

 

 

from secondary logs:

 
{"t":\{"$date":"2020-12-04T17:15:08.812+08:00"},"s":"I", "c":"COMMAND", "id":51803, "ctx":"conn1913","msg":"Slow query","attr":\{"type":"command","ns":"POCDB.POCCOLL","appName":"MongoDB Shell","command":{"count":"POCCOLL","query":{"fld3":"invidunt ut labore et dolore"},"readConcern":\{"provenance":"implicitDefault"},"shardVersion":[\{"$timestamp":{"t":1,"i":2894}},\{"$oid":"5fc737fe7209449001f7369d"}],"clientOperationKey":\{"$uuid":"f0fa77d1-9ebb-40fa-bad5-fbc18251052a"},"lsid":\{"id":{"$uuid":"3acac26a-0f29-414a-a339-b8b43cc66e69"},"uid":\{"$binary":{"base64":"O0CMtIVItQN4IsEOsJdrPL8s7jv5xwh5a/A5Qfvs2A8=","subType":"0"}}},"$readPreference":\{"mode":"nearest","hedge":{"enabled":true}},"$clusterTime":\{"clusterTime":{"$timestamp":{"t":1607073306,"i":1}},"signature":\{"hash":{"$binary":{"base64":"XszqboCQ7DhMofAB6xOzk39BdJk=","subType":"0"}},"keyId":6901213159848149013}},"$audit":\{"$impersonatedUsers":[{"user":"admin","db":"admin"}],"$impersonatedRoles":[\{"role":"root","db":"admin"}]},"$client":\{"application":{"name":"MongoDB Shell"},"driver":\{"name":"MongoDB Internal Client","version":"4.4.2"},"os":\{"type":"Linux","name":"CentOS Linux release 7.4.1708 (Core) ","architecture":"x86_64","version":"Kernel 3.10.0-693.21.1.el7.x86_64"},"mongos":\{"host":"vmt20130:31051","client":"10.130.9.150:60544","version":"4.4.2"}},"$configServerState":\{"opTime":{"ts":{"$timestamp":{"t":1607073306,"i":1}},"t":3}},"$db":"POCDB"},"planSummary":"COUNT_SCAN \{ fld3: 1 }","keysExamined":2695803,"docsExamined":0,"numYields":2695,"queryHash":"97EEEEB2","planCacheKey":"EC620B13","reslen":320,"locks":\{"ReplicationStateTransition":{"acquireCount":{"w":2696}},"Global":\{"acquireCount":{"r":2696}},"Database":\{"acquireCount":{"r":2696}},"Collection":\{"acquireCount":{"r":2696}},"Mutex":\{"acquireCount":{"r":1}}},"readConcern":\{"provenance":"implicitDefault"},"storage":\{"data":{"bytesRead":10114025,"timeReadingMicros":18241}},"protocol":"op_msg","durationMillis":1120}}



 Comments   
Comment by Eric Sedor [ 15/Mar/21 ]

Hi,

We haven’t heard back from you for some time, so I’m going to close this ticket. If this is still an issue for you, please clarify how else we can help and we can reopen the ticket.

Regards,
Eric

Comment by Eric Sedor [ 16/Dec/20 ]

Hi 601290552@qq.com,

Currently this is expected behavior for readConcern "available", which is the default readConcern for Secondary reads that aren't part of causally consistent sessions. The "available" readConcern allows the return of orphan documents, which allows COUNT_SCAN to be used. By contrast, the Primary read uses a readConcern of "local" and a plan involving an IXSCAN (as of SERVER-3645), so as to filter orphaned documents.

You should see similar plan selection if you explicitly specify a readConcern of "local".

Does this make sense?

Comment by jing xu [ 04/Dec/20 ]

mongos:
mongos> sh.status();
— Sharding Status —
sharding version:

{ "_id" : 1, "minCompatibleVersion" : 5, "currentVersion" : 6, "clusterId" : ObjectId("5fc608cefcbcbec36d4f785d") }

shards:

{ "_id" : "shard1", "host" : "shard1/10.130.10.149:31001,10.130.10.150:31001,10.130.9.150:31001", "state" : 1 }

active mongoses:
"4.4.2" : 1
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 5
Last reported error: Couldn't get a connection within the time limit of 4ms
Time of Reported error: Wed Dec 02 2020 15:27:05 GMT+0800 (CST)
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "POCDB", "primary" : "shard1", "partitioned" : true, "version" :

{ "uuid" : UUID("84c6bc81-017f-4174-a59a-72551083b7c4"), "lastMod" : 1 }

}
POCDB.POCCOLL
shard key:

{ "_id" : 1 }

unique: false
balancing: true
chunks:
shard1 1448
too many chunks to print, use verbose if you want to force print

{ "_id" : "config", "primary" : "config", "partitioned" : true }

config.system.sessions
shard key:

{ "_id" : 1 }

unique: false
balancing: true
chunks:
shard1 1024
too many chunks to print, use verbose if you want to force print

from primary logs:

keysExamined":1921679,"docsExamined":1921679

{"t":\{"$date":"2020-12-04T15:48:28.824+08:00"},"s":"I", "c":"COMMAND", "id":51803, "ctx":"conn1944","msg":"Slow query","attr":\{"type":"command","ns":"POCDB.POCCOLL","appName":"MongoDB Shell","command":{"count":"POCCOLL","query":{"fld3":"voluptua. At vero eos et"},"readConcern":\{"provenance":"implicitDefault"},"shardVersion":[\{"$timestamp":{"t":1,"i":2894}},\{"$oid":"5fc737fe7209449001f7369d"}],"clientOperationKey":\{"$uuid":"b3f975da-b930-40ac-82b6-e95933367fbe"},"lsid":\{"id":{"$uuid":"3acac26a-0f29-414a-a339-b8b43cc66e69"},"uid":\{"$binary":{"base64":"O0CMtIVItQN4IsEOsJdrPL8s7jv5xwh5a/A5Qfvs2A8=","subType":"0"}}},"$readPreference":\{"mode":"nearest","hedge":{"enabled":true}},"$clusterTime":\{"clusterTime":{"$timestamp":{"t":1607068038,"i":1}},"signature":\{"hash":{"$binary":{"base64":"+QDnZt9ACtbP1rZ40QDD+GAjF4Q=","subType":"0"}},"keyId":6901213159848149013}},"$audit":\{"$impersonatedUsers":[{"user":"admin","db":"admin"}],"$impersonatedRoles":[\{"role":"root","db":"admin"}]},"$client":\{"application":{"name":"MongoDB Shell"},"driver":\{"name":"MongoDB Internal Client","version":"4.4.2"},"os":\{"type":"Linux","name":"CentOS Linux release 7.4.1708 (Core) ","architecture":"x86_64","version":"Kernel 3.10.0-693.21.1.el7.x86_64"},"mongos":\{"host":"vmt20130:31051","client":"10.130.9.150:60544","version":"4.4.2"}},"$configServerState":\{"opTime":{"ts":{"$timestamp":{"t":1607068038,"i":1}},"t":3}},"$db":"POCDB"},"planSummary":"IXSCAN \{ fld3: 1 }","keysExamined":1921679,"docsExamined":1921679,"numYields":3685,"queryHash":"97EEEEB2","planCacheKey":"EC620B13","reslen":340,"locks":\{"ReplicationStateTransition":{"acquireCount":{"w":3686}},"Global":\{"acquireCount":{"r":3686}},"Database":\{"acquireCount":{"r":3686}},"Collection":\{"acquireCount":{"r":3686}},"Mutex":\{"acquireCount":{"r":3}}},"readConcern":\{"provenance":"implicitDefault"},"storage":\{"data":{"bytesRead":46099281443,"timeReadingMicros":55325725},"timeWaitingMicros":\{"cache":6134}},"protocol":"op_msg","durationMillis":69229}}

from secondary logs:
"keysExamined":2695803,"docsExamined":0,

{"t":{"$date":"2020-12-04T17:15:08.812+08:00"},"s":"I",  "c":"COMMAND",  "id":51803,   "ctx":"conn1913","msg":"Slow query","attr":{"type":"command","ns":"POCDB.POCCOLL","appName":"MongoDB Shell","command":{"count":"POCCOLL","query":{"fld3":"invidunt ut labore et dolore"},"readConcern":{"provenance":"implicitDefault"},"shardVersion":[{"$timestamp":{"t":1,"i":2894}},{"$oid":"5fc737fe7209449001f7369d"}],"clientOperationKey":{"$uuid":"f0fa77d1-9ebb-40fa-bad5-fbc18251052a"},"lsid":{"id":{"$uuid":"3acac26a-0f29-414a-a339-b8b43cc66e69"},"uid":{"$binary":{"base64":"O0CMtIVItQN4IsEOsJdrPL8s7jv5xwh5a/A5Qfvs2A8=","subType":"0"}}},"$readPreference":{"mode":"nearest","hedge":{"enabled":true}},"$clusterTime":{"clusterTime":{"$timestamp":{"t":1607073306,"i":1}},"signature":{"hash":{"$binary":{"base64":"XszqboCQ7DhMofAB6xOzk39BdJk=","subType":"0"}},"keyId":6901213159848149013}},"$audit":{"$impersonatedUsers":[{"user":"admin","db":"admin"}],"$impersonatedRoles":[{"role":"root","db":"admin"}]},"$client":{"application":{"name":"MongoDB Shell"},"driver":{"name":"MongoDB Internal Client","version":"4.4.2"},"os":{"type":"Linux","name":"CentOS Linux release 7.4.1708 (Core) ","architecture":"x86_64","version":"Kernel 3.10.0-693.21.1.el7.x86_64"},"mongos":{"host":"vmt20130:31051","client":"10.130.9.150:60544","version":"4.4.2"}},"$configServerState":{"opTime":{"ts":{"$timestamp":{"t":1607073306,"i":1}},"t":3}},"$db":"POCDB"},"planSummary":"COUNT_SCAN { fld3: 1 }","keysExamined":2695803,"docsExamined":0,"numYields":2695,"queryHash":"97EEEEB2","planCacheKey":"EC620B13","reslen":320,"locks":{"ReplicationStateTransition":{"acquireCount":{"w":2696}},"Global":{"acquireCount":{"r":2696}},"Database":{"acquireCount":{"r":2696}},"Collection":{"acquireCount":{"r":2696}},"Mutex":{"acquireCount":{"r":1}}},"readConcern":{"provenance":"implicitDefault"},"storage":{"data":{"bytesRead":10114025,"timeReadingMicros":18241}},"protocol":"op_msg","durationMillis":1120}}

so is it a bug ?

Generated at Thu Feb 08 05:30:18 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.