- Type: Bug
- Resolution: Unresolved
- Priority: Major - P3
- Affects Version/s: 4.4.14, 4.4.17, 4.4.18
- Component/s: None
- Cluster Scalability
- ALL
When I deploy a cluster on a version higher than 4.4.14, the enableSharding and shardCollection commands freeze for a while when executed. After I restart the primary node of the config server, the delay disappears. The log looks like this.
While the shardCollection command runs, the shard server node reads several collections in the config database on the config server, such as config.collections, and these reads take a long time. To measure this, I added log statements that record how long the config server spends executing the following functions in service_entry_point_common.cpp.
behaviors.waitForReadConcern(opCtx, invocation.get(), request);
behaviors.setPrepareConflictBehaviorForReadConcern(opCtx, invocation.get());
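For readers who want to reproduce the measurement, here is a minimal standalone sketch of this kind of timing instrumentation. It is not the code that was added to the server: `timeCallMillis` and `fakeWaitForReadConcern` are hypothetical names, and the sleep only stands in for the real wait so that the sketch can run on its own.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <iostream>
#include <thread>

// Hypothetical helper (not part of the MongoDB source): prints a marker before
// and after running `fn` and returns the elapsed wall-clock time in milliseconds.
long long timeCallMillis(const char* label, const std::function<void()>& fn) {
    assert(fn && "callable must not be empty");
    using Clock = std::chrono::steady_clock;
    std::cout << "start " << label << '\n';
    const auto begin = Clock::now();
    fn();
    const auto elapsed =
        std::chrono::duration_cast<std::chrono::milliseconds>(Clock::now() - begin);
    std::cout << "end " << label << " after " << elapsed.count() << " ms\n";
    return elapsed.count();
}

// Stand-in for the real wait (assumption): it only sleeps for a short time.
void fakeWaitForReadConcern() {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
}
```

Calling `timeCallMillis("waitForReadConcern", fakeWaitForReadConcern)` prints the two markers and returns the measured duration. A monotonic clock is used so that wall-clock adjustments do not distort the result.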
I got the following log.
{"t":{"$date":"2023-02-10T18:45:25.552+08:00"},"s":"I", "c":"COMMAND", "id":1133440, "ctx":"conn29","msg":"Reach there to start waiting for read Concern!!!!"}
{"t":{"$date":"2023-02-10T18:45:27.452+08:00"},"s":"I", "c":"COMMAND", "id":1133441, "ctx":"conn29","msg":"Reach there to end waiting for read Concern!!!!"}
{"t":{"$date":"2023-02-10T18:45:27.453+08:00"},"s":"I", "c":"COMMAND", "id":51803, "ctx":"conn29","msg":"Slow query","attr":{"type":"command","ns":"config.collections","command":{"find":"collections","filter":{"_id":"netease.ffcccccciiii"},"readConcern":{"level":"majority","afterOpTime":{"ts":{"$timestamp":{"t":1676028592,"i":4}},"t":1}},"limit":1,"maxTimeMS":30000,"$readPreference":{"mode":"nearest"},"$replData":1,"$clusterTime":{"clusterTime":{"$timestamp":{"t":1676028592,"i":4}},"signature":{"hash":{"$binary":{"base64":"xEAj8PuCgaD58iDzrXetLftM4nU=","subType":"0"}},"keyId":7198487474404851735}},"$configServerState":{"opTime":{"ts":{"$timestamp":{"t":1676028592,"i":4}},"t":1}},"$db":"config"},"planSummary":"IDHACK","keysExamined":1,"docsExamined":1,"cursorExhausted":true,"numYields":0,"nreturned":1,"reslen":769,"locks":{"ReplicationStateTransition":{"acquireCount":{"w":1}},"Global":{"acquireCount":{"r":1}},"Database":{"acquireCount":{"r":1}},"Collection":{"acquireCount":{"r":1}},"Mutex":{"acquireCount":{"r":1}}},"readConcern":{"level":"majority","afterOpTime":{"ts":{"$timestamp":{"t":1676028592,"i":4}},"t":1},"provenance":"clientSupplied"},"storage":{},"protocol":"op_msg","durationMillis":1900}}
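The gap between the "start waiting" and "end waiting" entries can be checked against the durationMillis value of the slow query. The sketch below does that arithmetic. The helper names `millisOfDay` and `elapsedMillis` are hypothetical, and the code assumes both timestamps fall on the same day, share the same UTC offset, and carry a three-digit millisecond fraction, as in the log above.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: converts the time-of-day part of a log "$date" value,
// e.g. "2023-02-10T18:45:25.552+08:00", to milliseconds since midnight.
// Returns -1 if the string does not match the expected layout.
long long millisOfDay(const std::string& isoDate) {
    int year, month, day, hour, minute, second, millis;
    const int fields = std::sscanf(isoDate.c_str(), "%d-%d-%dT%d:%d:%d.%d",
                                   &year, &month, &day, &hour, &minute, &second, &millis);
    if (fields != 7) return -1;
    return ((hour * 60LL + minute) * 60 + second) * 1000 + millis;
}

// Elapsed milliseconds between two timestamps on the same day with the same
// UTC offset (assumption stated in the lead-in).
long long elapsedMillis(const std::string& start, const std::string& end) {
    const long long s = millisOfDay(start);
    const long long e = millisOfDay(end);
    assert(s >= 0 && e >= 0 && "timestamps must match the expected layout");
    return e - s;
}
```

For the two marker entries above, `elapsedMillis("2023-02-10T18:45:25.552+08:00", "2023-02-10T18:45:27.452+08:00")` returns 1900, which equals the reported durationMillis of 1900. This suggests that essentially all of the slow query's time was spent waiting for the read concern.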
- related to: SERVER-58721 processReplSetInitiate does not set a stableTimestamp or take a stable checkpoint (Closed)