[SERVER-30904] Failed to load chunks due to operation exceeded time limit Created: 31/Aug/17  Updated: 07/Nov/17  Resolved: 29/Sep/17

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Question
Priority: Major - P3
Reporter: Ganeshkumar
Assignee: Mark Agarunov
Resolution: Incomplete
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

Hello,

We are seeing "Operation timed out" and "serverStatus was very slow" errors while accessing a collection.

The errors are as follows:
a) Output of db.collection.stats():

    {
        "code" : 50,
        "ok" : 0,
        "errmsg" : "Failed to load chunks due to operation exceeded time limit"
    }

b) Excerpt from the config server error log:

2017-08-30T08:46:23.418+0000 I NETWORK  [replSetDistLockPinger] Marking host host:port as failed :: caused by :: ExceededTimeLimit: Operation timed out, request was RemoteCommand 5513 -- target:host:port db:config expDate:2017-08-30T08:46:23.418+0000 cmd:{ findAndModify: "lockpings", query: { _id: "Hostname.prd-us.burninglass.co.us:port:1504077382:3669026758099117973" }, update: { $set: { ping: new Date(1504082753418) } }, upsert: true, writeConcern: { w: "majority", wtimeout: 15000 }, maxTimeMS: 30000 }
2017-08-30T08:46:23.418+0000 I SHARDING [replSetDistLockPinger] Operation timed out with status ExceededTimeLimit: Operation timed out, request was RemoteCommand 5513 -- target:host:port db:config expDate:2017-08-30T08:46:23.418+0000 cmd:{ findAndModify: "lockpings", query: { _id: "Hostname.prd-us.burninglass.co.us:port:1504077382:3669026758099117973" }, update: { $set: { ping: new Date(1504082753418) } }, upsert: true, writeConcern: { w: "majority", wtimeout: 15000 }, maxTimeMS: 30000 }
2017-08-30T08:46:23.418+0000 W SHARDING [replSetDistLockPinger] pinging failed for distributed lock pinger :: caused by :: ExceededTimeLimit: Operation timed out, request was RemoteCommand 5513 -- target:host:port db:config expDate:2017-08-30T08:46:23.418+0000 cmd:{ findAndModify: "lockpings", query: { _id: "Hostname.prd-us.burninglass.co.us:port:1504077382:3669026758099117973" }, update: { $set: { ping: new Date(1504082753418) } }, upsert: true, writeConcern: { w: "majority", wtimeout: 15000 }, maxTimeMS: 30000 }
 
2017-08-30T08:46:46.008+0000 I ASIO     [NetworkInterfaceASIO-ShardRegistry-0] Ending connection to host host:port due to bad connection status; 1 connections to that host remain open
2017-08-30T08:46:46.008+0000 I SHARDING [Uptime reporter] Operation timed out with status ExceededTimeLimit: Operation timed out, request was RemoteCommand 5527 -- target:host:port db:config expDate:2017-08-30T08:46:46.008+0000 cmd:{ update: "mongos", updates: [ { q: { id: "Hostname.prd-us.burninglass.co.us:port" }, u: { $set: { id: "Hostname.prd-us.burninglass.co.us:port", ping: new Date(1504082776006), up: 5393, waiting: true, mongoVersion: "3.4.6" } }, multi: false, upsert: true } ], writeConcern: { w: "majority", wtimeout: 15000 }, maxTimeMS: 30000 }
 
2017-08-30T13:21:06.004+0000 I COMMAND  [ftdc] serverStatus was very slow: { after basic: 0, after asserts: 0, after backgroundFlushing: 0, after connections: 0, after dur: 0, after extra_info: 3000, after globalLock: 3000, after locks: 3000, after network: 3000, after opLatencies: 3000, after opcounters: 3000, after opcountersRepl: 3000, after repl: 3000, after security: 3000, after storageEngine: 3000, after tcmalloc: 3000, after wiredTiger: 3000, at end: 3000 }
2017-08-30T14:34:14.305+0000 I COMMAND  [ftdc] serverStatus was very slow: { after basic: 0, after asserts: 0, after backgroundFlushing: 0, after connections: 0, after dur: 0, after extra_info: 3276, after globalLock: 3276, after locks: 3276, after network: 3276, after opLatencies: 3276, after opcounters: 3276, after opcountersRepl: 3276, after repl: 3276, after security: 3276, after storageEngine: 3276, after tcmalloc: 3276, after wiredTiger: 3276, at end: 3276 }
 
2017-08-31T03:35:33.217+0000 I NETWORK  [CatalogCacheLoader-10] Marking host host:port as failed :: caused by :: ExceededTimeLimit: operation exceeded time limit
2017-08-31T03:35:33.217+0000 I SHARDING [CatalogCacheLoader-10] Operation timed out  :: caused by :: ExceededTimeLimit: operation exceeded time limit
2017-08-31T03:35:33.217+0000 I SHARDING [CatalogCacheLoader-10] Refresh for collection Database.Collection took 31318 ms and failed :: caused by :: ExceededTimeLimit: Failed to load chunks due to operation exceeded time limit
2017-08-31T04:54:50.208+0000 I SHARDING [CatalogCacheLoader-533] Refresh for collection Database.Collection took 116480 ms and found version 5146|6791||58b85d2868d1121a9ec38cd1
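
The "serverStatus was very slow" lines above list a running time after each section. The following is a minimal sketch for locating the section where that time jumps. It assumes the values are cumulative milliseconds and that the line format matches the excerpt in this ticket. The names `slowest_section` and `SAMPLE` are hypothetical helpers, not part of any MongoDB tooling.

```python
import re

# Matches "after <section>: <millis>" pairs as they appear in the log lines above.
_PAIR = re.compile(r"after ([A-Za-z_]+): (\d+)")

# Timing portion copied from the 2017-08-30T13:21:06.004 log line in this ticket.
SAMPLE = (
    "serverStatus was very slow: { after basic: 0, after asserts: 0, "
    "after backgroundFlushing: 0, after connections: 0, after dur: 0, "
    "after extra_info: 3000, after globalLock: 3000, after locks: 3000, "
    "after network: 3000, after opLatencies: 3000, after opcounters: 3000, "
    "after opcountersRepl: 3000, after repl: 3000, after security: 3000, "
    "after storageEngine: 3000, after tcmalloc: 3000, after wiredTiger: 3000, "
    "at end: 3000 }"
)


def slowest_section(log_line):
    """Return (section, delta_ms) for the largest increase between sections.

    Assumes each "after X" value is the cumulative elapsed time in
    milliseconds. Returns (None, 0) if no section shows an increase.
    """
    previous = 0
    worst = (None, 0)
    for name, value in _PAIR.findall(log_line):
        elapsed = int(value)
        delta = elapsed - previous
        if delta > worst[1]:
            worst = (name, delta)
        previous = elapsed
    return worst


print(slowest_section(SAMPLE))  # → ('extra_info', 3000)
```

Under that assumption, both serverStatus lines in the excerpt show the whole delay accruing in the `extra_info` section.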



 Comments   
Comment by Kelsey Schubert [ 29/Sep/17 ]

Hi bgt-ganeshkumar,

We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Regards,
Kelsey

Comment by Ramon Fernandez Marina [ 13/Sep/17 ]

bgt-ganeshkumar, we haven't heard back from you for some time. If this is still an issue for you, can you please provide the information requested above by Mark so we can investigate?

Thanks,
Ramón.

Comment by Mark Agarunov [ 31/Aug/17 ]

Hello bgt-ganeshkumar,

Thank you for the report. To further investigate what may be causing this behavior, please provide the following:

  • An archive of the $dbpath/diagnostic.data directory, uploaded to this ticket
  • The complete log files from the affected mongod nodes
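
One possible way to produce the archive is sketched below, assuming Python is available on the node. The function name `archive_diagnostic_data` and its parameters are hypothetical. Only the `diagnostic.data` directory name comes from the request above.

```python
import os
import tarfile


def archive_diagnostic_data(dbpath, out_path):
    """Pack <dbpath>/diagnostic.data into a gzip-compressed tar file.

    dbpath is the mongod data directory; out_path is where the archive is
    written and should be outside the diagnostic.data directory itself.
    """
    source = os.path.join(dbpath, "diagnostic.data")
    if not os.path.isdir(source):
        raise FileNotFoundError(source)
    with tarfile.open(out_path, "w:gz") as archive:
        # arcname stores only the directory name, not the full dbpath.
        archive.add(source, arcname="diagnostic.data")
    return out_path
```

For example, `archive_diagnostic_data("/data/db", "diagnostic.data.tar.gz")`, where `/data/db` is a placeholder for the node's actual dbpath.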

Thanks,
Mark

Generated at Thu Feb 08 04:25:23 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.