[SERVER-44574] Queries Taking more than expected time for small collection Created: 12/Nov/19  Updated: 27/Oct/23  Resolved: 19/Nov/19

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.4.23
Fix Version/s: None

Type: Question Priority: Major - P3
Reporter: Pratiksha Aggarwal Assignee: Dmitry Agranat
Resolution: Works as Designed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Participants:

 Description   

We have a collection in our production environment with only 126 records, yet a single findOne operation takes up to 6 seconds.

We are using MongoDB version 3.4, and response times are very slow even for small find operations. All the find operations use IXSCAN with proper index utilisation, yet all read and write operations take more than 1 second, even with small documents of 10-12 fields.

We have 37 GB of data, and our production environment is a 3-member replica set in a primary-secondary-secondary configuration.

Please advise us on possible optimisation methods.
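
For reference, a minimal pymongo sketch that reproduces and times the query in question (the URI and database name below are placeholders, not details from this ticket):

# Timing sketch; URI and database name are placeholders.
from time import perf_counter

from pymongo import MongoClient

client = MongoClient("mongodb://host1,host2,host3/?replicaSet=rs0")
coll = client["mydb"]["countryMapping"]  # "mydb" is an assumed database name

start = perf_counter()
doc = coll.find_one(
    {"$or": [{"apFirmware.isoCode": "356"}, {"isoCode": "356"}]},
    {"_id": 0, "countryCode": 1},
)
print(f"find_one returned {doc} in {(perf_counter() - start) * 1000:.0f} ms")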



 Comments   
Comment by Dmitry Agranat [ 19/Nov/19 ]

Hi pragupta, those write operations originate on the primary member and are replicated in batches to your secondary members. When a read operation is directed to one of your secondaries (for example, the query you've mentioned) while a batch of write operations from the primary needs to be applied, we give priority to applying that batch, and the read operation has to wait until the write batch is fully applied. This is expected behavior before MongoDB 4.0.x.

If you see the same behavior after upgrading to MongoDB 4.x, please open a new ticket and we'd be happy to take a look.
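
In the meantime, if upgrading is not immediately possible, a common mitigation is to route latency-sensitive reads to the primary, which does not block reads during batch application. A sketch with pymongo (connection details are placeholders):

# Sketch: direct this query to the primary to avoid secondary
# batch-application stalls on pre-4.0 servers. URI is a placeholder.
from pymongo import MongoClient, ReadPreference

client = MongoClient("mongodb://host1,host2,host3/?replicaSet=rs0")
coll = client["mydb"].get_collection(
    "countryMapping", read_preference=ReadPreference.PRIMARY
)
doc = coll.find_one({"isoCode": "356"}, {"_id": 0, "countryCode": 1})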

Regards,
Dima

Comment by Pratiksha Aggarwal [ 19/Nov/19 ]

Hi @Dmitry Agranat,

Our countryMapping collection is a static collection with no write operations; it is only read every 15 minutes, and it holds just 126 documents. So 2-3 seconds is a long time and makes the output slow for the end user.

Comment by Dmitry Agranat [ 19/Nov/19 ]

Hi pragupta, thanks for uploading the requested information.

This is a well-known issue of secondary queries where oplog application prevents reads in order to ensure a consistent view of the data. On secondary nodes, we apply writes in batches, because applying them sequentially would likely cause secondaries to fall behind the primary. When writes are applied in batches, we must block reads so that applications cannot see data applied in the "wrong" order. This is why when reading from secondaries, periodically the readers have to wait for replication batches to be applied.

The best way would be to upgrade to the latest MongoDB 4.0.x, which introduced non-blocking secondary reads.
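
If you want to confirm that batch application is what the reads are waiting on, serverStatus exposes counters under metrics.repl.apply.batches. A small diagnostic sketch (the secondary's host is a placeholder):

# Sketch: sample replication batch application on a secondary via
# serverStatus (metrics.repl.apply.batches). Host is a placeholder.
import time

from pymongo import MongoClient

client = MongoClient("mongodb://secondary-host:27017/")

def batch_stats():
    status = client.admin.command("serverStatus")
    batches = status["metrics"]["repl"]["apply"]["batches"]
    return batches["num"], batches["totalMillis"]

n0, ms0 = batch_stats()
time.sleep(10)
n1, ms1 = batch_stats()
print(f"{n1 - n0} batches applied in 10s, spending {ms1 - ms0} ms")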

Thanks,
Dima

Comment by Pratiksha Aggarwal [ 18/Nov/19 ]

@Dmitry Agranat, the snippet is added now for our test environment, where queries take up to 2 seconds; on production they take up to 7 seconds. We have not uploaded a production snippet because we cannot share it, but you can look at the test environment logs, which are equivalent to prod.

Comment by Dmitry Agranat [ 17/Nov/19 ]

Hi pragupta, the file from the diagnostic.data you've uploaded only covers a few hours of November 10th, while the latest log snippet is from November 11th. Also, we'll need a full mongod log (instead of a single snippet) together with an archive of the diagnostic.data directory.

Comment by Pratiksha Aggarwal [ 15/Nov/19 ]

Hi @Dmitry Agranat,

 

The logs shared above are from mongod.log. Please find more logs for 11th November below; we have also shared diagnostic data on the secure portal for the same date.

 

2019-11-11T02:17:55.041+0000 I COMMAND [conn61586] command .countryMapping command: find { find: "countryMapping", filter: { $or: [ { apFirmware.isoCode: "356" }, { isoCode: "356" } ] }, projection: { _id: 0, countryCode: 1 }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:48 cursorExhausted:1 numYields:0 nreturned:1 reslen:148 locks:{ Global: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 1222380 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_query 1223ms

2019-11-11T17:43:19.952+0000 I COMMAND [conn239731] command countryMapping command: find { find: "countryMapping", filter: { $or: [ { apFirmware.isoCode: "840" }, { isoCode: "840" } ] }, projection: { _id: 0, countryCode: 1 }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:101 cursorExhausted:1 numYields:0 nreturned:1 reslen:148 locks:{ Global: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 1055145 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_query 1058ms
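
Note that in both entries nearly the entire latency is lock-acquisition wait (timeAcquiringMicros of 1222380 µs out of 1223 ms total, and 1055145 µs out of 1058 ms), not document scanning. A small sketch that extracts this ratio from a mongod log, fed on stdin (the regexes assume the exact 3.4-style text format shown above):

# Sketch: report the fraction of each slow query's latency spent waiting
# on the Global read lock. Assumes 3.4-style log lines as shown above.
import re
import sys

WAIT = re.compile(r"timeAcquiringMicros: \{ r: (\d+)")
TOTAL = re.compile(r"protocol:\S+ (\d+)ms$")

for line in sys.stdin:
    wait, total = WAIT.search(line), TOTAL.search(line)
    if wait and total:
        wait_ms = int(wait.group(1)) / 1000.0
        total_ms = int(total.group(1))
        print(f"{total_ms} ms total, {wait_ms:.0f} ms lock wait ({wait_ms / total_ms:.0%})")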

 

 

 

Comment by Dmitry Agranat [ 13/Nov/19 ]

Hi pragupta,

I did not find the requested mongod log and an archive of the diagnostic.data directory from the server in question in our secure upload portal. Could you make sure it's uploaded and notify here when it's done?

Thanks,
Dima

Comment by Pratiksha Aggarwal [ 13/Nov/19 ]

Please find the slow query log below; the total number of records in the countryMapping collection is 126. This query is observed daily in the production logs, roughly every 2 minutes.

2019-10-24T05:01:51.395+0000 I COMMAND [conn31403628] command countryMapping command: find { find: "countryMapping", filter: { $or: [ { apFirmware.isoCode: "840" }, { isoCode: "840" } ] }, projection: { _id: 0, countryCode: 1 }, limit: 1, singleBatch: true } planSummary: COLLSCAN keysExamined:0 docsExamined:89 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:1 reslen:166 locks:{ Global: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 6784024 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_query 6808ms

2019-10-24T05:33:17.376+0000 I COMMAND [conn29565004] command countryMapping command: find { find: "countryMapping", filter: { timezoneDetails.apCode: "262" }, projection: { timezoneDetails.$: 1 } } planSummary: IXSCAN { timezoneDetails.apCode: 1.0 } keysExamined:1 docsExamined:1 cursorExhausted:1 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:1 reslen:358 locks:{ Global: { acquireCount: { r: 2 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 2959573 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_query 2995ms
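
As an aside, the first entry (like the 18/Nov ones) runs as a COLLSCAN because a $or query can use indexes only when every clause is indexed. Below is a sketch of the indexes that would allow an IXSCAN (assuming none exist on these fields; URI and database name are placeholders). Note, though, that the second entry already uses IXSCAN and is still slow, so this would not remove the lock wait:

# Sketch: indexes that would let the $or query use IXSCAN instead of
# COLLSCAN. Assumes no such indexes exist; URI/database are placeholders.
from pymongo import ASCENDING, MongoClient

coll = MongoClient("mongodb://host1/?replicaSet=rs0")["mydb"]["countryMapping"]
coll.create_index([("apFirmware.isoCode", ASCENDING)])
coll.create_index([("isoCode", ASCENDING)])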

 

Comment by Dmitry Agranat [ 12/Nov/19 ]

Hi pragupta,

Would you please upload the mongod log and an archive of the diagnostic.data directory (the contents are described here) in your dbpath to this secure upload portal that is only visible to MongoDB employees?

Please also note the time and the time zone of any slow operations you'd like us to investigate.

Thanks,
Dima
