[SERVER-24833] Certain queries are 2000x slower with WiredTiger than with MMAPv1 Created: 29/Jun/16  Updated: 04/Oct/16  Resolved: 03/Oct/16

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.2.7
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Edgar Costa Assignee: Kelsey Schubert
Resolution: Cannot Reproduce Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File ticket.tar    
Issue Links:
Duplicate
Related
    related to WT-2670 Inefficient I/O when read full DB (po... Closed
Operating System: ALL
Steps To Reproduce:
  1. Start two mongod servers, one with MMAPv1 and the other with WiredTiger.
  2. Download the collection dump.
  3. Restore that collection and perform the query on both servers (see the command sketch below).
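
A minimal command sketch of these steps, assuming the dump was extracted into a local dump/ directory; the ports, dbpath locations, and log paths below are placeholders:

  # start one mongod per storage engine
  mongod --dbpath /data/mmapv1 --storageEngine mmapv1     --port 27017 --fork --logpath /data/mmapv1/mongod.log
  mongod --dbpath /data/wt     --storageEngine wiredTiger --port 27018 --fork --logpath /data/wt/mongod.log

  # restore the same dump into both servers, then run the query against each
  mongorestore --port 27017 dump/
  mongorestore --port 27018 dump/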
Participants:

 Description   

For certain queries where there isn't an index available, WiredTiger takes far too long compared with MMAPv1.
In the example below the difference is 6 minutes vs. 0.2 seconds.

Example:

db.forms.find({'dimension': 2,'degree':2 }).count()
 
Note that no document contains the key "degree".
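
To compare the chosen plan and the amount of work done on both servers, the count can be inspected with explain; a minimal sketch using standard shell syntax, nothing specific to this dataset:

  // run on both the WiredTiger and MMAPv1 servers and compare
  // totalKeysExamined / totalDocsExamined and executionTimeMillis
  db.forms.explain("executionStats").count({ dimension: 2, degree: 2 })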

wiredTiger:
config:

  engine: 'wiredTiger'
  wiredTiger:
    collectionConfig: 
      blockCompressor: none

We have also tried the snappy and zlib compressors.
The filesystem is ext4; we have also tried xfs, with no significant change in the gap relative to MMAPv1.
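
For reference, a complete storage section along these lines would look roughly as follows (a sketch only; the dbPath is a placeholder and snappy is just one of the compressors we tried):

  storage:
    dbPath: /data/wt              # placeholder path
    engine: wiredTiger
    wiredTiger:
      collectionConfig:
        blockCompressor: snappy   # also tried: none, zlib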

log entry:

2016-06-28T05:09:22.454+0000 I COMMAND  [conn4] command hmfs.forms command: count { count: "forms", query: { dimension: 2, degree: 2 } } planSummary: IXSCAN { dimension: -1, field_label: 1, level_norm: -1 } fromMultiPlanner:1 keyUpdates:0 writeConflicts:0 numYields:13831 reslen:62 locks:{ Global: { acquireCount: { r: 27664 } }, Database: { acquireCount: { r: 13832 } }, Collection: { acquireCount: { r: 13832 } } } protocol:op_query 369464ms

MMAPv1:
config:

engine: 'mmapv1'

log entry:

2016-06-28T05:10:12.412+0000 I COMMAND  [conn15] command hmfs.forms command: count { count: "forms", query: { dimension: 2, degree: 2 } } planSummary: IXSCAN { dimension: -1, field_label: 1, level_norm: -1 } keyUpdates:0 writeConflicts:0 numYields:447 reslen:62 locks:{ Global: { acquireCount: { r: 896 } }, MMAPV1Journal: { acquireCount: { r: 448 } }, Database: { acquireCount: { r: 448 } }, Collection: { acquireCount: { R: 448 } } } protocol:op_query 181ms 

These two examples were run on the same machine, on different, initially empty databases whose only difference is the storage engine. In both cases, the collection was restored from the same dump file.

Perhaps we are doing something wrong; if so, please enlighten us, as we are planning to switch all our servers to MMAPv1.



 Comments   
Comment by Kelsey Schubert [ 03/Oct/16 ]

Hi edgarcosta,

We have not been able to reproduce this issue. I see in your logs that once the WiredTiger cache is hot, the query takes ~600ms to complete. As you know, MMAPv1 relies on the filesystem cache, whereas WiredTiger uses both its own internal cache and the filesystem cache. This difference may explain the results you are observing.
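
If it helps, the split between the two caches can be inspected from the shell; a minimal sketch using serverStatus (the statistic names below are the ones WiredTiger reports, nothing specific to your deployment):

  // compare WiredTiger's internal cache usage against its configured maximum
  var cache = db.serverStatus().wiredTiger.cache;
  print(cache["bytes currently in the cache"] + " of " + cache["maximum bytes configured"] + " bytes in use");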

Kind regards,
Thomas

Comment by Edgar Costa [ 01/Jul/16 ]

Thank you.

Also, I ran these experiments on an "n1-highmem-4" machine from Google Compute Engine, i.e., 4 CPUs and 26 GB of memory.
Further, MongoDB was installed directly from the MongoDB repository:
"http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.2"

mongodb-org/trusty,now 3.2.7 amd64 [installed]
mongodb-org-mongos/trusty,now 3.2.7 amd64 [installed,automatic]
mongodb-org-server/trusty,now 3.2.7 amd64 [installed,automatic]
mongodb-org-shell/trusty,now 3.2.7 amd64 [installed,automatic]
mongodb-org-tools/trusty,now 3.2.7 amd64 [installed,automatic]
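
The repository was added following the standard installation instructions; roughly (a sketch, assuming the stock 3.2 source line and omitting the GPG key import for brevity):

  echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.2 multiverse" | \
    sudo tee /etc/apt/sources.list.d/mongodb-org-3.2.list
  sudo apt-get update
  sudo apt-get install -y mongodb-org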

I have attached ticket.tar where:

$ tar -tf ticket.tar 
ticket/
ticket/diagnostic.data.tar.bz2
ticket/mongod-mmap.conf
ticket/log-mmap
ticket/log-wt
ticket/mongod-wt.conf
 
$ tar -tf diagnostic.data.tar.bz2 
mongodb-mmap/diagnostic.data/
mongodb-mmap/diagnostic.data/metrics.2016-06-28T03-03-13Z-00000
mongodb-mmap/diagnostic.data/metrics.interim
mongodb-wt/diagnostic.data/
mongodb-wt/diagnostic.data/metrics.interim
mongodb-wt/diagnostic.data/metrics.2016-06-28T05-03-02Z-00000
mongodb-wt/diagnostic.data/metrics.2016-06-28T03-03-04Z-00000

Comment by Kelsey Schubert [ 29/Jun/16 ]

Hi edgarcosta,

Thank you for opening this ticket. I downloaded the dataset and ran the same query twice on WiredTiger with the default configuration (the first run with the cache empty, the second with the cache hot). With the cache empty it took 14 seconds; with the cache hot it took 0.2 seconds to complete. The performance is on the same order of magnitude as MMAPv1. Please see the logs of the queries below, followed by a quick timing sketch.

WiredTiger:

2016-06-29T01:38:57.747-0400 I COMMAND  [conn3] command dump.forms command: count { count: "forms", query: { dimension: 2.0, degree: 2.0 }, fields: {} } planSummary: IXSCAN { dimension: -1, field_label: 1, level_norm: -1 } fromMultiPlanner:1 keyUpdates:0 writeConflicts:0 numYields:1700 reslen:47 locks:{ Global: { acquireCount: { r: 3402 } }, Database: { acquireCount: { r: 1701 } }, Collection: { acquireCount: { r: 1701 } } } protocol:op_command 14476ms
2016-06-29T01:39:01.427-0400 I COMMAND  [conn3] command dump.forms command: count { count: "forms", query: { dimension: 2.0, degree: 2.0 }, fields: {} } planSummary: IXSCAN { dimension: -1, field_label: 1, level_norm: -1 } keyUpdates:0 writeConflicts:0 numYields:447 reslen:47 locks:{ Global: { acquireCount: { r: 896 } }, Database: { acquireCount: { r: 448 } }, Collection: { acquireCount: { r: 448 } } } protocol:op_command 216ms

MMAPv1:

2016-06-30T15:35:24.301-0400 I COMMAND  [conn8] command dump.forms command: count { count: "forms", query: { dimension: 2.0, degree: 2.0 }, fields: {} } planSummary: IXSCAN { dimension: -1, field_label: 1, level_norm: -1 } fromMultiPlanner:1 keyUpdates:0 writeConflicts:0 numYields:3015 reslen:47 locks:{ Global: { acquireCount: { r: 6032 } }, MMAPV1Journal: { acquireCount: { r: 3016 } }, Database: { acquireCount: { r: 3016 } }, Collection: { acquireCount: { R: 3016 } } } protocol:op_command 18768ms
2016-06-30T15:35:27.790-0400 I COMMAND  [conn8] command dump.forms command: count { count: "forms", query: { dimension: 2.0, degree: 2.0 }, fields: {} } planSummary: IXSCAN { dimension: -1, field_label: 1, level_norm: -1 } keyUpdates:0 writeConflicts:0 numYields:447 reslen:47 locks:{ Global: { acquireCount: { r: 896 } }, MMAPV1Journal: { acquireCount: { r: 448 } }, Database: { acquireCount: { r: 448 } }, Collection: { acquireCount: { R: 448 } } } protocol:op_command 161ms
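
The cold-versus-hot behaviour above can be checked quickly from the shell; a minimal timing sketch (the first iteration pays the cost of populating the cache):

  // time the count twice: first run cold, second run with the cache warm
  for (var i = 0; i < 2; i++) {
      var start = new Date();
      db.forms.count({ dimension: 2, degree: 2 });
      print("run " + (i + 1) + ": " + (new Date() - start) + "ms");
  }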

To continue to investigate this issue, would you please archive (tar or zip) the $dbpath/diagnostic.data directory and attach it to this ticket for both the WiredTiger and MMAPv1 nodes?
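
For example (the dbpath values below are placeholders for each node's actual storage location):

  tar -czf diagnostic-data-wt.tar.gz     /path/to/wiredtiger-dbpath/diagnostic.data
  tar -czf diagnostic-data-mmapv1.tar.gz /path/to/mmapv1-dbpath/diagnostic.data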

Thank you,
Thomas
