SERVER-17261 (Core Server)

mongod rc8/rc9-pre WT OOM

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-rc8
    • Fix Version/s: 3.0.0-rc9
    • Component/s: Storage, WiredTiger
    • Labels:
    • Backwards Compatibility: Fully Compatible

      Description

      Unlike prior RCs, a WiredTiger-enabled mongod (rc7, rc8, the 2/12 nightly 79492d9cc1885d74b31b5fe24194dbc227096d6e, and rc9-pre ea5f871b550c1c3a8a5f0cd749fb47570557a067) in a standalone topology seems to grow its heap without bound until the Linux kernel kills the process. I assume the growth is in the heap because dirty .data pages (like those in the WT cache) would simply be paged out (written to block I/O) by the kernel if an acute memory deficit occurred.
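To confirm that a disappearing mongod was in fact killed by the kernel OOM killer, the kernel log can be searched for the kill record. The following is a minimal sketch, not part of the original report: `oom_kills_for` is a hypothetical helper name, and the exact wording of the kernel message varies between kernel versions, so the pattern is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print kernel-log lines that record an OOM kill of a
# process with the given name. Reads log text from stdin, so it works with
# live `dmesg` output or with a saved log file.
oom_kills_for() {
    local name="$1"
    # Assumption: the kill record contains "Killed process <pid> (<name>)".
    # Match case-insensitively; return success even when nothing matches.
    grep -iE "killed process [0-9]+ \(${name}\)" || true
}

# Usage (assumes permission to read the kernel ring buffer):
#   dmesg | oom_kills_for mongod
```

An empty result does not rule out an OOM kill, since the kernel ring buffer may have wrapped during a long run.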

      We found this in a sysbench-based longevity (stress) test after about 15 hours. To get started, sysbench loads data (320 million docs) with 8 threads, then goes into a 64-thread execute phase with a mix of read and write operations. The OOM occurred during the 64-thread execute phase.
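As a quick sanity check, the 320 million figure follows from the settings in the modified config.bash shown in the reproduction steps (16 collections, 20,000,000 documents each), and the configured run time of 10080 minutes is seven days. A small sketch of that arithmetic, using only values from that config:

```shell
#!/usr/bin/env bash
# Values copied from the modified config.bash in the reproduction steps.
NUM_COLLECTIONS=16
NUM_DOCUMENTS_PER_COLLECTION=20000000
RUN_TIME_MINUTES=10080

# Total documents loaded across all collections.
total_docs=$((NUM_COLLECTIONS * NUM_DOCUMENTS_PER_COLLECTION))
# Configured run time converted from minutes to whole days.
run_time_days=$((RUN_TIME_MINUTES / 60 / 24))

echo "total documents loaded: ${total_docs}"
echo "configured run time: ${run_time_days} days"
```

This prints 320000000 documents and 7 days, so the OOM at about 15 hours came well before the configured end of the run.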

      We did not see any OOM with a seven-day YCSB test. YCSB runs with 8 threads.

      We have seen this OOM when running against SSD block storage and against rotating magnetic hard disks.

      We have seen the OOM a few times now in rc7 and rc8, only when running the sysbench 64-thread execute workload.

      Reproduction steps:

      A. procure a multi-socket machine with 12 cores, like a C3 8XL in EC2

      B. start with a clean database and a standalone single node of rc8 mongod configured for wiredTiger

      rm -rf /data/db/* ; numactl --interleave=all ./mongod --dbpath /data/db --logpath mongodb-sysbench.log --storageEngine wiredTiger --fork
      

      C. check out the sysbench benchmark and apply the following changes to config.bash and the Java sources:

      git clone https://github.com/tmcallaghan/sysbench-mongodb.git
      cd sysbench-mongodb
      git checkout 7c8e12916fa1c7a58ff6b36c6ba4bfc28453104c
      

      diff --git a/config.bash b/config.bash
      index aaa346d..abf5fcb 100644
      --- a/config.bash
      +++ b/config.bash
      @@ -39,7 +39,7 @@ export NUM_COLLECTIONS=16
       
       # number of documents to maintain per collection
       #   valid values : integer > 0
      -export NUM_DOCUMENTS_PER_COLLECTION=10000000
      +export NUM_DOCUMENTS_PER_COLLECTION=20000000
       
       # total number of documents to insert per "batch"
       #   valid values : integer > 0
      @@ -55,7 +55,8 @@ export NUM_WRITER_THREADS=64
       
       # run the benchmark for this many minutes
       #   valid values : intever > 0
      -export RUN_TIME_MINUTES=10
      +#export RUN_TIME_MINUTES=10
      +export RUN_TIME_MINUTES=10080
       export RUN_TIME_SECONDS=$[RUN_TIME_MINUTES*60]
       
       # write concern for the benchmark client
      @@ -106,12 +107,12 @@ export SYSBENCH_DISTINCT_RANGES=1
       
       # number of indexed updates per sysbench "transaction"
       #   valid values : integer >= 0
      -export SYSBENCH_INDEX_UPDATES=1
      +export SYSBENCH_INDEX_UPDATES=3
       
       # number of non-indexed updates per sysbench "transaction"
       #   valid values : integer >= 0
      -export SYSBENCH_NON_INDEX_UPDATES=1
      +export SYSBENCH_NON_INDEX_UPDATES=3
       
       # number of delete/insert operations per sysbench "transaction"
       #   valid values : integer >= 0
      -export SYSBENCH_INSERTS=1
      +export SYSBENCH_INSERTS=2
      diff --git a/src/jmongosysbenchexecute.java b/src/jmongosysbenchexecute.java
      index bf35445..fa82032 100644
      --- a/src/jmongosysbenchexecute.java
      +++ b/src/jmongosysbenchexecute.java
      @@ -164,8 +164,7 @@ public class jmongosysbenchexecute {
       
               MongoClientOptions clientOptions = new MongoClientOptions.Builder().connectionsPerHost(2048).socketTimeout(60000).writeConcern(myWC).build();
               ServerAddress srvrAdd = new ServerAddress(serverName,serverPort);
      -        MongoCredential credential = MongoCredential.createMongoCRCredential(userName, dbName, passWord.toCharArray());
      -        MongoClient m = new MongoClient(srvrAdd, Arrays.asList(credential));
      +        MongoClient m = new MongoClient(srvrAdd);
       
               logMe("mongoOptions | " + m.getMongoOptions().toString());
               logMe("mongoWriteConcern | " + m.getWriteConcern().toString());
      diff --git a/src/jmongosysbenchload.java b/src/jmongosysbenchload.java
      index 420039e..cc8a4f1 100644
      --- a/src/jmongosysbenchload.java
      +++ b/src/jmongosysbenchload.java
      @@ -116,8 +116,7 @@ public class jmongosysbenchload {
       
               MongoClientOptions clientOptions = new MongoClientOptions.Builder().connectionsPerHost(2048).socketTimeout(60000).writeConcern(myWC).build();
               ServerAddress srvrAdd = new ServerAddress(serverName,serverPort);
      -        MongoCredential credential = MongoCredential.createMongoCRCredential(userName, dbName, passWord.toCharArray());
      -        MongoClient m = new MongoClient(srvrAdd, Arrays.asList(credential));
      +        MongoClient m = new MongoClient(srvrAdd);
       
               logMe("mongoOptions | " + m.getMongoOptions().toString());
               logMe("mongoWriteConcern | " + m.getWriteConcern().toString());
      

      D. download the 2.12.4 Java driver for MongoDB

      curl -O http://central.maven.org/maven2/org/mongodb/mongo-java-driver/2.12.4/mongo-java-driver-2.12.4.jar
      

      E. run the workload

      CLASSPATH=`pwd`/mongo-java-driver-2.12.4.jar numactl --interleave=all ./run.simple.bash
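While the workload runs, it can help to record mongod's resident memory over time so that the growth leading up to the OOM is visible. The following is a minimal, Linux-only sketch and not part of the original reproduction: `rss_kb` and the log file name `mongod-rss.log` are hypothetical, and the loop assumes a single mongod process and an available `pidof`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the resident set size in kB recorded in a
# /proc/<pid>/status-style file (the VmRSS field).
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "$1"
}

# Usage sketch: append a timestamped RSS sample once a minute until the
# process exits. Assumes a single mongod process on the host.
#   pid=$(pidof mongod)
#   while kill -0 "$pid" 2>/dev/null; do
#       echo "$(date +%s) $(rss_kb "/proc/$pid/status")" >> mongod-rss.log
#       sleep 60
#   done
```

Comparing these samples against the configured WiredTiger cache size would indicate whether the growth is inside or outside the cache.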
      

        Attachments

        1. cache.png (154 kB)
        2. heavy-reads.png (233 kB)
        3. iostat.log (14 kB)
        4. small-dropout.png (109 kB)
        5. ss.log (235 kB)
        6. timeseries-ec2-c3_8xl_sysbench_execute_full_cache_oom.html (172 kB)
        7. timeseries-ec2-c3_8xl_sysbench_execute_full_cache_oom.png (814 kB)
