[SERVER-13316] sorts with multiple batches with small collections can be slower in rc2
Created: 22/Mar/14  Updated: 10/Dec/14  Resolved: 25/Mar/14

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.6.0-rc2
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Ben Becker
Assignee: David Storch
Resolution: Won't Fix
Votes: 0
Labels: 26qa, performance
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: disableSplitLimitedSort.patch (text file), op_bench_repro.cpp, scoped_probe.cpp, scoped_probe.h, scoped_timer.h, server-13316-with-patch.svg, server-13316.svg, server13316.js
Issue Links:
Related
is related to SERVER-12438 batch size with an unindexed sort in ... Closed
Operating System: ALL
Steps To Reproduce:
  1. insert n documents
  2. query for a range of (for example) 100 documents with a batch size of 10
  3. retrieve the first batch
  4. remove a document later in the batch
  5. exhaust the cursor
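
The numbered steps above could be scripted in the mongo shell roughly as follows. This is only a sketch and not the attached server13316.js: the helper name reproduce, the field x, the ascending sort on x, and the choice of x = 50 as the removed document are all assumptions made for illustration.

```javascript
// Hypothetical helper (not part of the attachments). `coll` is a mongo shell
// collection handle; note that it is dropped first, so use a scratch collection.
function reproduce(coll, n) {
    coll.drop();

    // 1. insert n documents
    for (var i = 0; i < n; i++) {
        coll.insert({_id: i, x: i});
    }

    // 2. query a range of 100 documents with a sort and a batch size of 10
    //    (the field x and the sort key are assumptions)
    var cursor = coll.find({x: {$gte: 0, $lt: 100}}).sort({x: 1}).batchSize(10);

    // 3. retrieve the first batch
    var returned = 0;
    for (var j = 0; j < 10 && cursor.hasNext(); j++) {
        cursor.next();
        returned++;
    }

    // 4. remove a document that the cursor has not returned yet
    coll.remove({x: 50});

    // 5. exhaust the cursor
    while (cursor.hasNext()) {
        cursor.next();
        returned++;
    }
    return returned;
}
```

In the shell this might be called as reproduce(db.server13316_repro, 10000), where the collection name and document count are arbitrary. Timing the call on each server version would give a rough comparison, although the attached stress test is the measurement the numbers in this ticket come from.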

The attached code is more of a stress test. It can be run against a v2.4 and v2.6 server, and will produce a .json file for each run, which can be imported/aggregated as described below:

First, build with something like:

c++ -I/opt/libmongoclient/include -std=c++11 -o op_bench_repro.o -c op_bench_repro.cpp
c++ -I/opt/libmongoclient/include -std=c++11 -o scoped_probe.o -c scoped_probe.cpp
c++ op_bench_repro.o scoped_probe.o -o op_bench_repro -rdynamic -lmongoclient -lboost_thread-mt -lboost_filesystem \
    -lboost_program_options -lboost_system

Then import the results to mongodb:

echo '[' > tmp.json
find . -name '*.json' ! -name tmp.json | xargs cat >> tmp.json
echo ']' >> tmp.json 
mongoimport --jsonArray --collection regression tmp.json

Then aggregate:

db.regression.aggregate({
    $group: {
        _id:     {ServerVersion: '$ServerVersion', TestName: '$TestName'},
        Seconds: {$sum: '$ClockSeconds'}
    }
});
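
To make explicit what this pipeline computes, here is a plain JavaScript sketch of the same grouping over an in-memory array. The function name sumSecondsByVersionAndTest and the sample documents (including the TestName and ClockSeconds values) are made up for illustration; only the field names come from the pipeline above.

```javascript
// Plain-JavaScript equivalent of the $group stage, for illustration only.
// Groups by (ServerVersion, TestName) and sums ClockSeconds per group.
function sumSecondsByVersionAndTest(docs) {
    var totals = {};
    docs.forEach(function (doc) {
        // Serialize the compound key so it can index a plain object.
        var key = JSON.stringify([doc.ServerVersion, doc.TestName]);
        if (!totals[key]) {
            totals[key] = {
                _id: {ServerVersion: doc.ServerVersion, TestName: doc.TestName},
                Seconds: 0
            };
        }
        totals[key].Seconds += doc.ClockSeconds;
    });
    return Object.keys(totals).map(function (k) { return totals[k]; });
}

// Hypothetical result documents; these values are not measurements.
var sample = [
    {ServerVersion: "2.4.9",     TestName: "example", ClockSeconds: 1.5},
    {ServerVersion: "2.4.9",     TestName: "example", ClockSeconds: 2.5},
    {ServerVersion: "2.6.0-rc2", TestName: "example", ClockSeconds: 3.0}
];
var grouped = sumSecondsByVersionAndTest(sample);
```

Each element of grouped has the same shape as a document returned by the aggregation: an _id holding the server version and test name, and the total Seconds for that pair.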

Description:

There is a performance regression in 2.6.0-rc2 when a sorted query is read in multiple batches and one or more of the matching documents are removed while the cursor is still open.

The attached code takes 41 seconds to run against 2.6.0-rc2, compared with 25 seconds against 2.4.9.

