[SERVER-9580] Multiple tailable cursors against the same collection shows high cpu usage on server Created: 03/May/13  Updated: 02/Aug/17  Resolved: 21/May/15

Status: Closed
Project: Core Server
Component/s: Performance
Affects Version/s: 2.4.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Christian Amor Kvalheim Assignee: David Storch
Resolution: Duplicate Votes: 24
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-6405 Wrapper for NamespaceDetails and Name... Closed
Duplicate
duplicates SERVER-18184 Add awaitData support to getMore command Closed
is duplicated by SERVER-10226 Excessive CPU usage for idle tailable... Closed
Related
related to SERVER-14802 improve sleepmillis() resolution unde... Closed
Operating System: ALL
Steps To Reproduce:

install node.js
npm install -g mubsub

Code
---------------------------------------
var mubsub = require('mubsub');

var client = mubsub('mongodb://localhost:27017/test?poolSize=5', {safe: true});
var channel = client.channel('mubsub');

channel.subscribe(console.log);
channel.subscribe(console.log);
channel.subscribe(console.log);
channel.subscribe(console.log);
channel.subscribe(console.log);
channel.subscribe(console.log);
channel.subscribe(console.log);
channel.subscribe(console.log);
channel.subscribe(console.log);


Description

Opening multiple tailable cursors, with or without awaitData, against a capped collection causes sustained high CPU usage (on my machine about 30% CPU). A single cursor uses about 5% CPU.

Confirmed using the node.js driver and the Java driver.
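
For reference, the same behavior reproduces without mubsub by opening several tailable cursors directly. Below is a minimal, untested node.js sketch using the same 1.x-era driver API as the example later in this ticket; the collection name, sizes, and cursor count are illustrative:

var MongoClient = require('mongodb').MongoClient;

MongoClient.connect('mongodb://localhost:27017/test', function(err, db) {
  if (err) throw err;

  db.createCollection('events', {capped: true, size: 1048576}, function(err, collection) {
    if (err) throw err;

    // A tailable cursor needs at least one document before it will wait,
    // so seed the capped collection first.
    collection.insert({seed: true}, {w: 1}, function(err) {
      if (err) throw err;

      // Open several tailable, awaitdata cursors against the same capped
      // collection. Even with no inserts arriving, mongod CPU climbs with
      // each additional cursor.
      for (var i = 0; i < 10; i++) {
        collection.find({}, {tailable: true, awaitdata: true, timeout: false, numberOfRetries: 10000},
          function(err, cursor) {
            cursor.each(function(err, item) {
              if (item) console.log(item);
            });
          }
        );
      }
    });
  });
});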



Comments
Comment by Vladimir Zoubritsky [ 30/Jul/15 ]

Thanks for the update. I can confirm that the CPU usage is back to low again, also when using the stable version of the node driver.

Comment by Ramon Fernandez Marina [ 30/Jul/15 ]

For those watching this ticket, MongoDB 3.1.6 includes SERVER-18184 and SERVER-18841, which should effectively eliminate the high CPU usage on the server when multiple tailable cursors are used. Feel free to try it out and report back on any issues you may find.

Regards,
Ramón.

Comment by Jose Battig [ 22/May/15 ]

Ramon, you are welcome.
It was a fun issue to figure out.

Comment by Ramon Fernandez Marina [ 22/May/15 ]

jsbattig@convey.com, for the record I'm adding below the patch that we ended up testing. It's basically the code from your pull request but adapted to be mergeable with our tree at the time. Thanks for your efforts on this matter.

Regards,
Ramón.

diff --git a/src/mongo/db/catalog/collection.cpp b/src/mongo/db/catalog/collection.cpp
index 88f8cde..b2b278f 100644
--- a/src/mongo/db/catalog/collection.cpp
+++ b/src/mongo/db/catalog/collection.cpp
@@ -98,6 +98,8 @@ namespace mongo {
           _dbce( dbce ),
           _infoCache( this ),
           _indexCatalog( this ),
+          _changeSubscribers(),
+          _eventSubscriberCount(),
           _cursorManager( fullNS ) {
         _magic = 1357924;
         _indexCatalog.init(txn);
@@ -109,6 +111,28 @@ namespace mongo {
     Collection::~Collection() {
         verify( ok() );
         _magic = 0;
+        triggerChangeSubscribersNotification();
+        /* In the following code we will spin waiting for no readers waiting for data to be inserted
+           into the capped collection. A small 2ms wait time will be introduced to let the reader threads
+           catch up and get out of the waiting state after the call to triggerChangeSubscribersNotification() */
+        while(_eventSubscriberCount.load() > 0) {
+            sleepmillis(2);
+        }
+    }
+
+    void Collection::triggerChangeSubscribersNotification(){
+        _changeSubscribers.notifyAll( _changeSubscribers.now() );
+    }
+
+    NotifyAll::When Collection::waitForDocumentInsertedEvent(  NotifyAll::When when, int timeout ) {
+        _changeSubscribers.timedWaitFor( when, timeout );
+        NotifyAll::When result = _changeSubscribers.now();
+        _eventSubscriberCount.subtractAndFetch(1);
+        return result;
+    }
+
+    void Collection::subscribeToInsertedEvent() {
+        _eventSubscriberCount.addAndFetch(1);
     }
 
     bool Collection::requiresIdIndex() const {
@@ -235,6 +259,11 @@ namespace mongo {
         if ( !status.isOK() )
             return StatusWith<RecordId>( status );
 
+        /* Let's trigger event notifier only for capped collections */
+        if( isCapped() ) {
+            triggerChangeSubscribersNotification();
+        }
+
         getGlobalServiceContext()->getOpObserver()->onInsert(txn, ns(), doc);
 
         return loc;
@@ -473,7 +502,7 @@ namespace mongo {
         // Broadcast the mutation so that query results stay correct.
         _cursorManager.invalidateDocument(txn, loc, INVALIDATION_MUTATION);
 
-        Status status = 
+        Status status =
             _recordStore->updateWithDamages(txn, loc, oldRec.value(), damageSource, damages);
 
         if (status.isOK()) {
diff --git a/src/mongo/db/catalog/collection.h b/src/mongo/db/catalog/collection.h
index 9f537cb..2864b33 100644
--- a/src/mongo/db/catalog/collection.h
+++ b/src/mongo/db/catalog/collection.h
@@ -45,6 +45,7 @@
 #include "mongo/db/storage/record_store.h"
 #include "mongo/db/storage/snapshot.h"
 #include "mongo/platform/cstdint.h"
+#include "mongo/util/concurrency/synchronization.h"
 
 namespace mongo {
 
@@ -276,6 +277,18 @@ namespace mongo {
             return static_cast<int>( dataSize( txn ) / n );
         }
 
+        /* For now triggerChangeSubscribersNotification() used only to awake waiters on capped collection
+           cursor while waiting */
+        void triggerChangeSubscribersNotification();
+
+        /* subscribers to event of new document inserted into capped collection should call this method
+           and wait to be awakened */
+        NotifyAll::When waitForDocumentInsertedEvent( NotifyAll::When when, int timeout );
+        /* subscribeToInsertedEvent() should be called within a readlock. Collection will keep track of
+           count of capped collection readers in this way preventing destruction of the object in case a reader
+           is actively waiting for data while outside of the readlock */
+        void subscribeToInsertedEvent();
+
         uint64_t getIndexSize(OperationContext* opCtx,
                               BSONObjBuilder* details = NULL,
                               int scale = 1);
@@ -313,6 +326,8 @@ namespace mongo {
         DatabaseCatalogEntry* _dbce;
         CollectionInfoCache _infoCache;
         IndexCatalog _indexCatalog;
+        NotifyAll _changeSubscribers;
+        AtomicWord<uint32_t> _eventSubscriberCount;
 
         // this is mutable because read only users of the Collection class
         // use it keep state.  This seems valid as const correctness of Collection
diff --git a/src/mongo/db/instance.cpp b/src/mongo/db/instance.cpp
index 4554866..7e65961 100644
--- a/src/mongo/db/instance.cpp
+++ b/src/mongo/db/instance.cpp
@@ -1,4 +1,4 @@
-// instance.cpp 
+// instance.cpp
 
 /**
 *    Copyright (C) 2008 10gen Inc.
@@ -474,7 +474,7 @@ namespace {
             responseComponent = LogComponent::kCommand;
         }
 
-        bool shouldLog = logger::globalLogDomain()->shouldLog(responseComponent, 
+        bool shouldLog = logger::globalLogDomain()->shouldLog(responseComponent,
                                                               logger::LogSeverity::Debug(1));
 
         if ( op == dbQuery ) {
@@ -842,6 +842,7 @@ namespace {
         bool exhaust = false;
         QueryResult::View msgdata = 0;
         Timestamp last;
+        NotifyAll::When lastWaitTime = 0;
         while( 1 ) {
             bool isCursorAuthorized = false;
             try {
@@ -888,7 +889,7 @@ namespace {
                 ok = false;
                 break;
             }
-            
+
             if (msgdata.view2ptr() == 0) {
                 // this should only happen with QueryOption_AwaitData
                 exhaust = false;
@@ -906,13 +907,30 @@ namespace {
                 pass++;
                 if (kDebugBuild)
                     sleepmillis(20);
-                else
-                    sleepmillis(2);
-                
+
+                else {
+                    Collection* collection = 0;
+                    {
+                        // Let's use a read lock to acquire a pointer to the collection object
+                        const NamespaceString nss(ns);
+                        scoped_ptr<AutoGetCollectionForRead> ctx(new AutoGetCollectionForRead(txn, nss));
+                        collection = ctx->getCollection();
+                        /* TODO: Replace this number when (if ever) these changes are merged into upstream */
+                        uassert(77383, "collection dropped between newGetMore calls", collection);
+                        /* This will ensure our collection was not destroyed until we call waitForDocumentInsertedEvent()
+                           because we are going to be outside of the lock to call waitForDocumentInsertedEvent().
+                           We can't do that inside this block, because otherwise we will be blocking the collection too
+                           long if no new documents are inserted on the capped collection.
+                         */
+                        collection->subscribeToInsertedEvent();
+                    }
+                    lastWaitTime = collection->waitForDocumentInsertedEvent(lastWaitTime, 50);
+                }
+
                 // note: the 1100 is beacuse of the waitForDifferent above
                 // should eventually clean this up a bit
                 curop.setExpectedLatencyMs( 1100 + timer->millis() );
-                
+
                 continue;
             }
             break;
@@ -1237,7 +1255,7 @@ namespace {
 
         log(LogComponent::kControl) << "now exiting" << endl;
 
-        // Execute the graceful shutdown tasks, such as flushing the outstanding journal 
+        // Execute the graceful shutdown tasks, such as flushing the outstanding journal
         // and data files, close sockets, etc.
         try {
             shutdownServer();
@@ -1298,13 +1316,13 @@ namespace {
         boost::lock_guard<boost::mutex> lk(mutex);
         int old = level;
         log() << "diagLogging level=" << newLevel << endl;
-        if( f == 0 ) { 
+        if( f == 0 ) {
             openFile();
         }
         level = newLevel; // must be done AFTER f is set
         return old;
     }
-    
+
     void DiagLog::flush() {
         if ( level ) {
             log() << "flushing diag log" << endl;
@@ -1312,14 +1330,14 @@ namespace {
             f->flush();
         }
     }
-    
+
     void DiagLog::writeop(char *data,int len) {
         if ( level & 1 ) {
             boost::lock_guard<boost::mutex> lk(mutex);
             f->write(data,len);
         }
     }
-    
+
     void DiagLog::readop(char *data, int len) {
         if ( level & 2 ) {
             bool log = (level & 4) == 0;
diff --git a/src/mongo/util/concurrency/synchronization.cpp b/src/mongo/util/concurrency/synchronization.cpp
index d7cf357..c4dd05f 100644
--- a/src/mongo/util/concurrency/synchronization.cpp
+++ b/src/mongo/util/concurrency/synchronization.cpp
@@ -99,6 +99,15 @@ namespace {
         }
     }
 
+    bool NotifyAll::timedWaitFor( When e, int millis ) {
+        boost::unique_lock<boost::mutex> lock( _mutex );
+        ++_nWaiting;
+        while( _lastDone < e ) {
+            if( ! _condition.timed_wait( lock, boost::posix_time::milliseconds( millis ) ) ) break;
+        }
+        return _lastDone >= e;
+    }
+
     void NotifyAll::awaitBeyondNow() { 
         boost::unique_lock<boost::mutex> lock( _mutex );
         ++_nWaiting;
@@ -108,6 +117,16 @@ namespace {
         }
     }
 
+    bool NotifyAll::timedAwaitBeyondNow( int millis ) {
+        boost::unique_lock<boost::mutex> lock( _mutex );
+        ++_nWaiting;
+        When e = ++_lastReturned;
+        while( _lastDone <= e ) {
+            if( ! _condition.timed_wait( lock, boost::posix_time::milliseconds( millis ) ) ) break;
+        }
+        return _lastDone > e;
+    }
+
     void NotifyAll::notifyAll(When e) {
         boost::unique_lock<boost::mutex> lock( _mutex );
         _lastDone = e;
diff --git a/src/mongo/util/concurrency/synchronization.h b/src/mongo/util/concurrency/synchronization.h
index 39ddc4e..60b0b5d 100644
--- a/src/mongo/util/concurrency/synchronization.h
+++ b/src/mongo/util/concurrency/synchronization.h
@@ -99,10 +99,13 @@ namespace mongo {
             call are ignored -- we are looking for a fresh event.
         */
         void waitFor(When);
+        bool timedWaitFor( When, int millis );
 
         /** a bit faster than waitFor( now() ) */
         void awaitBeyondNow();
 
+        bool timedAwaitBeyondNow( int millis );
+
         /** may be called multiple times. notifies all waiters */
         void notifyAll(When);
 

Comment by David Storch [ 21/May/15 ]

Closing as a duplicate of SERVER-18184.

Comment by David Storch [ 18/May/15 ]

Sounds good, glad to hear this should work for you. Please watch SERVER-18184 for progress updates.

Best,
Dave

Comment by Jose Battig [ 18/May/15 ]

David, if the new approach allows tailing a capped collection without the busy-wait strategy that is currently implemented, then I'm good with the solution.
We have good test cases to verify the throughput of a single mongod operating in this mode, so we will eagerly test it once the official 3.2 release is out.
For now we will keep our custom build with our thread-blocking strategy using "event".

Comment by David Storch [ 18/May/15 ]

Hi jsbattig@convey.com, if you are satisfied with the resolution of this ticket via the currently-in-development awaitData support for the getMore command, then yes please close the Pull Request. We do plan to resolve this ticket once the awaitData work has been pushed to the master branch.

In order to take advantage of the find/getMore commands you will have to upgrade both your C driver and mongod/mongos deployment. As far as I know, officially supported drivers plan to add support for the find/getMore commands in time for the 3.2 release.

Comment by Jose Battig [ 18/May/15 ]

Dave, thanks for the update.

We currently use the official C-Driver.
This is how we use it:

_this->cursor = mongoc_collection_find( _this->collection, ( mongoc_query_flags_t )( MONGOC_QUERY_TAILABLE_CURSOR | MONGOC_QUERY_AWAIT_DATA ), 0, 0, 0, query, NULL, NULL );

From a brief check of the underlying C driver, I think it issues getMore inside the function _mongoc_cursor_next().

Also, on another note: should I then close the Pull Request that originated the code review in this ticket?

Comment by David Storch [ 18/May/15 ]

Hi all,

A quick update on this ticket:

We are currently developing a new path for answering find and getMore operations under SERVER-15176. In addition to accepting the existing OP_QUERY and OP_GET_MORE wire protocol messages to run queries (as is done by 3.0.x and all prior versions), 3.2 servers will accept commands called find and getMore for performing these operations.

We are making a few improvements as part of the find/getMore command path. One of them is that a find command with the awaitData option set for tailing a capped collection will block on a condition variable and be notified of new insertions, rather than busy waiting. This should resolve the high CPU usage caused by several busy-waiting threads. This awaitData work is being tracked by SERVER-18184, a subtask of SERVER-15176, and is currently scheduled to be resolved as part of the 3.1.4 development release.

Note that in order to take advantage of the fix, you will have to be running 3.2 as well as using a client driver that supports the find/getMore commands.

Best,
Dave
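
For illustration, here is a rough shell sketch of what the command-based path looks like; the command shapes follow the find/getMore command documentation, and the collection name "events" is illustrative:

// The initial find command returns a cursor id; with tailable and awaitData
// set, each subsequent getMore can block server-side (bounded by maxTimeMS)
// instead of busy waiting when no new documents are available.
var res = db.runCommand({find: "events", filter: {}, tailable: true, awaitData: true});
res.cursor.firstBatch.forEach(printjson);

var cursorId = res.cursor.id;
while (true) {
    var more = db.runCommand({getMore: cursorId, collection: "events", maxTimeMS: 1000});
    more.cursor.nextBatch.forEach(printjson);
}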

Comment by Jose Battig [ 18/May/15 ]

Any news?

Comment by Jose Battig [ 17/Apr/15 ]

Ramon, thanks for the update!

Comment by Ramon Fernandez Marina [ 17/Apr/15 ]

Here's a quick update jsbattig@convey.com: the results of testing look good, and consensus is that the approach used in your code makes a lot of sense. However, since your patch contains no tests for this new functionality we have to be very careful and make sure there are no negative side effects.

The good news is not only that we also want this functionality implemented, but that the server codebase is in a much better state to incorporate it now than when the ticket was originally created. We're discussing this ticket internally and we'll post updates here.

Cheers,
Ramón.

Comment by Jose Battig [ 15/Apr/15 ]

Ramon, thanks for resuscitating this so quickly.
It has been so long since I started that the branch where I did it might not be in the best of shapes; besides, I ended up reverting to merges rather than rebases a while ago because of issues I had trying to keep the branch in sync.
I think the best way is to apply the diff with the current state.
Feel free to reach me with questions. You can also reach me on Skype using the jsbattig username if you need.

Comment by Ramon Fernandez Marina [ 15/Apr/15 ]

Thanks for taking the time to update the pull request jsbattig@convey.com. I'm having trouble applying it (git am -s), probably because of the intermediate commits. I'll see if I can just apply the last content and run it through our internal testing.

Comment by Jose Battig [ 15/Apr/15 ]

I've updated Pull Request again. Should be mergeable. What's the plan with this? Is it going to be fixed?

Comment by Jose Battig [ 07/Nov/14 ]

Hey S S, can you try the changes in the pull request https://github.com/mongodb/mongo/pull/622 ?
They add some protection code to avoid a situation where the capped collection is being dropped while there are active tailing readers.
The code is relatively rough, but I think it will be worthwhile for MongoDB kernel experts to review how they want this synchronization implemented.

Comment by Jose Sebastian Battig [ 22/Sep/14 ]

S S, I haven't seen the issue, but by the same token, our use case for capped collections and tailable cursors very carefully avoids these kinds of situations.
The application that makes extensive use of this mechanism is designed in such a way that capped collections are cleaned up only when there are assurances that they are no longer used and no readers are tailing them.

Having said that, I don't rule out that this case is possible. If you reviewed the code, the nature of the change creates a tighter link between the tailing session and the capped collection's internal object to allow for event-based awakening. I may have missed some code to protect against the case of the capped collection object being destroyed, given my limited understanding of MongoDB's internal dynamics.
Hopefully someone from the MongoDB kernel development team takes a serious look at my pull request and finds the necessary adjustments to make this patch solid.

Sebastian

Comment by S S [ 21/Sep/14 ]

Hi Jose and Vladimir,
I tried Vladimir's patch file on Ubuntu 12.04, applied to the 2.6.4 source. It worked well, except for one case where we found a bug: when dropping a capped collection while another process is using a tailable cursor on it, the mongod process crashes. Traceback from the log:

2014-09-21T14:08:04.209+0300 [conn8469] CMD: drop mydb.mycoll
2014-09-21T14:08:04.294+0300 [conn8469] SEVERE: Got signal: 6 (Aborted).
Backtrace: 0xea4e43 0xea4818 0x7ff3586514a0 0x7ff358651425 0x7ff358654b8b 0x7ff35864a0ee 0x7ff35864a192 0x77dc2e 0x89017e 0x89934c 0x89947d 0x8a0c99 0x975399 0x96fb6f 0x970b7b 0x9716fd 0xb9259c 0xa67e36 0x7b4352 0xe66e5d
/usr/local/bin/mongod(_ZN5mongo15printStackTraceERSo+0x23) [0xea4e43]
/usr/local/bin/mongod() [0xea4818]
/lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7ff3586514a0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35) [0x7ff358651425]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x17b) [0x7ff358654b8b]
/lib/x86_64-linux-gnu/libc.so.6(+0x2f0ee) [0x7ff35864a0ee]
/lib/x86_64-linux-gnu/libc.so.6(+0x2f192) [0x7ff35864a192]
/usr/local/bin/mongod() [0x77dc2e]
/usr/local/bin/mongod(_ZN5mongo10CollectionD1Ev+0x11e) [0x89017e]
/usr/local/bin/mongod(_ZN5mongo8Database28_clearCollectionCache_inlockERKNS_10StringDataE+0x14c) [0x89934c]
/usr/local/bin/mongod(_ZN5mongo8Database21_clearCollectionCacheERKNS_10StringDataE+0x3d) [0x89947d]
/usr/local/bin/mongod(_ZN5mongo8Database14dropCollectionERKNS_10StringDataE+0x599) [0x8a0c99]
/usr/local/bin/mongod(_ZN5mongo7CmdDrop3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x359) [0x975399]
/usr/local/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x2f) [0x96fb6f]
/usr/local/bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xd4b) [0x970b7b]
/usr/local/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x23d) [0x9716fd]
/usr/local/bin/mongod(_ZN5mongo11newRunQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x89c) [0xb9259c]
/usr/local/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0xb36) [0xa67e36]
/usr/local/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x92) [0x7b4352]
/usr/local/bin/mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x4cd) [0xe66e5d]

Have you encountered a problem like this? Could this mean there might be other problems in this patch when running other commands on capped collections while cursors on them are in use?
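
For anyone trying to reproduce this, the scenario reduces to a two-shell sketch like the following (run against the patched build, with mydb.mycoll a capped collection; untested):

// shell 1: tail the capped collection and block waiting for new documents
var c = db.mycoll.find()
                 .addOption(DBQuery.Option.tailable)
                 .addOption(DBQuery.Option.awaitData);
while (true) {
    if (c.hasNext()) printjson(c.next());
}

// shell 2: drop the collection while shell 1 is still waiting; on the
// patched build this is where mongod aborted (note ~Collection in the trace)
db.mycoll.drop();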

Comment by Githook User [ 12/Aug/14 ]

Author:

{u'username': u'jsbattig', u'name': u'Jose Sebastian Battig', u'email': u'jsbattig@convey.com'}

Message: SERVER-14802 Fixed problem with sleepmillis() on Windows due to the default
timer resolution being set high (usually above 15ms). All sleepmillis() calls with
values less than the timer resolution will sleep AT LEAST the timer resolution
value, making Mongo really slow in certain use cases.

Related to SERVER-9580

Closes #618

Signed-off-by: Benety Goh <benety@mongodb.com>
Branch: master
https://github.com/mongodb/mongo/commit/e2a58d5fd4e3f0d64bb5ba10de87ca48365617fc

Comment by Jose Sebastian Battig [ 07/Aug/14 ]

Vladimir, glad to hear the patches I submitted helped you out!

Comment by Jose Sebastian Battig [ 07/Aug/14 ]

S S, I'm currently working with the Mongo folks to get the fix incorporated into the main codebase. It's a slow process; we are only now working on the Sleep() function call Windows kernel timer-resolution issue (it's related to this, but not the final fix).
They flagged my other pulls to work on them. I think the more complex pull, which implements event-based awakening of tailable cursors, will take a while.
The Mongo folks are extremely detail-oriented with third-party pull requests, so I expect some good grilling of the solution I offered.

Comment by Vladimir Zoubritsky [ 25/Jun/14 ]

Our use case also involves multiple tail requests on capped collections. Applying the patches in https://github.com/mongodb/mongo/pull/622/files and https://jira.mongodb.org/browse/SERVER-2114 has helped our case, adding very little overhead when running MongoDB on development machines and reducing CPU usage from ~80% to ~5% in a production VM. We have packaged the patches for Ubuntu at https://build.opensuse.org/package/show/home:dottedmag:mongodb/mongodb.

Comment by S S [ 24/Apr/14 ]

Thank you, Jose.
I strongly prefer using official releases... Do you think it should be 100% safe to use your changes? Is it simple to pull it and build it myself?

Comment by Jose Sebastian Battig [ 23/Apr/14 ]

S S, did you try pulling this code: https://github.com/mongodb/mongo/pull/622/files and building mongo?
In our case, we use MongoDB as a service-bus platform (same as your use case).
With the changes in the mentioned pull request, we increased performance from around ~350 requests per second to ~1700 requests per second in a single-threaded consumer/producer test case running on the same virtual machine.

Comment by S S [ 23/Apr/14 ]

I'm getting hit by this issue as well.
I'm running a complex system that relies on capped collections and tailable cursors for lightweight message passing between dozens of processes. Each new client contributes 3-4% CPU, even when there are no messages at all. This imposes a big limitation on my system, to the point that I'm considering migrating my code to a different message-passing infrastructure.
I was under the impression that this issue had been fixed in V2.5 (I can no longer find that "resolved in V2.5" comment in Jira now...), but it turns out this is not the case.
I would greatly appreciate it if this issue could be resolved soon.
Thanks

Comment by Jose Sebastian Battig [ 06/Feb/14 ]

Matt Kangas called my attention to this issue. I think it may be addressed by this pull request I opened a few days ago: https://github.com/mongodb/mongo/pull/622

Comment by b c [ 26/Jun/13 ]

Same issue here. Please address this or provide information on when it will be addressed in the roadmap.

Comment by Rick Reynolds [ 25/Jun/13 ]

I've encountered the same thing without mubsub, with just simple, straightforward MongoDB usage. Just sitting idle, with no new records being added to the collection, it takes up 30% CPU.

If this behavior is by design and won't be fixed, I'd really like to know so I can find something else.

Thanks!

Here's the basic code with irrelevant stuff removed.

// Connect to the db
MongoClient.connect("mongodb://127.0.0.1:27017/queueDB", {},
 
  function (err, db)
  {
    if(err)
    {
      console.log('DB Error!');
      process.exit();
    }
 
    m_Database = db;
 
    db.createCollection(queueName, {capped:true, size:209715200},
 
      function(err, collection)
      {
        if(err)
          errorFn(err)
        else
        {
          m_Collection = collection;
          Seed(Listen);
        }
      }
    );
  }
);
 
function Seed(completionFn)
{
  // If the collection is empty we need to seed it so that the listener will actually wait.
  // This could be a bug in the mongo driver or mongo itself. If that gets fixed in the future
  // we can remove this.
  m_Collection.findOne(
 
    function(err, result)
    {
      if(result != null)
      {
        completionFn();
        return;
      }
 
      m_Collection.insert({processed: true}, {w: 1},
 
        function(err)
        {
          completionFn();
        }
      );
    }
  );
}
 
 
function Listen()
{
  m_Collection.find({processed: false}, {tailable: true, awaitdata: true, timeout: false, numberOfRetries: 10000},
 
    function(err, cursor)
    {
      if(err)
      {
        m_TaskFn(null, function(){}, err);
        m_TaskFn = null;
        return;
      }
 
      cursor.sort({$natural: -1});
 
      cursor.each(
 
        function (err, item)
        {
          if(item)
          {
            m_TaskFn(item.task,
 
              function()
              {
                item.processed = true;
                m_Collection.save(item, function(){});
              },
 
              null
            );
          }
          else
          {
            // Restart if there's still a real taskFn (indicating this wasn't closed)
            if(m_TaskFn != Stub_Function)
            {
              Log('Queue', 'Null Item. Restarting Listener');
              process.nextTick(Listen);
            }
          }
        }
      );
    }
  );
}
