[SERVER-8602] Allow multiple clients to use a single cursor. Created: 18/Feb/13  Updated: 06/Dec/22

Status: Open
Project: Core Server
Component/s: Querying
Affects Version/s: 2.2.3
Fix Version/s: features we're not sure of

Type: New Feature Priority: Minor - P4
Reporter: Robert Moore Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: All


Assigned Teams: Query Execution
Backwards Compatibility: Fully Compatible
Participants:

 Description   

Currently the mongod server, upon seeing a cursor that is already locked, returns an empty result to the client with a cursor id of zero. This causes the client to consider the cursor exhausted and stop any active iteration.

There are cases where having multiple threads/processes reading from the same cursor is advantageous. The current behavior makes handling those situations more difficult, as the client cannot tell whether the cursor is really exhausted or was just being actively read by another process.

Implementing this functionality would be a component of allowing clients to use a cursor as a shared work queue.



 Comments   
Comment by Andy Schwerin [ 17/Jul/13 ]

The approach we'd prefer, rather than having multiple clients share a cursor, is to make findAndModify more useful for this kind of work queuing. Suppose that findAndModify() could return a tailable cursor. Each client would then establish its own tailable findAndModify cursor, with the find's match pattern including "consumed: 0" and the update setting "consumed: 1". You can do this with polling today, and it is the preferred approach over sharing cursors. Supporting tailing in findAndModify would be a natural extension.
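
The polling variant described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a prescribed pattern: `claim_next_job` and `drain_jobs` are hypothetical helper names, and `collection` is assumed to be any object exposing a `find_one_and_update(filter, update)` method that behaves like PyMongo's `Collection.find_one_and_update` (one atomic match-and-update that returns the matched document, or `None` when nothing matches).

```python
import time


def claim_next_job(collection):
    """Atomically claim one unconsumed job, or return None if none is pending.

    Assumes `collection` behaves like a PyMongo Collection: the match and the
    update happen as one atomic operation, so two workers cannot claim the
    same document.
    """
    return collection.find_one_and_update(
        {"consumed": 0},            # match only jobs nobody has claimed yet
        {"$set": {"consumed": 1}},  # mark the job as claimed
    )


def drain_jobs(collection, handle, idle_sleep=0.5, max_idle_polls=3):
    """Poll for jobs until `max_idle_polls` consecutive polls find nothing.

    Returns the number of jobs passed to `handle`.
    """
    idle_polls = 0
    handled = 0
    while idle_polls < max_idle_polls:
        job = claim_next_job(collection)
        if job is None:
            idle_polls += 1
            time.sleep(idle_sleep)  # back off instead of spinning
            continue
        idle_polls = 0
        handle(job)
        handled += 1
    return handled
```

Each worker process would run its own `drain_jobs` loop against the same collection. Because every claim is a separate atomic update, no cursor is shared between workers; the cost is one round trip per poll.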

Comment by Andy Schwerin [ 17/Jul/13 ]

While there's nothing difficult with this patch, per se, we're not comfortable ensuring this client-synchronizing behavior going forward. I'm going to close the pull request, but leave the associated ticket open as a feature request.

Comment by Robert Moore [ 17/Jul/13 ]

Bump - Any thoughts?

Rob.

Comment by Andy Schwerin [ 10/Jun/13 ]

Ah, yes, alternative number 3, thing I didn't think of. I'll mull this over.

Comment by Robert Moore [ 10/Jun/13 ]

Sure - basically I am trying to turn MongoDB into a work/queue broker.

I am inserting the "job" documents into a capped collection and then have multiple processes pulling the documents out of the capped collection via a single shared tailable cursor (across multiple threads and processes).

I am not worried about bandwidth as much as a simple, fast work distribution model. (And yes, I realize that some work might get lost in the case of a fault, but that is OK in my use case.)

Currently, I am using a cursor with await data set to false, and all the clients are "spinning" on the cursor/getmore requests and dealing with the occasional getmore response with a cursor id of zero.

With this change, await data can be set to true and the clients will not have to spin as much. Again, ideally there would be wait semantics for all of the cursor consumers, but I'm happy to get what I can get.

Comment by Andy Schwerin [ 10/Jun/13 ]

robert.j.moore@allanbank.com, can you describe your goal for this change in a little more detail? I'm somewhat nervous about the polling semantics, and am also concerned about the complexity of a wait semantics implementation.

This proposal suggests that you're trying to improve the utilization of some resource, either available network bandwidth or available client-side computational resources. If it's network bandwidth, setting the "Exhaust" flag in queries (http://docs.mongodb.org/meta-driver/latest/legacy/mongodb-wire-protocol/#op-query) will remove the latency of getmore round trips. If it's client-side CPU utilization, client or driver changes might be more appropriate. If it's some other resource, can you describe it?
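
For reference, the query flags discussed in this thread are individual bits in the OP_QUERY flags field of the legacy wire protocol linked above. The sketch below only illustrates how those bits combine; `QueryFlag` is a hypothetical name used here and is not part of any driver API.

```python
from enum import IntFlag


class QueryFlag(IntFlag):
    """Selected OP_QUERY flag bits from the legacy wire protocol."""

    TAILABLE_CURSOR = 1 << 1  # keep the cursor open after the last document
    AWAIT_DATA = 1 << 5       # with a tailable cursor, block briefly for new data
    EXHAUST = 1 << 6          # server streams all batches without getmore requests


# A tailable cursor that waits for data sets both bits.
tailable_await = QueryFlag.TAILABLE_CURSOR | QueryFlag.AWAIT_DATA
print(int(tailable_await))     # 34
print(int(QueryFlag.EXHAUST))  # 64
```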

Comment by Andy Schwerin [ 08/Jun/13 ]

I think the wait semantics may be a substantial undertaking, and I don't have the time to give it the attention it needs, at present. I'm going to consult with a few other engineers at 10gen, to see if the polling-semantics solution would stand in the way of other future work, and either kangas or I will respond on this ticket.

Comment by Robert Moore [ 02/Jun/13 ]

I agree that the wait semantics (especially if the query/cursor has the "await data" bit set) would be more desirable. I'm willing to do the work to add that ability but would need a few pointers and/or a recommended method for implementing it.

Rob

Comment by Andy Schwerin [ 31/May/13 ]

robert.j.moore@allanbank.com, can you describe your use case a little more? The implementation in your pull request appears to give multiple getmore requests on the same cursor a "try" semantics, useful for polling. This is a reasonable and straightforward extension of the existing implementation, but I wonder if applications would be better served by a "wait" semantics, where the second getmore waits to return until the first getmore unpins the cursor.
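
The difference between the two semantics can be illustrated with a toy in-memory model. This is not MongoDB server code and does not reflect the pull request's implementation: `PinnedCursor`, `getmore_try`, and `getmore_wait` are hypothetical names, and a `threading.Lock` stands in for the server-side cursor pin.

```python
import threading


class PinnedCursor:
    """Toy stand-in for a server-side cursor that can be pinned by one reader."""

    def __init__(self, cursor_id, documents):
        self.cursor_id = cursor_id
        self._documents = list(documents)
        self._pin = threading.Lock()  # held while a getmore is in progress

    def _next_batch(self, batch_size):
        batch = self._documents[:batch_size]
        del self._documents[:batch_size]
        return batch

    def getmore_try(self, batch_size=1):
        """Try semantics: return at once if another getmore holds the pin.

        Returns (batch, pinned_elsewhere) so the caller can tell a busy
        cursor apart from an exhausted one.
        """
        if not self._pin.acquire(blocking=False):
            return [], True
        try:
            return self._next_batch(batch_size), False
        finally:
            self._pin.release()

    def getmore_wait(self, batch_size=1):
        """Wait semantics: block until the earlier getmore unpins the cursor."""
        with self._pin:
            return self._next_batch(batch_size)
```

Under try semantics a caller that loses the race gets an empty batch immediately and has to poll again; under wait semantics the caller blocks until the pin is released and then receives whatever documents remain.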

Comment by Robert Moore [ 18/Feb/13 ]

See https://github.com/mongodb/mongo/pull/381

Generated at Thu Feb 08 03:17:53 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.