[CDRIVER-2736] Mongoc cursor error when running find_with_opts against a sharded cluster Created: 11/Jul/18 Updated: 28/Oct/23 Resolved: 17/Jul/18 |
|
| Status: | Closed |
| Project: | C Driver |
| Component/s: | None |
| Affects Version/s: | 1.5.0 |
| Fix Version/s: | 1.12.0 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Spencer Mckenney | Assignee: | A. Jesse Jiryu Davis |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
The cursor implementation in the mongo-c-driver makes this assumption: If the server returns a non-zero cursor id, then the cursor isn't finished because there are more documents to iterate through. Right now, sharded clusters return a non-zero cursor id even when the document limit has been reached. So, our cursor throws an error when it gets conflicting information: The user-defined limit has been reached but the cursor id isn't zero. Here is the command that fails (all the way back to mongo-c-driver version 1.5):
Here is the assertion that fails when the command is run:
I attached an example-client.c file that can be run to reproduce the bug. |
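The conflicting state described above can be written down as a small predicate. This is only an illustrative model: `reply_state_t`, `limit_reached`, and `is_conflicting_state` are hypothetical names, not libmongoc's actual structures or the assertion from the driver source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of what a cursor knows after a server reply.
 * These are not libmongoc's real data structures. */
typedef struct {
   int64_t cursor_id;  /* 0 means the server has closed the cursor */
   int64_t n_returned; /* documents delivered to the caller so far */
} reply_state_t;

/* True when the caller has received as many documents as it asked for.
 * A limit of 0 is treated as "no limit". */
static bool
limit_reached (int64_t limit, const reply_state_t *s)
{
   assert (limit >= 0 && s->n_returned >= 0);
   return limit > 0 && s->n_returned >= limit;
}

/* The state from the description: the limit is satisfied, yet the server
 * (for example a sharded cluster) still reports a live cursor id. */
static bool
is_conflicting_state (int64_t limit, const reply_state_t *s)
{
   return limit_reached (limit, s) && s->cursor_id != 0;
}
```

A cursor that treats a non-zero id as proof that more documents remain has no consistent interpretation of the state where `is_conflicting_state` returns true, which is the situation this ticket reports.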
| Comments |
| Comment by Githook User [ 17/Jul/18 ] |
|
Author: A. Jesse Jiryu Davis (ajdavis) <jesse@mongodb.com> Message: |
| Comment by Githook User [ 17/Jul/18 ] |
|
Author: A. Jesse Jiryu Davis (ajdavis) <jesse@mongodb.com> Message: |
| Comment by Derick Rethans [ 16/Jul/18 ] |
|
I would just like it in a release soon. The reason I suggested 1.11.x is that, for the PHP driver, going from libmongoc 1.x to 1.x+1 has often involved API changes that required extra work on our side. We prefer to have to make API changes only when going from 1.x to 1.x+1, which keeps patch upgrades lower-risk. Let's chat about the policy topic in tomorrow's meeting. |
| Comment by A. Jesse Jiryu Davis [ 16/Jul/18 ] |
|
Does it actually matter whether the release in which this is fixed is named 1.11.x? Or is the important thing that the bugfix is released soon? We can put this bugfix in 1.12.0 and release 1.12.0 right away. Is that acceptable? (My policy is to fix old bugs in minor releases, and new bugs in patch releases. So, 1.11.x only fixes bugs introduced in 1.11.0 or a previous 1.11.x patch release. A bug we've had since 1.5.0 can't be fixed in 1.11.1. The purpose of this policy is to guarantee that each patch release is lower-risk than each minor release, by only changing code to fix bugs since the last minor release.) |
| Comment by Derick Rethans [ 16/Jul/18 ] |
|
FWIW, I also just ran into this while running the CRUD Spec Functional PHP Library tests against a sharded cluster, now that we have this set up with MO (for Travis). Our tests abort with:
I would like to argue for this being included in a 1.11.x release. Backtrace:
|
| Comment by A. Jesse Jiryu Davis [ 12/Jul/18 ] |
|
Let's not assert on the relationship between limit, batchSize, and cursor id. (Don't crash no matter what the server does.) Let's keep sending getMore until the cursor id is 0, even if the limit has been reached. |
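A minimal sketch of that proposal, using a simulated server rather than libmongoc itself. Everything here is an assumption for illustration: `fake_server_t`, `fake_get_more`, and `iterate_until_closed` are hypothetical names, every round trip is modelled as a getMore (a real cursor receives its first batch from the find reply), and the actual fix in the driver may be structured differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a server-side cursor. It keeps its id non-zero
 * until a getMore finds nothing left to send, mimicking a server that
 * reports a live cursor even after the limit has been satisfied. */
typedef struct {
   int64_t cursor_id;
   int64_t remaining_docs;
   int64_t batch_size;
} fake_server_t;

/* Simulates one getMore round trip and returns the batch length. */
static int64_t
fake_get_more (fake_server_t *srv)
{
   int64_t n;

   assert (srv->cursor_id != 0); /* caller must not use a closed cursor */
   n = srv->remaining_docs < srv->batch_size ? srv->remaining_docs
                                             : srv->batch_size;
   srv->remaining_docs -= n;
   if (n == 0) {
      srv->cursor_id = 0; /* server closes the cursor on an empty batch */
   }
   return n;
}

/* Keeps issuing getMore until the server reports cursor id 0. The limit
 * (0 means "no limit") only caps what is handed to the caller; it never
 * ends the loop and is never asserted against the cursor id. */
static int64_t
iterate_until_closed (fake_server_t *srv, int64_t limit, int64_t *round_trips)
{
   int64_t delivered = 0;

   *round_trips = 0;
   while (srv->cursor_id != 0) {
      int64_t n = fake_get_more (srv);

      (*round_trips)++;
      if (limit > 0 && delivered + n > limit) {
         n = limit - delivered; /* drop documents beyond the limit */
      }
      delivered += n;
   }
   return delivered;
}
```

With a server holding 5 documents, a batch size of 5, and a limit of 5, the loop delivers 5 documents and then makes one more round trip that returns an empty batch and closes the cursor. The design choice is that only the cursor id decides when iteration ends.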
| Comment by Kevin Albertson [ 11/Jul/18 ] |
|
This showed up when implementing the CRUD spec tests in
|