[JAVA-1925] Tailable cursor blocks on tryNext Created: 14/Aug/15 Updated: 02/Dec/15 Resolved: 08/Oct/15 |
|
| Status: | Closed |
| Project: | Java Driver |
| Component/s: | Query Operations |
| Affects Version/s: | 2.13.2, 3.0.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Devin Smith | Assignee: | Unassigned |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | Bug, driver, query, replicaset | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
I can pretty reliably cause an oplog tailable cursor to block on tryNext by running a bunch of cursors in parallel. Sometimes my test program finishes successfully; other times it blocks on tryNext.

The database is relatively small:
rslocal:PRIMARY> db.oplog.rs.stats()

I'm running a 3.0.4 server standalone replica. I'm able to demonstrate the exact same issue w/ 2.13.2 and 3.0.3 drivers:
https://github.com/devinrsmith/mongocursorexample
https://github.com/devinrsmith/mongocursorexample/tree/v3

Am I using the api/cursors incorrectly? |
| Comments |
| Comment by Arcadius Ahouansou [ 02/Dec/15 ] | ||||||||||
|
Thank you very much drsmith | ||||||||||
| Comment by Devin Smith [ 30/Nov/15 ] | ||||||||||
|
We are using docker mongo:3.0.6 in prod (NO vagrant) without issue so far. | ||||||||||
| Comment by Arcadius Ahouansou [ 30/Nov/15 ] | ||||||||||
|
Hello again drsmith. Please, are you using docker in Prod or just for Dev? Thanks. | ||||||||||
| Comment by Devin Smith [ 30/Nov/15 ] | ||||||||||
|
Unfortunately not, but didn't dig into it too far. Running mongo locally | ||||||||||
| Comment by Arcadius Ahouansou [ 30/Nov/15 ] | ||||||||||
|
Thanks drsmith | ||||||||||
| Comment by Devin Smith [ 30/Nov/15 ] | ||||||||||
|
Yes, I believe it was. | ||||||||||
| Comment by Arcadius Ahouansou [ 30/Nov/15 ] | ||||||||||
|
Hello drsmith Thanks. | ||||||||||
| Comment by Jeffrey Yemin [ 08/Oct/15 ] | ||||||||||
|
Thanks Devin, I'm closing this issue now but please add further comments should you have more information at any time in the future, and I'll reopen it. | ||||||||||
| Comment by Devin Smith [ 17/Aug/15 ] | ||||||||||
|
So, my coworker was able to reproduce this on one of his mongo instances, but not another. For local dev we use Vagrant (on OSX), and it seems the only times we can repro the issue is in Vagrant. I don't have the time to trace it any further than this at the moment... might try to get a minimally reproducible vagrant image sometime and pass it off to both teams for further investigation... for now though, we'll just be wary of mongo in Vagrant. | ||||||||||
| Comment by Devin Smith [ 14/Aug/15 ] | ||||||||||
|
It seems to me that I get mongo into a bad state by doing lots of cursors concurrently... then any manner of using cursors (serial or parallel) seems to fail. Maybe you can reproduce it by upping the number of concurrent cursors? I've updated my code to count instead of storing the results, as well as loop through 100 times. I'm also logging any MongoExceptions I receive, but specifically, that's not the case that I'm worried about. | ||||||||||
| Comment by Devin Smith [ 14/Aug/15 ] | ||||||||||
|
mongod.conf shown in previous comment, default storage engine. | ||||||||||
| Comment by Jeffrey Yemin [ 14/Aug/15 ] | ||||||||||
|
What's in mongod.conf? Is this still with wired tiger? | ||||||||||
| Comment by Devin Smith [ 14/Aug/15 ] | ||||||||||
Brought up by docker-compose:
| ||||||||||
| Comment by Jeffrey Yemin [ 14/Aug/15 ] | ||||||||||
|
With what server configuration? | ||||||||||
| Comment by Devin Smith [ 14/Aug/15 ] | ||||||||||
|
I just modified my test, and I was able to hit the issue using only 1 cursor at a time:
public static void main(String[] args) throws UnknownHostException, InterruptedException {
    final DBCollection oplog = db.getCollection("oplog.rs");
} | ||||||||||
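One way to tell "blocked" apart from "just slow" in a test like this is to run the suspect call on a worker thread and wait on it with a timeout. The following is only a minimal sketch using the JDK's java.util.concurrent classes; BlockingCallProbe and completesWithin are hypothetical names and are not part of the MongoDB Java driver or of the linked test program.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration only; not part of the MongoDB driver.
class BlockingCallProbe {

    /**
     * Runs the call on a daemon worker thread and reports whether it
     * finished within timeoutMillis. Returns false on timeout.
     */
    static boolean completesWithin(Runnable call, long timeoutMillis) {
        // Daemon thread, so a call that never returns cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "blocking-call-probe");
            thread.setDaemon(true);
            return thread;
        });
        Future<?> future = executor.submit(call);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Best-effort interrupt; a thread stuck in blocking I/O may ignore it.
            future.cancel(true);
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The call itself threw; surface the original cause.
            throw new IllegalStateException("call failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Assuming a tailable DBCursor named cursor (hypothetical here), a test could call BlockingCallProbe.completesWithin(() -> cursor.tryNext(), 5000) and log a failure when it returns false, instead of hanging indefinitely.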
| Comment by Devin Smith [ 14/Aug/15 ] | ||||||||||
|
I've got 3.0.4 running with the default storage engine. At first, things were looking really good. I did my test about 10 times, each with 10 cursors or so and things finished smoothly. Then it started blocking. Maybe there is some cursor buildup on the server that is causing issues? | ||||||||||
| Comment by Jeffrey Yemin [ 14/Aug/15 ] | ||||||||||
|
I pegged 8 CPUs for a while, but even with 16 runners the program you supplied eventually completed against 3.0 with the default storage engine. Curious to see your results as well. The only change I made: just to avoid garbage collection overhead, I changed your program to count the number of results rather than storing them in a list. In my environment, there were about 314K oplog entries. | ||||||||||
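The change described above (counting results instead of collecting them in a list, with several runners in parallel) could look roughly like the sketch below. It deliberately avoids the driver: a Supplier that returns the next element or null stands in for a cursor's tryNext, and CountingRunners, countUntilNull, runInParallel and fromIterator are hypothetical names introduced for illustration only.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

// Hypothetical harness; each Supplier stands in for one cursor's tryNext.
class CountingRunners {

    /** Counts results until the source returns null, without retaining them. */
    static <T> long countUntilNull(Supplier<T> tryNext) {
        long count = 0;
        while (tryNext.get() != null) {
            count++;
        }
        return count;
    }

    /** Runs one counting task per source in parallel; counts are returned in source order. */
    static <T> List<Long> runInParallel(List<Supplier<T>> sources) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, sources.size()));
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (Supplier<T> source : sources) {
                tasks.add(() -> countUntilNull(source));
            }
            List<Long> counts = new ArrayList<>();
            // invokeAll blocks until every task has finished.
            for (Future<Long> result : pool.invokeAll(tasks)) {
                counts.add(result.get());
            }
            return counts;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for runners", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("runner failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    /** Adapts an iterator to the "next element or null" shape used above. */
    static <T> Supplier<T> fromIterator(Iterator<T> iterator) {
        return () -> iterator.hasNext() ? iterator.next() : null;
    }
}
```

With real cursors, each source would wrap its own cursor; passing 16 sources would correspond to the 16 runners mentioned above. Counting keeps memory use flat regardless of how many oplog entries are read.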
| Comment by Devin Smith [ 14/Aug/15 ] | ||||||||||
|
I'll try to repro w/ 3.0 no wired tiger. | ||||||||||
| Comment by Jeffrey Yemin [ 14/Aug/15 ] | ||||||||||
|
Hi Devin, Are you able to reproduce this with MongoDB 2.6 or MongoDB 3.0 with the default storage engine? | ||||||||||
| Comment by Devin Smith [ 14/Aug/15 ] | ||||||||||
|
I should also mention that the DB's data is static while I'm running these tests. |