[CDRIVER-474] "Failure to buffer 4 bytes" error on OS X with 300 client threads Created: 02/Dec/14 Updated: 06/Jan/15 Resolved: 06/Jan/15 |
|
| Status: | Closed |
| Project: | C Driver |
| Component/s: | None |
| Affects Version/s: | 1.0.0 |
| Fix Version/s: | 1.1.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Ben Becker | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Mac OS X 10.10.1 |
||
| Description |
|
When running mongo-sim with 300 client threads on OS X, I'm getting the following error from each thread:
Note that each thread reports this almost immediately; it's not timing out after 300 seconds or anything. You can reproduce by checking out 10gen/mongo-sim.git, building, and running the following from the build directory:
This spawns 300 client threads, one mongoc client per thread and executes a query 10,000 times total. Each thread executes a query operation 33 or 34 times, then iterates over every document in a tight loop (using mongoc_cursor_error(), mongoc_cursor_more() and mongoc_cursor_next() as outlined in cursor.hpp's Cursor::itcount() method. Note that the same error occurs when running an insert-only workload (change -w to insert_only). I have not been able to reproduce this issue on Linux. Sorry I haven't been able to narrow this down to a minimal reproducible case yet. I don't have many free cycles, but I'm happy to help anywhere I can. |
| Comments |
| Comment by Ben Becker [ 02/Dec/14 ] | |||||||||||
|
Yay, fixed! Thanks Jason! | |||||||||||
| Comment by Ben Becker [ 02/Dec/14 ] | |||||||||||
|
I was on tag 1.0.2, but it seems I have the old logic. I'll upgrade to the 1.1.0-rc0 tag and let you know!
| |||||||||||
| Comment by Mira Carey [ 02/Dec/14 ] | |||||||||||
|
Actually, what version of libbson are you using? I took a patch for adjusting mach_absolute_time to microseconds a couple of months back and I think I'm getting the units right at this point.
I see the race you're talking about, but the units seem sane. | |||||||||||
| Comment by Mira Carey [ 02/Dec/14 ] | |||||||||||
|
aah, good catch. thanks for the report | |||||||||||
| Comment by Ben Becker [ 02/Dec/14 ] | |||||||||||
|
It looks like bson_get_monotonic_time() is not thread safe due to the use of static variables in the function. Also, mach_absolute_time() needs to be adjusted; see https://developer.apple.com/library/mac/qa/qa1398/_index.html. |