[SERVER-1804] Fixed race conditions in the C++ driver (BackgroundJob) Created: 16/Sep/10 Updated: 12/Jul/16 Resolved: 23/Oct/10 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Internal Client |
| Affects Version/s: | 1.6.2 |
| Fix Version/s: | 1.7.2 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Jason Toffaletti | Assignee: | Alberto Lerner |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Operating System: | ALL |
| Participants: |
| Description |
|
While using the C++ driver, my application would randomly fail to connect under load. I tracked the problem down to a race condition in BackgroundJob and have attached a patch that fixes it. The cause was the inefficient sleepmillis() polling loops, which could let the connection timeout expire even though the connection had succeeded. |
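For illustration only, here is a minimal sketch of the general alternative to a sleepmillis() polling loop: the waiter blocks on a condition variable with a timeout, and the worker signals completion under the same mutex. This is not the attached patch and not the driver's code. The names JobState, markDone, wait, and runAndWait are hypothetical, and it assumes C++11 standard-library primitives rather than whatever the driver used at the time.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a job's completion state (not the driver's BackgroundJob).
class JobState {
public:
    // Called by the worker thread once the job has finished.
    void markDone() {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _done = true;  // written under the same mutex the waiter reads it under
        }
        _finished.notify_all();
    }

    // Blocks until the job is done or timeoutMs elapses.
    // Returns true if the job finished, false if the timeout expired first.
    bool wait(int timeoutMs) {
        std::unique_lock<std::mutex> lock(_mutex);
        // The predicate form re-checks _done after every wakeup, so spurious
        // wakeups and a completion that happened before the call are both handled.
        return _finished.wait_for(lock, std::chrono::milliseconds(timeoutMs),
                                  [this] { return _done; });
    }

private:
    std::mutex _mutex;
    std::condition_variable _finished;
    bool _done = false;
};

// Runs a simulated job of workMs on a worker thread and waits up to timeoutMs for it.
bool runAndWait(int workMs, int timeoutMs) {
    JobState state;
    std::thread worker([&state, workMs] {
        std::this_thread::sleep_for(std::chrono::milliseconds(workMs));
        state.markDone();
    });
    bool finished = state.wait(timeoutMs);
    worker.join();
    return finished;
}
```

With this shape the waiter wakes as soon as the worker signals, so a job that completes within the timeout is reported as complete instead of depending on when the next poll happens to run.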
| Comments |
| Comment by Alberto Lerner [ 23/Oct/10 ] |
|
Author: {'login': 'erh', 'name': 'Eliot Horowitz', 'email': 'eliot@10gen.com'} Message: fix overflow in BackgroundJob::wait |
| Comment by Jason Toffaletti [ 21/Oct/10 ] |
|
Sorry, I haven't had time to test this, been very busy lately. The new code looks correct though. Feel free to close the bug. As an aside, I used helgrind to locate this bug initially, so it might be worth looking into if you aren't already aware of it. |
| Comment by Alberto Lerner [ 21/Oct/10 ] |
|
Jason, any news on this? |
| Comment by Alberto Lerner [ 13/Oct/10 ] |
|
Jason, could you check if the latest change addressed it? |
| Comment by auto [ 13/Oct/10 ] |
|
Author: {'login': 'alerner', 'name': 'Alberto Lerner', 'email': 'alerner@10gen.com'}Message: |
| Comment by auto [ 13/Oct/10 ] |
|
Author: {'login': 'alerner', 'name': 'Alberto Lerner', 'email': 'alerner@10gen.com'}Message: |
| Comment by Jason Toffaletti [ 17/Sep/10 ] |
|
I tried http://downloads.mongodb.org/cxx-driver/mongodb-linux-x86_64-latest.tgz before writing my patch. It still exhibited the race condition and, in addition, segfaulted in the PoolForHost code, where std::stack.empty() would return false yet std::stack.size() would return 0. |
| Comment by Eliot Horowitz (Inactive) [ 17/Sep/10 ] |
|
We may have already fixed this in master.