[SERVER-8722] OpCounters::incInsertInWriteLock not thread-safe. Created: 25/Feb/13 Updated: 05/Apr/17 Resolved: 22/Nov/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Concurrency |
| Affects Version/s: | 2.2.3, 2.4.0-rc1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Andy Schwerin | Assignee: | Daniel Gottlieb (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | triage | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Steps To Reproduce: | Should be possible to demonstrate by creating a test node with several logical databases, and driving a lot of w=1 inserts to it, scattered across the databases. If you keep track of the number of inserts, you should find that it is greater than what the database reports. It might help to do multi-inserts, it might help to do only single inserts; I'm not sure. |
| Sprint: | Storage 2016-12-12 |
| Participants: |
| Description |
|
Once upon a time, the _insert counter in the OpCounters class was only ever incremented when the single global writelock was held. In that world, the implementation that uses a non-atomic operation to increment the counter wouldn't lead to lost counts, though it could hypothetically lead to incorrect reads if the _insert counter weren't 4-byte aligned. Today, the function is often called without a global lock held (only a per database lock), making these adds lossy. It would be wise to fix this, perhaps by consolidating the implementation of AtomicUInt with the AtomicWord<T> or at least the AtomicIntrinsics<T> from mongo/platform/atomic_word.h. This would also allow us to make these counters 64-bit instead of 32-bit, in case we're ever worried about rollover. |
| Comments |
| Comment by Daniel Gottlieb (Inactive) [ 22/Nov/16 ] |
|
I put some effort into reproducing this and found nothing. Talked with schwerin and it was concluded this may have been fixed as early as 2.6. |
| Comment by Dwight Merriman [ 25/Feb/13 ] |
|
see also base/counter.h. that file seems to be new (and thus i know little about it) but relevant potentially. |