[SERVER-39915] Access to _localOplogCollection is not synchronized Created: 01/Mar/19 Updated: 20/Sep/19 Resolved: 20/Sep/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 3.4.19 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Tess Avitabile (Inactive) | Assignee: | Siyuan Zhou |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Operating System: | ALL | ||||
| Sprint: | Repl 2019-07-01 | ||||
| Participants: | |||||
| Linked BF Score: | 15 | ||||
| Description |
|
Access to _localOplogCollection is not synchronized on 3.4. We take a Global X lock when setting the pointer to null, an IX lock on local.oplog.rs when setting the pointer to a non-null value, and no lock when reading the pointer. This means that a reader can observe that the pointer is non-null, have it set to null concurrently, and then call a function on the now-null pointer, leading to an invalid access. This issue is fixed on versions 3.6 and later by |
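The check-then-use hazard described above can be avoided by holding one lock across both the null check and the call. The following is a minimal sketch of that idea only; `FakeCollection` and `OplogPointer` are hypothetical names invented for illustration and are not MongoDB's classes or its actual fix.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the oplog collection object (not MongoDB's Collection).
struct FakeCollection {
    int inserts = 0;
    void insert() { ++inserts; }
};

// Hypothetical holder in which every access to the pointer takes the same mutex.
class OplogPointer {
public:
    // Writers set or clear the pointer while holding the mutex.
    void set(FakeCollection* coll) {
        std::lock_guard<std::mutex> lk(_mutex);
        _coll = coll;
    }
    void clear() { set(nullptr); }

    // Readers hold the mutex across BOTH the null check and the use,
    // so the pointer cannot be cleared between the two steps.
    bool logOp() {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_coll == nullptr) {
            return false;  // nothing to write to
        }
        _coll->insert();
        return true;
    }

private:
    std::mutex _mutex;
    FakeCollection* _coll = nullptr;
};
```

With this shape, a `clear()` that races with `logOp()` either completes before the null check or waits until the insert has finished; it cannot land between the two.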
| Comments |
| Comment by Siyuan Zhou [ 20/Sep/19 ] |
|
In 4.2+, the oplog pointer is always protected by the global lock. acquireOplogCollectionForLogging() and establishOplogCollectionForLogging() either acquire locks or invariant that the lock is already held. The oplog cannot be dropped or renamed while in replset mode. The only operations that destroy the oplog collection object (by calling clearLocalOplogPtr()) are rollback, catalog restart, and shutdown, where we either hold the global X lock or nothing can happen concurrently. As mentioned by tess.avitabile, this is only an issue in 3.4 and it only occurred once in tests, so I'd prefer to close this ticket as "Won't Fix". |
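The "either acquire locks or invariant that the lock is already held" discipline from the comment above could look roughly like the sketch below. Everything here is an assumption made for illustration: `FakeGlobalLock`, `FakeOplog`, `establishOplogPointer`, `clearOplogPointer`, and `logOneOp` are hypothetical names, and a thrown `std::logic_error` stands in for an invariant failure. It is not the server's implementation.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical stand-in for the oplog collection object.
struct FakeOplog {
    int inserts = 0;
};

// Hypothetical global lock that remembers whether the current thread holds it.
class FakeGlobalLock {
public:
    void lock() {
        _mutex.lock();
        heldByThisThread() = true;
    }
    void unlock() {
        heldByThisThread() = false;
        _mutex.unlock();
    }
    bool isHeld() const { return heldByThisThread(); }

private:
    static bool& heldByThisThread() {
        thread_local bool held = false;
        return held;
    }
    std::mutex _mutex;
};

FakeGlobalLock globalLock;
FakeOplog* localOplog = nullptr;

// Stand-in for an invariant: fail loudly if the precondition is violated.
void requireLockHeld() {
    if (!globalLock.isHeld()) {
        throw std::logic_error("global lock not held");
    }
}

// Pattern 1: the caller must already hold the lock; the function only checks.
void establishOplogPointer(FakeOplog* oplog) {
    requireLockHeld();
    localOplog = oplog;
}

void clearOplogPointer() {
    requireLockHeld();
    localOplog = nullptr;
}

// Pattern 2: the function acquires the lock itself before touching the pointer.
bool logOneOp() {
    std::lock_guard<FakeGlobalLock> lk(globalLock);
    if (localOplog == nullptr) {
        return false;
    }
    ++localOplog->inserts;
    return true;
}
```

In this sketch a caller that forgets the lock fails immediately in `establishOplogPointer` or `clearOplogPointer` instead of racing silently, while `logOneOp` is safe to call from any context that does not already hold the lock.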