[SERVER-40058] AutoStatsTracker's lock acquisition to read the profiling level should not conflict with secondary batch application Created: 08/Mar/19 Updated: 29/Oct/23 Resolved: 11/Mar/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Concurrency |
| Affects Version/s: | None |
| Fix Version/s: | 4.1.9 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | David Storch | Assignee: | David Storch |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible | ||||||||||||
| Sprint: | Query 2019-03-25 | ||||||||||||
| Participants: | |||||||||||||
| Linked BF Score: | 0 | ||||||||||||
| Description |
|
The AutoStatsTracker is used by various code paths to record operation counters in Top and to set the correct diagnostic information on the operation's CurOp. One of its jobs is to determine the profiling level associated with the operation and set it on the CurOp. Because the profiling level can be configured per database, and because this setting is stored in memory inside the catalog subsystem's Database object, this involves acquiring a MODE_IS intent lock on the database.

As implemented, this lock acquisition can conflict with parallel oplog batch application on secondary nodes. However, the MODE_IS database lock here is used solely to protect access to DatabaseImpl::_profile, so there is no need to conflict with batch application. Preventing such lock conflicts could increase the throughput of operations that use the AutoStatsTracker.

Many operations that use the AutoStatsTracker, such as secondary reads via the find command, are already configured to avoid conflicting with batch application inside AutoGetCollectionForRead. However, some code paths use the AutoStatsTracker but never use AutoGetCollectionForRead, which causes unnecessary blocking during batch application while they attempt to set the profiling level. This was observed in our performance testing infrastructure for the OP_KILL_CURSORS code path. |
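The idea in the description can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the server's lock manager or the committed fix: `ToyNode`, `ToyOperation`, `ScopedNoBatchConflict`, `tryReadProfileLevel`, and `readProfileLevelWithoutBatchConflict` are all hypothetical names, and "blocking" is modelled as a function returning `false` instead of an actual wait on a lock.

```cpp
#include <cassert>

// All names below are hypothetical stand-ins, not MongoDB's real classes.

// Models a secondary node: whether an oplog batch is currently being applied,
// plus a per-database profiling level (stands in for DatabaseImpl::_profile).
struct ToyNode {
    bool batchInProgress = false;
    int profileLevel = 0;
};

// Models per-operation state: by default, an operation's lock acquisitions
// conflict with batch application.
struct ToyOperation {
    bool conflictsWithBatch = true;
};

// RAII guard that opts an operation out of batch conflicts for one scope and
// restores the previous setting when the scope ends.
class ScopedNoBatchConflict {
public:
    explicit ScopedNoBatchConflict(ToyOperation& op)
        : _op(op), _previous(op.conflictsWithBatch) {
        _op.conflictsWithBatch = false;
    }
    ~ScopedNoBatchConflict() {
        _op.conflictsWithBatch = _previous;
    }
    ScopedNoBatchConflict(const ScopedNoBatchConflict&) = delete;
    ScopedNoBatchConflict& operator=(const ScopedNoBatchConflict&) = delete;

private:
    ToyOperation& _op;
    bool _previous;
};

// Tries to read the profiling level. Returns false (leaving levelOut untouched)
// in the case where a real lock acquisition would have had to wait behind an
// in-progress batch.
bool tryReadProfileLevel(const ToyNode& node, const ToyOperation& op, int& levelOut) {
    if (op.conflictsWithBatch && node.batchInProgress) {
        return false;
    }
    levelOut = node.profileLevel;
    return true;
}

// Reads the profiling level while opted out of batch conflicts, and only for
// the duration of this read.
int readProfileLevelWithoutBatchConflict(const ToyNode& node, ToyOperation& op) {
    ScopedNoBatchConflict guard(op);
    int level = 0;
    bool ok = tryReadProfileLevel(node, op, level);
    assert(ok);  // With the opt-out active, the toy read never "blocks".
    (void)ok;    // Avoids an unused-variable warning when asserts are disabled.
    return level;
}
```

In this model, an operation with the default setting cannot read the profiling level while a batch is in progress, whereas the scoped opt-out lets the same read proceed and leaves the operation's default behaviour unchanged afterwards. Scoping the opt-out to the single read reflects the observation above that the lock protects only the profiling level.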
| Comments |
| Comment by Githook User [ 11/Mar/19 ] |
|
Author: {'name': 'David Storch', 'username': 'dstorch', 'email': 'david.storch@10gen.com'} Message: |
| Comment by David Storch [ 11/Mar/19 ] |
|
william.schultz, while there may be other performance problems due to conflicting with secondary batch application when we don't need to, I don't intend to address those outside AutoStatsTracker within the scope of this ticket. Feel free to file and assign any new tickets to our backlog if you come across related performance problems! In the particular case you point out above, it looks like we're maintaining a UUID -> CollatorInterface cache. This means that we don't have to repeatedly determine the default collation, so it's not obvious to me that there will be a similar throughput issue for change streams on secondaries. |
| Comment by William Schultz (Inactive) [ 11/Mar/19 ] |
|
I believe that there is a similar issue with the lock acquisitions for change stream post-image lookup. Inside MongoInterfaceStandalone::lookupSingleDocument, we call MongoInterfaceStandalone::_getCollectionDefaultCollator, which takes a collection lock via the AutoGetCollection interface. The AutoGetCollection constructor does not let reads opt out of conflicting with secondary batch application. This behavior would presumably impact the performance of secondary change streams that use 'updateLookup'. |