[SERVER-81101] Speed up TransactionOperations::addTransactionOperation Created: 15/Sep/23 Updated: 10/Nov/23 Resolved: 10/Nov/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 7.3.0-rc0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Matthew Russotto | Assignee: | Matthew Russotto |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | auto-reverted, perf-tiger, perf-tiger-handoff | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Assigned Teams: |
Replication
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Sprint: | Repl 2023-11-13 | ||||
| Participants: | |||||
| Linked BF Score: | 0 | ||||
| Description |
|
In TransactionOperations, we ensure we're not duplicating statement IDs by 1) Copying the unordered set containing the current set of IDs 2) Adding the IDs to that set and throwing an error if any are already there 3) Moving the new set to the old set. This is expensive; on my workstation, about 800ns for a single insert (starting from an empty ID set), 450us for 512 inserts, and 300ms for 262,144 inserts. If we require that statement IDs increase monotonically, we can replace this with a very simple check and basically save it all. But we can also check and insert the statement IDs to one set and just remove the ones we added in the exception case. Also if we don't remove it we should use an absl::btree_set instead of an unordered_set. |
| Comments |
| Comment by Githook User [ 09/Nov/23 ] | ||||||||||||||||
|
Author: {'name': 'Matthew T. Russotto', 'email': 'matthew.russotto@mongodb.com', 'username': 'mtrussotto'}Message: | ||||||||||||||||
| Comment by Matthew Russotto [ 09/Nov/23 ] | ||||||||||||||||
|
This was reverted because the std::bidirectional_iterator concept isn't supported in the XCode version we're using. Since that's just being used for a static_assert, I'll just remove the static_assert. | ||||||||||||||||
| Comment by Githook User [ 09/Nov/23 ] | ||||||||||||||||
|
Author: {'name': 'auto-revert-processor', 'email': 'dev-prod-dag@mongodb.com', 'username': ''}Message: Revert " This reverts commit 87f308646c7d1c6bf96b5086e96e6e39ea4665be. | ||||||||||||||||
| Comment by Githook User [ 08/Nov/23 ] | ||||||||||||||||
|
Author: {'name': 'Matthew Russotto', 'email': 'matthew.russotto@mongodb.com', 'username': 'mtrussotto'}Message: | ||||||||||||||||
| Comment by Matthew Russotto [ 02/Nov/23 ] | ||||||||||||||||
|
Using a structure designed to efficiently handle ranges.
So the question is whether the additional code is worth a couple of microseconds per 100 operations. I think yes. One, small operations are almost all overhead like this, so those couple of microseconds could be significant. Two, I think keeping track of all the statement IDs used or executed by an operation in order might be something we need for bulk writing vectored inserts, and an augmented version of this structure could help there. | ||||||||||||||||
| Comment by Matthew Russotto [ 02/Nov/23 ] | ||||||||||||||||
|
Using absl_flat_hash_set
| ||||||||||||||||
| Comment by Matthew Russotto [ 31/Oct/23 ] | ||||||||||||||||
|
Using an absl_set. Livable, though it's still kinda annoying that just checking statement IDs is so costly.
| ||||||||||||||||
| Comment by Matthew Russotto [ 31/Oct/23 ] | ||||||||||||||||
|
Not checking statement IDs at all: this is the limit for this particular optimization.
| ||||||||||||||||
| Comment by Matthew Russotto [ 31/Oct/23 ] | ||||||||||||||||
|
Baseline on master on an ARM workstation
|