[SERVER-56509] Wrap unique index insertion _keyExists call in a WT cursor reconfigure. Created: 30/Apr/21  Updated: 29/Oct/23  Resolved: 07/Jun/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.4.7, 5.0.0-rc2, 5.1.0-rc0

Type: Improvement Priority: Major - P3
Reporter: Luke Pearson Assignee: Louis Williams
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File Screen Shot 2021-05-24 at 9.44.50 PM.png     PNG File image-2021-05-26-13-41-06-320.png     File repro.js    
Issue Links:
Backports
Depends
depends on WT-7264 Creating a new configuration for sear... Closed
is depended on by SERVER-57095 Increase number of zones in reshardin... Closed
Problem/Incident
causes SERVER-58936 Unique index constraints may not be e... Closed
Related
related to WT-7653 Inconsistent update performance on un... Backlog
related to SERVER-57221 Inconsistent delete performance on co... Closed
related to SERVER-58943 Add more test coverage for unique ind... Closed
is related to SERVER-56274 TTL deletes are much slower on descen... Closed
is related to SERVER-61185 Use prefix_search for unique index lo... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v5.0, v4.4
Sprint: Execution Team 2021-06-14
Participants:
Case:
Linked BF Score: 180

 Description   

Follow-on work from WT-7264, which adds a new cursor flag. The flag allows search_near to exit early instead of walking numerous not-visible entries.

The patch I was using when testing the change is:

diff --git a/src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp b/src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp
index 87bda0ace9..7eef9c8528 100644
--- a/src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp
+++ b/src/mongo/db/storage/wiredtiger/wiredtiger_index.cpp
@@ -1520,9 +1520,11 @@ Status WiredTigerIndexUnique::_insert(OperationContext* opCtx,
         setKey(c, prefixKeyItem.Get());
         ret = WT_OP_CHECK(wiredTigerCursorRemove(opCtx, c));
         invariantWTOK(ret);
-
+        c->reconfigure(c, "prefix_key=true");
         // Second phase looks up for existence of key to avoid insertion of duplicate key
-        if (_keyExists(opCtx, c, keyString.getBuffer(), sizeWithoutRecordId)) {
+        auto key_exists = _keyExists(opCtx, c, keyString.getBuffer(), sizeWithoutRecordId);
+        c->reconfigure(c, "prefix_key=false");
+        if (key_exists) {
             auto key = KeyString::toBson(
                 keyString.getBuffer(), sizeWithoutRecordId, _ordering, keyString.getTypeBits());
             auto entry = _desc->getEntry(); 

I think that should be all the required changes.



 Comments   
Comment by Githook User [ 28/Jul/21 ]

Author:

{'name': 'Henrik Edin', 'email': 'henrik.edin@mongodb.com', 'username': 'henrikedin'}

Message: Revert "SERVER-56509 Wrap unique index insertion _keyExists call in a WT cursor reconfigure"

This reverts commit c5ac2eb1ea145693e1c6b974e88a2cfc18780134.
Branch: master
https://github.com/mongodb/mongo/commit/ecba23449a26f3d266ffacda7eb98d8386267606

Comment by Githook User [ 28/Jul/21 ]

Author:

{'name': 'Henrik Edin', 'email': 'henrik.edin@mongodb.com', 'username': 'henrikedin'}

Message: Revert "SERVER-56509 Wrap unique index insertion _keyExists call in a WT cursor reconfigure"

This reverts commit ae2da27652e552f101559466d165b82a3c122d71.
Branch: v4.4
https://github.com/mongodb/mongo/commit/83b8bb8b6b325d8d8d3dfd2ad9f744bdad7d6ca0

Comment by Githook User [ 28/Jul/21 ]

Author:

{'name': 'Henrik Edin', 'email': 'henrik.edin@mongodb.com', 'username': 'henrikedin'}

Message: Revert "SERVER-56509 Wrap unique index insertion _keyExists call in a WT cursor reconfigure"

This reverts commit 297e2977ef3e394e02d61aedc954c9aaadc37e73.
Branch: v5.0
https://github.com/mongodb/mongo/commit/6d9ec525e78465dcecadcff99cce953d380fedc8

Comment by Githook User [ 23/Jun/21 ]

Author:

{'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}

Message: SERVER-56509 Wrap unique index insertion _keyExists call in a WT cursor reconfigure

This improves, but does not entirely fix a performance regression with
inserts and updates on unique indexes

(cherry picked from commit c5ac2eb1ea145693e1c6b974e88a2cfc18780134)
(cherry picked from commit 297e2977ef3e394e02d61aedc954c9aaadc37e73)
Branch: v4.4
https://github.com/mongodb/mongo/commit/ae2da27652e552f101559466d165b82a3c122d71

Comment by Githook User [ 15/Jun/21 ]

Author:

{'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}

Message: SERVER-56509 Wrap unique index insertion _keyExists call in a WT cursor reconfigure

This improves, but does not entirely fix a performance regression with
inserts and updates on unique indexes

(cherry picked from commit c5ac2eb1ea145693e1c6b974e88a2cfc18780134)
Branch: v5.0
https://github.com/mongodb/mongo/commit/297e2977ef3e394e02d61aedc954c9aaadc37e73

Comment by Githook User [ 07/Jun/21 ]

Author:

{'name': 'Louis Williams', 'email': 'louis.williams@mongodb.com', 'username': 'louiswilliams'}

Message: SERVER-56509 Wrap unique index insertion _keyExists call in a WT cursor reconfigure

This improves, but does not entirely fix a performance regression with
inserts and updates on unique indexes
Branch: master
https://github.com/mongodb/mongo/commit/c5ac2eb1ea145693e1c6b974e88a2cfc18780134

Comment by Louis Williams [ 07/Jun/21 ]

I split this ticket off into SERVER-57500 which tracks the remaining performance regressions.

Comment by Luke Pearson [ 31/May/21 ]

I have discussed this with the team and we are planning to scope out a project to handle further search_near optimisations, including the second regression identified by Louis.

Comment by Luke Pearson [ 26/May/21 ]

I did notice that we're doing a lot of next and prev calls in the insert list path, which wasn't optimised for prefix searching. I added a check there, which resolved a lot of the regression but also made the insertion phase worse.

> load('server-repro.js')
starting bulk load on unique
inserted 10000
Wed May 26 2021 15:34:17 GMT+1000 (AEST)...bulk load on unique: 2180ms
starting update on unique
Wed May 26 2021 15:34:19 GMT+1000 (AEST)...update on unique: 378ms
true
> 

It is expected that comparing keys with prefixes would be slow, which is why I opted not to include it in the insert list scenario. Given that the update case is now quick, I'm guessing that it too is taking the insert list path. But in the normal case, why is its insert list checking taking 8 seconds, reduced to ~300ms with my fix?

Comment by Luke Pearson [ 26/May/21 ]

That is very interesting. I have run the reproducer and see the same regression. I've spent some time comparing the unique index scenario with the non-unique scenario and also took a look at the FTDC data from both.

From the FTDC it's clear that the prefix search near code isn't coming into effect until we begin the update portion of the workload, which is after the ~10 second hang:

This is curious, as I'm fairly certain the insert path goes through the same function. I've also wrapped the isDup call to _keyExists in a reconfigure for prefix search and have not seen any positive impact from that.

I'm not clear on where these next and prev calls are originating from so I'll continue to try and narrow that down.

 

Comment by Louis Williams [ 25/May/21 ]

I wrote a reproducer (attached as "repro.js") for the multi-update problem, and the "prefix_key" change does not appear to have completely resolved the regression related to updates on unique indexes. Locally, a multi-update on a non-unique index field for 10K docs took about 250ms. The same update on a unique field took about 15s without the "prefix_key" change and about 8s with the "prefix_key" change.

I also discovered that the regression only exists if a unique field is updated to a value that sorts before the original key. For example, updating an index field from "ZZZ" to "AAA" triggers the regression, but updating the same field from "AAA" to "ZZZ" does not. I assume this is because of the way MongoDB constructs KeyStrings for seeking when calling _keyExists.

Perf screenshot with the "prefix_key" change applied:

I have not been able to reproduce a problem with multi-deletes.

Comment by Luke Pearson [ 25/May/21 ]

Sorry for the delay in replying. I'll bring the conversation from WT-7264 over to here so it's all in one place.

luke.pearson is the original regression only a problem when calling search_near() after inserting+removing within the same transaction? Or would this regression also apply to readers that call search_near() in subsequent transactions? My intuition says that this would be a problem for other transactions, since these tombstones must be traversed.

Given that we can't guarantee it'll land on the prefix in other transactions, we can't guarantee the current implementation of prefix search near will work correctly. We can look at extending it to handle other scenarios. I'll raise this with brian.lane

luke.pearson What is the plan for that ticket? Do I understand correctly that the WT-7264 fix we were waiting for TPC-C performance issues is not working until that ticket would be done?

SERVER-56509 needs to be completed before the prefix search near fix will take effect; however, I cannot confirm that it will resolve the TPC-C regression, as we discussed separately on the BF.

Comment by Louis Williams [ 20/May/21 ]

luke.pearson, related to my recent comment in WT-7264, does it make sense to also apply this optimization when seeking for keys, not just for insertion?

Comment by Alex Podelko (Inactive) [ 19/May/21 ]

luke.pearson What is the plan for that ticket? Do I understand correctly that the WT-7264 fix we were waiting for TPC-C performance issues is not working until that ticket would be done?

Comment by Luke Pearson [ 04/May/21 ]

The related ticket has been merged to mongodb-master.

Generated at Thu Feb 08 05:39:27 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.