[SERVER-34994] The test secondary_reads_with_catalog_changes fails on timestamp safe unique index. Created: 15/May/18 Updated: 29/Oct/23 Resolved: 27/Jul/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | None |
| Fix Version/s: | 4.1.2 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Neha Khatri | Assignee: | Neha Khatri |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | nonnyc, storage-engines | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: | s34994.log |
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Storage Non-NYC 2018-07-16, Storage Engines 2018-07-30 |
| Participants: |
| Description |
|
The test fails with a duplicate key error in a timestamp-safe unique index on a primary.
Investigate the cause of the failure. |
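To make the failure mode concrete, here is a minimal shell sketch of the scenario under investigation. This is an illustrative outline only, not the actual FSM workload: the collection name and loop bounds are invented, and because the bug is a timing race, a single run may not reproduce it.

```js
// Hypothetical repro outline (assumption: run in the mongo shell against a
// mongod). Collection name and loop bounds are made up for illustration.
const testDB = db.getSiblingDB("test");
const coll = testDB.secondary_reads_repro;
coll.drop();

// A dedicated writer, mirroring the single tid:0 insert thread in the workload.
const joinWriter = startParallelShell(function() {
    const c = db.getSiblingDB("test").secondary_reads_repro;
    for (let i = 0; i < 10000; i++) {
        assert.writeOK(c.insert({x: i}));  // distinct keys; no real duplicates
    }
});

// Meanwhile, repeatedly drop and rebuild a background unique index on x,
// mirroring the workload threads that race on catalog changes.
for (let i = 0; i < 50; i++) {
    coll.createIndex({x: 1}, {unique: true, background: true});
    coll.dropIndex({x: 1});
}

joinWriter();
```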
| Comments |
| Comment by Githook User [ 27/Jul/18 ] |
|
Author: nehakhatri5 <neha.khatri@mongodb.com>
Message: Ignore the duplicate key error when inserting pre-existing index key in
| Comment by Neha Khatri [ 25/Jul/18 ] |
|
Yes milkie, this situation occurs with old-format unique indexes too. The old unique-index record-insert path has logic that prevents the duplicate key error here. We can add a similar check to the new index record-insert path: if the new index entry being inserted has an identical key + RecordId to an existing index entry, ignore the duplicate key error returned by the WiredTiger insert.
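As an illustration of the proposed check, here is a toy JavaScript model. This is a sketch only: the real change belongs in the server's C++ index-insert path, and the class below is invented for this illustration.

```js
// Toy model of a unique index: key -> recordId. Not server code; it only
// illustrates the proposed "identical key + RecordId" dedup rule.
class ToyUniqueIndex {
    constructor() {
        this.entries = new Map();
    }

    // Proposed semantics: suppress the duplicate key error when the incoming
    // entry is identical (same key and RecordId), which is what happens when
    // the background index builder and the writer both insert the same entry.
    insert(key, recordId) {
        const existing = this.entries.get(key);
        if (existing !== undefined) {
            if (existing === recordId) {
                return;  // same key + RecordId: treat the insert as a no-op
            }
            throw new Error("DuplicateKey: " + key);  // genuine violation
        }
        this.entries.set(key, recordId);
    }
}

const idx = new ToyUniqueIndex();
idx.insert("5384.0", 42);
idx.insert("5384.0", 42);   // ignored: identical entry re-inserted
// idx.insert("5384.0", 43); // would throw: a real duplicate key
```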
| Comment by Eric Milkie [ 24/Jul/18 ] |
|
I would imagine this situation surfaces with old-format unique indexes as well. To make background index builds work, the indexing code must ignore duplicate key errors on index-record insert and ignore missing key errors on index-record removal (unindex). Today's code already ignores all missing key errors on unindex, although the logic is complicated (see wiredtiger_index.cpp:1599). The code also needs to ignore duplicate key errors during a background index build; that logic is also fairly complicated, and there could be problems with the code in that area.
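In the same toy-model style, a sketch of the unindex tolerance described above (illustrative only; the actual logic lives in wiredtiger_index.cpp, and this helper is invented for the sketch):

```js
// Extending the toy model: unindex must tolerate a missing key, because a
// background build may not have inserted the entry yet when a delete arrives.
function unindex(entries, key, recordId) {
    const existing = entries.get(key);
    if (existing === undefined || existing !== recordId) {
        return false;  // missing key: ignored rather than treated as an error
    }
    entries.delete(key);
    return true;
}
```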
| Comment by Neha Khatri [ 24/Jul/18 ] |
|
Thanks max.hirschhorn for the inputs. With additional logging, I see that the background index build [conn125] and the insert thread tid:0 = [conn110] are trying to insert the same index entry in parallel.
I am trying to find out why only new-format unique indexes are seeing this.
| Comment by Max Hirschhorn [ 24/Jul/18 ] |
neha.khatri, milkie, I think there's a misunderstanding about what the secondary_reads_with_catalog_changes.js FSM workload is doing. While there is only one thread (with tid=0) inserting documents into the collection, any one of the other threads can drop the {x: 1} index if it exists, or create it if it was dropped by some thread. Given that $config.threadCount=50, there are possibly 49 threads doing this at different times while the workload is running. I don't see any duplicate key errors if I change the workload to always build the {x: 1} index in the foreground, so I believe the uniqueness constraint is somehow being violated when the index is built in the background. I hope that helps point you in the right direction as you continue your investigation.
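For reference, a condensed sketch of the workload shape described above. This is not the actual secondary_reads_with_catalog_changes.js source; the state names, transition weights, and iteration counts are invented for illustration.

```js
// Condensed FSM workload shape: one writer thread, everyone else racing on
// index create/drop. Illustrative only.
var $config = (function() {
    var states = {
        insertDocuments: function(db, collName) {
            if (this.tid !== 0) {
                return;  // only the dedicated tid=0 thread writes documents
            }
            assert.writeOK(db[collName].insert({x: this.nextX++}));
        },
        createIndex: function(db, collName) {
            // Any of the other threads may (re)build {x: 1} in the background.
            db[collName].createIndex({x: 1}, {background: true});
        },
        dropIndex: function(db, collName) {
            db[collName].dropIndex({x: 1});
        },
    };

    var transitions = {
        insertDocuments: {insertDocuments: 0.5, createIndex: 0.25, dropIndex: 0.25},
        createIndex: {insertDocuments: 0.5, dropIndex: 0.5},
        dropIndex: {insertDocuments: 0.5, createIndex: 0.5},
    };

    return {
        threadCount: 50,  // tid 0 plus up to 49 threads doing catalog changes
        iterations: 100,
        startState: "insertDocuments",
        states: states,
        transitions: transitions,
        data: {nextX: 0},
    };
})();
```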
| Comment by Neha Khatri [ 24/Jul/18 ] |
|
In the attached test log, I see that tid:0 = conn126 is the thread inserting index entries. But there are other threads ([conn94], [conn136], [conn77], [conn76], [conn103], [conn116], [conn61]) that are also attempting to insert index entries. The inserts from these non-tid:0 threads are resulting in duplicate key errors; examples are in the attached s34994.log.
| Comment by Neha Khatri [ 23/Jul/18 ] |
|
milkie, the above log snippet was from a local run. The same test failure was also seen in Evergreen here. I am also attaching the log from the local run, s34994.log.
| Comment by Max Hirschhorn [ 23/Jul/18 ] |
milkie, the thread ids are assigned by ThreadManager#spawnAll(). We increment the tid variable once for every call to makeThread().
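Schematically, the assignment works like this (a simplified sketch, not the actual FSM framework code; both function bodies below are invented for illustration):

```js
// Schematic ThreadManager: tid increments once per makeThread() call, so
// thread ids are unique and exactly one thread ever has tid === 0.
function spawnAll(threadCount, workloadFn) {
    let tid = 0;
    const threads = [];
    for (let i = 0; i < threadCount; i++) {
        threads.push(makeThread(workloadFn, tid++));  // unique tid per thread
    }
    return threads;
}

function makeThread(workloadFn, tid) {
    // In the real framework this spawns a thread; here it is just a record.
    return {tid: tid, run: () => workloadFn(tid)};
}
```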
| Comment by Eric Milkie [ 23/Jul/18 ] |
|
The FSM test appears to check whether the "tid" field is 0 in order to determine if a particular thread should do inserts. Could it be that multiple threads have a tid of 0? I can't find where the tid field gets set.
| Comment by Neha Khatri [ 23/Jul/18 ] |
|
The duplicate key error is seen on the primary when inserting entries into the unique index. The test says the workload has a dedicated thread to write documents. However, for index writes, more than one thread can be seen attempting to write the same entry. For example, for the index key {5384.0}: thread conn126 is inserting all the index entries when, in between, thread conn136 inserts the already-inserted entry (see the attached log).
Two threads attempting to insert identical index entries into a unique index is causing the failure.