[SERVER-47344] agg_merge_upsert_supplied_cluster.js failure due to unsupported downgrade version Created: 06/Apr/20  Updated: 29/Oct/23  Resolved: 22/Apr/20

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.4.0-rc3

Type: Bug Priority: Major - P3
Reporter: Tammy Bailey (Inactive) Assignee: Bernard Gorman
Resolution: Fixed Votes: 0
Labels: qexec-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on WT-5966 4.4 downgrade can result in 4.2 core ... Closed
Related
related to SERVER-47425 When 4.2 discovers log version 4 reco... Closed
is related to WT-5926 __verify_txn_addr_cmp failure in mult... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Query 2020-04-20, Query 2020-05-04
Participants:
Linked BF Score: 50

 Description   

The JS test agg_merge_upsert_supplied_cluster.js is failing in multiversion testing, at least in part because it uses an unsupported downgrade version. The test performs upgrade/downgrade scenarios between 4.4 and 4.2.1, which is not a supported path. I believe the supported downgrade version is 4.2.6.

 



 Comments   
Comment by Githook User [ 22/Apr/20 ]

Author: Bernard Gorman (username: gormanb, email: bernard.gorman@gmail.com)

Message: SERVER-47344 Change agg_merge_upsert_supplied_cluster.js to avoid downgrade to 4.2.1 SERVER-47581 Set 'useNewUpsert' on $mergeCursors aggregations
Branch: v4.4
https://github.com/mongodb/mongo/commit/dda2fb45cbf624c9270f8fad7f3c5c5a2f0834eb

Comment by Kelsey Schubert [ 16/Apr/20 ]

bernard.gorman, do you have an idea of when this will be ready to push into 4.4? I'd like to get the required builders green on 4.4.

Comment by Alexander Gorrod [ 13/Apr/20 ]

daniel.gottlieb sorry for the slow reply. Writing out version 4 log files after seeing one in 4.2 is clever. It's a behavior change we wouldn't normally introduce in a dot release, but I don't see any downsides. It seems we have test coverage for it, and the failure won't be subtle if we get it wrong.

Comment by Luke Chen [ 09/Apr/20 ]

daniel.gottlieb Just to let you know, WT-5966 has been vendored into the mongo v4.2 branch.

Comment by Daniel Gottlieb (Inactive) [ 08/Apr/20 ]

No - that is expected to fail, and we have no way of making it fail elegantly.

I do think there's a cleaner way to fail, though because 4.2.1 through 4.2.5 are already released, we can't put a better log message into them:

So long as the require_min/require_max compatibility works in 4.2.6 for seeing a log version 4 record via a downgrade (compatibility: 3.3), 4.2.6 could choose compatibility=(release=3.3) instead of compatibility=(release=3.2). This would cause a 4.2.6 binary that's working on 4.4 data files to write out only log version 4, preventing an earlier 4.2 release from getting past startup.

Comment by Alexander Gorrod [ 08/Apr/20 ]

is 4.4 -> 4.2.6 -> 4.2.1 expected to work?

No - that is expected to fail, and we have no way of making it fail elegantly. Once a user has opened their database with 4.4, it is no longer safe for them to open the database with a version of MongoDB earlier than 4.2.6 - even stepping through 4.2.6. cc kelsey.schubert and brian.lane I hope that matches your expectations?
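
For illustration only, the rule stated above can be sketched as a small version check. This is a minimal sketch, not code from the test or from any MongoDB library: the helper names (parseVersion, compareVersions, isSafeToOpen) and the constant are hypothetical, the 4.2.6 floor is taken from this ticket's discussion, and only plain numeric versions such as "4.2.6" are handled (no "-rc" suffixes).

```javascript
// Hypothetical helpers for illustration; not part of the MongoDB jstests libraries.

// Splits a plain numeric version string such as "4.2.6" into [4, 2, 6].
function parseVersion(version) {
    return version.split(".").map(Number);
}

// Comparator-style result: negative if a < b, zero if equal, positive if a > b.
function compareVersions(a, b) {
    const pa = parseVersion(a);
    const pb = parseVersion(b);
    for (let i = 0; i < Math.max(pa.length, pb.length); ++i) {
        const diff = (pa[i] || 0) - (pb[i] || 0);
        if (diff !== 0) {
            return diff;
        }
    }
    return 0;
}

// Assumption taken from this ticket: 4.2.6 is the earliest binary that may
// open data files once 4.4 has opened them.
const MIN_SAFE_VERSION_AFTER_44 = "4.2.6";

// Models only the rule discussed in this ticket. When the data files were never
// opened by 4.4, the sketch places no restriction of its own on the binary.
function isSafeToOpen(binaryVersion, dataOpenedBy44) {
    if (!dataOpenedBy44) {
        return true;
    }
    return compareVersions(binaryVersion, MIN_SAFE_VERSION_AFTER_44) >= 0;
}
```

Under these assumptions, isSafeToOpen("4.2.1", true) is false even if the cluster stepped through 4.2.6 first, which matches the 4.4 -> 4.2.6 -> 4.2.1 failure reported in this ticket.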

Comment by Daniel Gottlieb (Inactive) [ 08/Apr/20 ]

I tried running this patched version with keith.bostic's fix for 4.2.6.

However, the downgrade process of 4.4 -> 4.2.6 -> 4.2.1 (to reset the test) appears to fail:

[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.279-0400 c20027| 2020-04-08T15:06:56.278-0400 F  -        [initandlisten] Invalid access at address: 0
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027| 2020-04-08T15:06:56.305-0400 F  -        [initandlisten] Got signal: 11 (Segmentation fault).
<snip>
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027| ----- BEGIN BACKTRACE -----
<snip>
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027|  mongod-4.2.1(_ZN5mongo15printStackTraceERSo+0x41) [0x55f5621bbff1]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027|  mongod-4.2.1(+0x279E7EE) [0x55f5621bb7ee]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027|  mongod-4.2.1(+0x279E9CC) [0x55f5621bb9cc]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027|  libpthread.so.0(+0x11390) [0x7fda0d554390]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027|  mongod-4.2.1(+0xF2CCDE) [0x55f560949cde]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027|  mongod-4.2.1(__wt_block_buffer_to_addr+0x18) [0x55f56094a4c8]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027|  mongod-4.2.1(__wt_bm_read+0x45) [0x55f560934045]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027|  mongod-4.2.1(__wt_bt_read+0x375) [0x55f56087b555]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027|  mongod-4.2.1(+0xE5FC7F) [0x55f56087cc7f]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027|  mongod-4.2.1(__wt_row_leaf_key_work+0x1EDD) [0x55f5608bab7d]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027|  mongod-4.2.1(__wt_row_search+0x129F) [0x55f5608c03df]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027|  mongod-4.2.1(__wt_btcur_insert+0xFC2) [0x55f560870f42]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027|  mongod-4.2.1(+0xEB9C7C) [0x55f5608d6c7c]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.306-0400 c20027|  mongod-4.2.1(+0xE41D63) [0x55f56085ed63]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.307-0400 c20027|  mongod-4.2.1(+0xE42646) [0x55f56085f646]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.307-0400 c20027|  mongod-4.2.1(__wt_log_scan+0xCE0) [0x55f5608fa010]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.307-0400 c20027|  mongod-4.2.1(__wt_txn_recover+0x3A7) [0x55f56085fdf7]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.307-0400 c20027|  mongod-4.2.1(__wt_connection_workers+0x37) [0x55f5607f4307]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.307-0400 c20027|  mongod-4.2.1(wiredtiger_open+0x22D2) [0x55f5607f16b2]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.307-0400 c20027|  mongod-4.2.1(_ZN5mongo18WiredTigerKVEngine15_openWiredTigerERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES8_+0xB7) [0x55f5607b64c7]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.307-0400 c20027|  mongod-4.2.1(_ZN5mongo18WiredTigerKVEngineC2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES8_PNS_11ClockSourceES8_mmbbbb+0x7F1) [0x55f5607b8051]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.307-0400 c20027|  mongod-4.2.1(+0xD6EEB3) [0x55f56078beb3]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.307-0400 c20027|  mongod-4.2.1(_ZN5mongo23initializeStorageEngineEPNS_14ServiceContextENS_22StorageEngineInitFlagsE+0x52F) [0x55f560f7c18f]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.307-0400 c20027|  mongod-4.2.1(+0xD4985A) [0x55f56076685a]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.307-0400 c20027|  mongod-4.2.1(+0xD4CEBD) [0x55f560769ebd]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.307-0400 c20027|  mongod-4.2.1(+0xCD2519) [0x55f5606ef519]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.307-0400 c20027|  libc.so.6(__libc_start_main+0xF0) [0x7fda0d199830]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.307-0400 c20027|  mongod-4.2.1(_start+0x29) [0x55f560765479]
[js_test:agg_merge_upsert_supplied_cluster] 2020-04-08T15:06:56.307-0400 c20027| -----  END BACKTRACE  -----

alexander.gorrod is 4.4 -> 4.2.6 -> 4.2.1 expected to work?

Comment by Daniel Gottlieb (Inactive) [ 08/Apr/20 ]

I talked with bernard.gorman about this test. We believe applying the following patch works:

diff --git a/jstests/multiVersion/agg_merge_upsert_supplied_cluster.js b/jstests/multiVersion/agg_merge_upsert_supplied_cluster.js
index c44bb1688c..f48542b1b5 100644
--- a/jstests/multiVersion/agg_merge_upsert_supplied_cluster.js
+++ b/jstests/multiVersion/agg_merge_upsert_supplied_cluster.js
@@ -218,10 +218,11 @@ for (let testCaseNum = 0; testCaseNum < testCases.length; ++testCaseNum) {
     // Finally, downgrade the cluster to pre-backport 4.2 in preparation for the next test case. No
     // need to do this after the final test, as it will simply extend the runtime for no reason.
     if (testCaseNum < testCases.length - 1) {
+        refreshCluster("4.2", {upgradeMongos: true, upgradeShards: true, upgradeConfigs: true});
         refreshCluster(preBackport42Version,
                        {upgradeMongos: true, upgradeShards: true, upgradeConfigs: true});
     }
 }
 
 st.stop();
-})();
\ No newline at end of file
+})();

However, I'm finding data corruption (collection/index out of sync) on downgrade. I expect this is related to WT-5966. Truly knowing whether this patch is sufficient will require ironing out 4.4 -> 4.2 downgrade bugs.
