[SERVER-28093] MongoDB 3.4 fails to start with Illegal instruction on s390x z12 Created: 24/Feb/17  Updated: 31/May/17  Resolved: 28/Feb/17

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.4.0, 3.4.1, 3.4.2
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Michael Höller
Assignee: Andrew Morrow (Inactive)
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-27963 Disable CRC32 hardware support on s39... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

Download the MongoDB binaries and untar them:
wget https://downloads.mongodb.com/linux/mongodb-linux-s390x-enterprise-rhel72-3.4.1.tgz
tar -xvf mongodb-linux-s390x-enterprise-rhel72-3.4.1.tgz
cd to the bin folder
Create a db path: mkdir -p /tmp/data/db
Run mongod: ./mongod --dbpath /tmp/data/db/
The resulting dump and backtrace are attached:

[mhoeller@rhel71-mongo2 bin]# ./mongod --dbpath /tmp/data/db/
2017-02-24T05:27:35.184-0500 I CONTROL  [initandlisten] MongoDB starting : pid=5365 port=27017 dbpath=/tmp/data/db/ 64-bit host=rhel71-mongo2
2017-02-24T05:27:35.184-0500 I CONTROL  [initandlisten] db version v3.4.1
2017-02-24T05:27:35.184-0500 I CONTROL  [initandlisten] git version: 5e103c4f5583e2566a45d740225dc250baacfbd7
2017-02-24T05:27:35.184-0500 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
2017-02-24T05:27:35.184-0500 I CONTROL  [initandlisten] allocator: tcmalloc
2017-02-24T05:27:35.184-0500 I CONTROL  [initandlisten] modules: enterprise 
2017-02-24T05:27:35.184-0500 I CONTROL  [initandlisten] build environment:
2017-02-24T05:27:35.184-0500 I CONTROL  [initandlisten]     distmod: rhel72
2017-02-24T05:27:35.184-0500 I CONTROL  [initandlisten]     distarch: s390x
2017-02-24T05:27:35.184-0500 I CONTROL  [initandlisten]     target_arch: s390x
2017-02-24T05:27:35.184-0500 I CONTROL  [initandlisten] options: { storage: { dbPath: "/tmp/data/db/" } }
2017-02-24T05:27:35.191-0500 I -        [initandlisten] Detected data files in /tmp/data/db/ created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2017-02-24T05:27:35.191-0500 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=7532M,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-02-24T05:27:35.201-0500 F -        [initandlisten] Invalid operation at address: 0x2aa022baf5e
2017-02-24T05:27:35.204-0500 F -        [initandlisten] Got signal: 4 (Illegal instruction).
 
 0x2aa0199b00c 0x2aa0199a028 0x2aa0199a6a2 0x3ffffa71118 0x2aa022baf64
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"2AA00239000","o":"176200C","s":"_ZN5mongo15printStackTraceERSo"},{"b":"2AA00239000","o":"1761028"},{"b":"2AA00239000","o":"17616A2"},{"b":"0","o":"3FFFFA71118"},{"b":"2AA00239000","o":"2081F64"}],"processInfo":{ "mongodbVersion" : "3.4.1", "gitVersion" : "5e103c4f5583e2566a45d740225dc250baacfbd7", "compiledModules" : [ "enterprise" ], "uname" : { "sysname" : "Linux", "release" : "3.10.0-327.18.2.el7.s390x", "version" : "#1 SMP Fri Apr 8 05:12:00 EDT 2016", "machine" : "s390x" }, "somap" : [ { "b" : "2AA00239000", "elfType" : 3, "buildId" : "CF6C7EF422BC03E4C5362A4EC1564B2BCA2D8ADF" }, { "b" : "3FFFD667000", "elfType" : 3, "buildId" : "43017C08E2C27E59D7CDF8176D7B32F1A2BDCA2B" }, { "b" : "3FFFD39A000", "path" : "/lib64/libnetsnmpmibs.so.31", "elfType" : 3, "buildId" : "DAC63208858A869367BFC2F35C1E96A6823F7077" }, { "b" : "3FFFD395000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "60F552516209C91EF1582D45F3A5D1D50C7A8D33" }, { "b" : "3FFFD324000", "path" : "/lib64/librpm.so.3", "elfType" : 3, "buildId" : "FE61553BEBBE962C552A2FDCF0CF0A94978F5DF3" }, { "b" : "3FFFD2F1000", "path" : "/lib64/librpmio.so.3", "elfType" : 3, "buildId" : "89C0448FE642CBC76DB971BBC0FD90D1A5A2881B" }, { "b" : "3FFFD27F000", "path" : "/lib64/libnetsnmpagent.so.31", "elfType" : 3, "buildId" : "257CC42FE8DBDE860399328330F62558A2889709" }, { "b" : "3FFFD272000", "path" : "/lib64/libwrap.so.0", "elfType" : 3, "buildId" : "97B6152FADEA9B67FD85208A212270B083B4D3D3" }, { "b" : "3FFFD166000", "path" : "/lib64/libnetsnmp.so.31", "elfType" : 3, "buildId" : "5C81DFECEF620E2FF07F7F2D4355FC249A763DCF" }, { "b" : "3FFFD0F4000", "path" : "/lib64/libssl.so.10", "elfType" : 3, "buildId" : "2AB6B2DE37A593F81AA88EAF1859CC1749D11E44" }, { "b" : "3FFFCF00000", "path" : "/lib64/libcrypto.so.10", "elfType" : 3, "buildId" : "0E0652D474738FCC4F1790D5B02EF05840E59C4C" }, { "b" : "3FFFCEDF000", "path" : "/lib64/libsasl2.so.3", "elfType" : 3, "buildId" : 
"2CCF600172DCCE6D35F6822C69B036F2296C202D" }, { "b" : "3FFFCE85000", "path" : "/lib64/libldap-2.4.so.2", "elfType" : 3, "buildId" : "8641F7E819814C9F56A5E743A9F7A841130CD8D2" }, { "b" : "3FFFCE73000", "path" : "/lib64/liblber-2.4.so.2", "elfType" : 3, "buildId" : "3CA911DAB40AABAACA649A13EB0CF8E76443E00C" }, { "b" : "3FFFCE03000", "path" : "/lib64/libcurl.so.4", "elfType" : 3, "buildId" : "905107DC9BBC73C238EF5DE7F6125F32CA0535DA" }, { "b" : "3FFFCDB5000", "path" : "/lib64/libgssapi_krb5.so.2", "elfType" : 3, "buildId" : "E38373598E0E7E519AA8A4CFD2D1830EFDD5E8A7" }, { "b" : "3FFFCD0B000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "404C8A7E6003FFABA9C11167AACCE19BA53BCDF3" }, { "b" : "3FFFCD00000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "BFE35DECB3578DDF2FCB55956E11AF367E62D645" }, { "b" : "3FFFCCEE000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "4A344AC9010C49DA503F43B5AF58997D63AC7555" }, { "b" : "3FFFCCD0000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "C7EA38ED674080E366B2B086481F5538B665BB2A" }, { "b" : "3FFFCB2E000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "892086DA74F42AB29AA2507AA6F9C6953BBBED89" }, { "b" : "3FFFD669000", "path" : "/lib/ld64.so.1", "elfType" : 3, "buildId" : "1C54437965D4CCF1FD39AAD021C6E8595FB4AACC" }, { "b" : "3FFFC976000", "path" : "/usr/lib64/perl5/CORE/libperl.so", "elfType" : 3, "buildId" : "6B41260255E6E61AD0474AA7F03918AC49578504" }, { "b" : "3FFFC959000", "path" : "/lib64/libresolv.so.2", "elfType" : 3, "buildId" : "B213D2332C99B6FA64F06E0558E3724749838D86" }, { "b" : "3FFFC93B000", "path" : "/lib64/libnsl.so.1", "elfType" : 3, "buildId" : "C47DC420F077319086D6062DA0E8064A49D4253A" }, { "b" : "3FFFC8FE000", "path" : "/lib64/libcrypt.so.1", "elfType" : 3, "buildId" : "F478ACF1E8D59690056C634D8DCEA8D468665F47" }, { "b" : "3FFFC8F9000", "path" : "/lib64/libutil.so.1", "elfType" : 3, "buildId" : "4C758A8EC2F91E09A0A5E704D60156BCD61B8F19" }, { "b" : 
"3FFFC7D5000", "path" : "/lib64/libnss3.so", "elfType" : 3, "buildId" : "DB9B88CDF8BBD42EBE5552440890340B25F5058B" }, { "b" : "3FFFC7C0000", "path" : "/lib64/libbz2.so.1", "elfType" : 3, "buildId" : "2BE3C75DF494579EB8083C5C18593F860007B22A" }, { "b" : "3FFFC7A7000", "path" : "/lib64/libz.so.1", "elfType" : 3, "buildId" : "EB957F2CF831B90B908EBC1D662CE43F78F9EEC4" }, { "b" : "3FFFC78A000", "path" : "/lib64/libelf.so.1", "elfType" : 3, "buildId" : "6AA177A662DF7E719E2591EE6F456E1A88A16BFB" }, { "b" : "3FFFC760000", "path" : "/lib64/liblzma.so.5", "elfType" : 3, "buildId" : "AB08ABF87F024731DFBF7073076E2C4EA34CD388" }, { "b" : "3FFFC754000", "path" : "/lib64/libpopt.so.0", "elfType" : 3, "buildId" : "7EED836ACBD633A305A2BF38BB92D60129955915" }, { "b" : "3FFFC72A000", "path" : "/lib64/libselinux.so.1", "elfType" : 3, "buildId" : "876795C57DF5BF8829D6F84F85FE11290B0C87F3" }, { "b" : "3FFFC723000", "path" : "/lib64/libcap.so.2", "elfType" : 3, "buildId" : "B1F9C9853C13A5FD3012DC3EA12BA06962A80934" }, { "b" : "3FFFC719000", "path" : "/lib64/libacl.so.1", "elfType" : 3, "buildId" : "C1DA365A229259B58F93745508368CE79F8F60A7" }, { "b" : "3FFFC6E4000", "path" : "/lib64/liblua-5.1.so", "elfType" : 3, "buildId" : "EFA4D9E393A50E4E5DCCD7E27A7B8C8718EA3291" }, { "b" : "3FFFC516000", "path" : "/lib64/libdb-5.3.so", "elfType" : 3, "buildId" : "FE479525DA5BEA463B8B22AC4BF0045ADF2C9F4C" }, { "b" : "3FFFC422000", "path" : "/lib64/libkrb5.so.3", "elfType" : 3, "buildId" : "2E9EDC17C542B3B6E1D24AF3C4C66865F4806309" }, { "b" : "3FFFC41D000", "path" : "/lib64/libcom_err.so.2", "elfType" : 3, "buildId" : "4C644BED6DFE8EC62DDC19BD973C6F61D212C1FF" }, { "b" : "3FFFC3E8000", "path" : "/lib64/libk5crypto.so.3", "elfType" : 3, "buildId" : "53F72D5943FFA1FBBADA7C4A18CA22DB93CBE2A1" }, { "b" : "3FFFC3A0000", "path" : "/lib64/libssl3.so", "elfType" : 3, "buildId" : "E39FB1BBF66AFBD0AC5B122B2480F5B566BA618B" }, { "b" : "3FFFC375000", "path" : "/lib64/libsmime3.so", "elfType" : 3, "buildId" : 
"7126E6B4D6B03C0A911BB715E4891466260CC354" }, { "b" : "3FFFC345000", "path" : "/lib64/libnssutil3.so", "elfType" : 3, "buildId" : "5CA3D371EBA025107DD75C1E86CFF576A783E873" }, { "b" : "3FFFC33F000", "path" : "/lib64/libplds4.so", "elfType" : 3, "buildId" : "CBCE1137588626AFCEBD70A8CA8854846E584186" }, { "b" : "3FFFC339000", "path" : "/lib64/libplc4.so", "elfType" : 3, "buildId" : "B33DDEC0F749BBD89AE08B68A8C0B3E64D68D044" }, { "b" : "3FFFC2F2000", "path" : "/lib64/libnspr4.so", "elfType" : 3, "buildId" : "122223F99E6BC81302BE75EF214731F09886FE17" }, { "b" : "3FFFC2BD000", "path" : "/lib64/libidn.so.11", "elfType" : 3, "buildId" : "204ECBE872A500C308942F3445FA78DC63E5423C" }, { "b" : "3FFFC291000", "path" : "/lib64/libssh2.so.1", "elfType" : 3, "buildId" : "B41C5745CE9C10A08ACBA6AEE51A72A0773E8D48" }, { "b" : "3FFFC280000", "path" : "/lib64/libkrb5support.so.0", "elfType" : 3, "buildId" : "AFC03845EF5A69D358AAF8F8E6A30C47F76DB531" }, { "b" : "3FFFC27A000", "path" : "/lib64/libkeyutils.so.1", "elfType" : 3, "buildId" : "5C364755C69EDB23BA9C3AAE23390A8CB67A5060" }, { "b" : "3FFFC276000", "path" : "/lib64/libfreebl3.so", "elfType" : 3, "buildId" : "F08FF2C220D997AE177C871B8921591B2C4B11E8" }, { "b" : "3FFFC230000", "path" : "/lib64/libpcre.so.1", "elfType" : 3, "buildId" : "19CE30E7CB33F9DBF421E2329D245608971AA9D9" }, { "b" : "3FFFC229000", "path" : "/lib64/libattr.so.1", "elfType" : 3, "buildId" : "4B4DC88F908C231178FF24B06AEFA75605F86F16" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x5C) [0x2aa0199b00c]
 mongod(+0x1761028) [0x2aa0199a028]
 mongod(+0x17616A2) [0x2aa0199a6a2]
 ??? [0x3ffffa71118]
 mongod(+0x2081F64) [0x2aa022baf64]
-----  END BACKTRACE  -----
Illegal instruction (core dumped)
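
The raw addresses printed above the backtrace can be cross-checked against the JSON: each frame's absolute address is its load base ("b") plus its offset ("o"). A minimal sketch of that arithmetic in POSIX shell follows. The function names resolve_frame and frame_offset are hypothetical helpers for illustration, not part of any MongoDB tooling.

```shell
#!/bin/sh
# Hypothetical helpers for cross-checking the backtrace in this ticket.

# resolve_frame BASE OFFSET: absolute address of a frame, given the "b" and
# "o" hex strings (without 0x prefix) from the JSON backtrace.
resolve_frame() {
    printf '0x%x\n' "$((0x$1 + 0x$2))"
}

# frame_offset ADDRESS BASE: offset of an absolute address inside the
# module loaded at BASE (both hex strings without 0x prefix).
frame_offset() {
    printf '0x%x\n' "$((0x$1 - 0x$2))"
}

resolve_frame 2AA00239000 176200C     # first frame (printStackTrace)
resolve_frame 2AA00239000 2081F64     # last mongod frame
frame_offset 2aa022baf5e 2AA00239000  # "Invalid operation at address"
```

With the values from this ticket the three calls print 0x2aa0199b00c, 0x2aa022baf64 and 0x2081f5e. The first two match the addresses in the log. The third is the offset of the reported invalid operation inside the mongod binary, 6 bytes before the last mongod frame (0x2081F64).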


 Description   

MongoDB fails to start on a zSeries z12 with RHEL 7.2. The same build works perfectly well on a z13 machine. The difference between z12 and z13 is quite big, since there were a lot of changes in the zSeries instruction set.
I assume that this may boil down to a compile issue where z12 vs. z13 is a target option. I did not check that in detail, but I think this option is available with gcc as -march=zEC12 vs. -march=z13; there might be more options...



 Comments   
Comment by Andrew Morrow (Inactive) [ 14/Mar/17 ]

Hi Michael -

If you have the opportunity, it would be great if you could confirm that 3.4.3-rc1 runs correctly in your z12 environment, as that would confirm that the issue originally raised here (and in SERVER-27963) has been addressed, before we issue 3.4.3 final. You can download the 3.4.3-rc1 release for RHEL 7.2 s390x here:

https://downloads.mongodb.com/linux/mongodb-linux-s390x-enterprise-rhel72-3.4.3-rc1.tgz

Subsequently, when we issue a 3.4 release containing the fix for SERVER-28007 (hopefully 3.4.4), it would also be much appreciated if you could confirm that that release continues to run on your z12 hardware, to ensure that the fix for SERVER-28007 doesn't cause a regression and re-introduce SERVER-27963. We are unfortunately not in a good position to test that ourselves since all our s390x systems are z13.

I hope your performance investigations re MongoDB on zSeries are going well.

Comment by Michael Höller [ 13/Mar/17 ]

Hi, no worries - I realized that. I still owe you a test. Unfortunately I am kind of "locked" in Think Tank this past week and next week.
If a test with a nightly build really adds value, please let me know and I will see that I get some time and a z12 LPAR to run a test.

All the best
Michael

Comment by Andrew Morrow (Inactive) [ 13/Mar/17 ]

Michael - Apologies, I got that wrong. I expect the fix for SERVER-28007 will arrive in 3.4.4, because 3.4.3 is already in RC.

Comment by Andrew Morrow (Inactive) [ 13/Mar/17 ]

Hi Michael - Yes, that is the ticket to watch. I would expect that the fix will arrive in the MongoDB 3.4.3 release (3.4.2 is not yet out, but I don't think the fix will make that release as it has already gone to a release candidate).

Comment by Michael Höller [ 04/Mar/17 ]

Hello Andrew,

I am fine with closing this ticket if you already have one. Can you please provide me the link to that one?
I found https://jira.mongodb.org/browse/SERVER-28007 but I am not sure if this is the one you mean.

I will do some tests and provide information as far as I can, since I am under an NDA...

Thanks a lot
Michael

Comment by Andrew Morrow (Inactive) [ 28/Feb/17 ]

Michael - I'm going to close this ticket as a duplicate, since we have other tickets that are tracking this work. Please feel free to re-open it if you have any additional comments, or if the nightly build that I've provided doesn't solve the issue for you. If you find interesting things in your performance testing, please feel free to open new tickets.

Comment by Andrew Morrow (Inactive) [ 24/Feb/17 ]

Hi Michael -

You can download a nightly build of the s390x RHEL 7.2 build here: https://s3.amazonaws.com/mciuploads/mongodb-mongo-v3.4/enterprise-rhel-72-s390x/131e03e5dc4fa94d44f600d2844b470ede4f1d4e/binaries/mongo-mongodb_mongo_v3.4_enterprise_rhel_72_s390x_131e03e5dc4fa94d44f600d2844b470ede4f1d4e_17_02_17_14_50_40.tgz

Of course, the usual caveats apply - this is a nightly build so is not suitable for production use, etc.

Note that if you are doing performance testing and your benchmark system is not a z13 system, this nightly build or the upcoming 3.4.3 release should be fine to use for benchmarking, because a pre-z13 system never could have taken advantage of the VX extensions for CRC32C in the first place.

We are happy to help, and if you have any interesting results from your performance testing, please let us know.

Comment by Michael Höller [ 24/Feb/17 ]

Hello Andrew,
thanks for the quick answer. I actually do performance testing, so I will most likely skip version 3.4.3. However, I am happy to help and do some testing. I will be out of the office until March 6th, so if you can wait until the 6th / 7th I can do some testing. It would be great if you could mail me instructions to obtain the nightly before the 6th so that I can start working on it in my morning (I am on CET).

Thanks a lot
Michael

Comment by Andrew Morrow (Inactive) [ 24/Feb/17 ]

Hi -

This is a known issue. The RHEL 7.2 build of MongoDB 3.4 is intended to run on z12, but there is a bug that introduces a hard dependency on the z13-only VX instructions, which we use for hardware-accelerated CRC32C support in the WiredTiger storage engine.

To work around this issue, we have temporarily disabled hardware CRC32 support in all zSeries builds in SERVER-27963. That change will be available in the upcoming MongoDB 3.4.3 release, which should run on z12 systems. We will later re-introduce software detection for the availability of VX instructions in SERVER-28007, so that VX capable hardware (z13, etc) offers the best performance.
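
The software detection mentioned above could, on Linux, take the form of a check for the vector facility before the VX code path is selected. The sketch below is not the SERVER-28007 implementation. It only illustrates the idea from the shell, assuming the kernel lists the facility as vx on the features line of /proc/cpuinfo, which Linux on s390x does when the kernel itself supports the vector facility. The helper names has_vx and crc32c_impl are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the cpuinfo-style file in $1 (default
# /proc/cpuinfo) advertises the "vx" facility on its "features" line.
has_vx() {
    grep -E '^features[[:space:]]*:' "${1:-/proc/cpuinfo}" 2>/dev/null | grep -qw vx
}

# Report which CRC32C code path a runtime dispatcher would be free to pick.
crc32c_impl() {
    if has_vx "$@"; then
        echo "hardware (VX)"
    else
        echo "software"
    fi
}

crc32c_impl
```

On a machine or kernel without the vector facility this prints "software". An in-process check would more likely read the hardware capability bits directly, for example through getauxval(AT_HWCAP), rather than parsing /proc/cpuinfo.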

If you are doing testing or qualification and a non-production build would be useful for you, we can provide instructions on how to obtain a nightly build of the v3.4 branch that contains the fix that will be delivered in the upcoming 3.4.3 release. If possible, we would also appreciate your testing the build to ensure that it works around the bug on your affected system.

Generated at Thu Feb 08 04:17:06 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.