[SERVER-16180] Crash running a single node WT replset Created: 17/Nov/14  Updated: 28/Nov/14  Resolved: 18/Nov/14

Status: Closed
Project: Core Server
Component/s: Replication, Stability
Affects Version/s: 2.8.0-rc0
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Alvin Richards (Inactive)
Assignee: Unassigned
Resolution: Cannot Reproduce
Votes: 0
Labels: 28qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
related to SERVER-16345 WiredTiger primary died with: Fatal D... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

Problem

Running WiredTiger (WT) in a single-node replica set produced the following trace in the mongo shell:

Mixed.v0.FineThenUpdate-50-50
1       5287.041538583684
2       9984.94758652438
4       16071.774246445602
8       19160.066759706086
12      31933.42156298285
16      29805.830193150017
2014-11-16T18:14:35.735-0500 I NETWORK  Socket recv() errno:104 Connection reset by peer 127.0.0.1:27017
...
2014-11-16T18:14:35.738-0500 I NETWORK  DBClientCursor::init call() failed
2014-11-16T18:14:35.740-0500 F -        Got signal: 11 (Segmentation fault).
 
0x812b49 0x812702 0x812a2e 0x7ffff7bcf130 0x77fcd7 0x78481d 0x85f1d4 0x7ffff7bc7df3 0x7ffff6cca01d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"412B49"},{"b":"400000","o":"412702"},{"b":"400000","o":"412A2E"},{"b":"7FFFF7BC0000","o":"F130"},{"b":"400000","o":"37FCD7"},{"b":"400000","o":"38481D"},{"b":"400000","o":"45F1D4"},{"b":"7FFFF7BC0000","o":"7DF3"},{"b":"7FFFF6BD4000","o":"F601D"}],"processInfo":{ "mongodbVersion" : "2.8.0-rc1-pre-", "gitVersion" : "d12f3728b725615cb62b89396efbd3c8c059524f", "uname" : { "sysname" : "Linux", "release" : "3.10.0-123.el7.x86_64", "version" : "#1 SMP Mon May 5 11:16:57 EDT 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "CEF9A005EA63443EE4E5F8032F12F0739B8CC4D6" }, { "b" : "7FFFF88FA000", "elfType" : 3, "buildId" : "EBD9FBF2265129CEAB3866D40C826C9629F08CD0" }, { "b" : "7FFFF7BC0000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "18562EE0363BC9BD7101610BD86469AA426D0C44" }, { "b" : "7FFFF79B8000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "8832E3070AB0758762836EEC8FCDDEDEF8235340" }, { "b" : "7FFFF77B4000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "B7C4BC0854BF5DE16B535353B38235CA42349C1E" }, { "b" : "7FFFF74AD000", "path" : "/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "63C62D6263FF98E6DD6896CB3E716E499744A4C9" }, { "b" : "7FFFF71AB000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "D70EAB176DDA46DE292FEB8208A0E8A6718BAF3B" }, { "b" : "7FFFF6F95000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "07120A9AC1BF3BCDD4A3EA1E0C47234A4A5C84F9" }, { "b" : "7FFFF6BD4000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "CFF370844D00EA5451D7ADD439646A93C64D48A5" }, { "b" : "7FFFF7DDC000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "4EADCA6CB82E0A85EDB87C15B5E3980742514501" } ] }}
mongo(_ZN5mongo15printStackTraceERSo+0x29) [0x812b49]
mongo(+0x412702) [0x812702]
mongo(+0x412A2E) [0x812a2e]
libpthread.so.0(+0xF130) [0x7ffff7bcf130]
mongo(_ZN5mongo14BenchRunWorker24generateLoadOnConnectionEPNS_12DBClientBaseE+0x1F27) [0x77fcd7]
mongo(_ZN5mongo14BenchRunWorker3runEv+0xED) [0x78481d]
mongo(+0x45F1D4) [0x85f1d4]
libpthread.so.0(+0x7DF3) [0x7ffff7bc7df3]
libc.so.6(clone+0x6D) [0x7ffff6cca01d]
-----  END BACKTRACE  -----
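
Note on reading the trace: the JSON line in the backtrace pairs each frame's load base ("b") with an offset ("o"), both in hex, and the absolute addresses printed above it appear to be their sum. The following is a minimal Python sketch of that arithmetic, using only the first four frames copied from the trace above; the helper name absolute_addresses is hypothetical and is not part of any MongoDB tooling.

```python
import json

# First four frames copied from the "backtrace" array in the trace above.
# "b" is the module load base and "o" the offset within it, both hex strings.
raw = (
    '{"backtrace":['
    '{"b":"400000","o":"412B49"},'
    '{"b":"400000","o":"412702"},'
    '{"b":"400000","o":"412A2E"},'
    '{"b":"7FFFF7BC0000","o":"F130"}'
    ']}'
)


def absolute_addresses(backtrace_json):
    """Return each frame's absolute address, computed as base + offset."""
    frames = json.loads(backtrace_json)["backtrace"]
    return [int(frame["b"], 16) + int(frame["o"], 16) for frame in frames]


addresses = absolute_addresses(raw)
print(" ".join(hex(address) for address in addresses))
# prints: 0x812b49 0x812702 0x812a2e 0x7ffff7bcf130
```

The printed values match the first four addresses on the raw address line that precedes BEGIN BACKTRACE, which is a quick way to check that a given symbolized frame belongs to the expected module.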

Reproduce

Reproduces with the following:

/home/ec2-user/mongodb-linux-x86_64-2.8.0-rc0/bin/mongod --port 27017 --dbpath /data2/db/db100 --logpath /data3/db/db100/server.log --fork --storageEngine=wiredtiger --wiredTigerEngineConfig 'checkpoint=(wait=14400)' --master --oplogSize 500
 
shell> mongoPerfRunTests([1, 2, 4, 8, 12, 16, 20], 1, 1, 5, 1, 'sanity-2.8.0-rc0-wiredtiger-single', 'sanity', '54.191.70.12', '27017', '2014-11-16 19:53:57.535852', 0, {"writeCmdMode": "true", "writeConcernW": 0, "safeGLE": "false", "writeConcernJ": "false"}, {"server_git_hash": "b6c4e2491c1442b05a160acda0d78001f56a2ade", "server_storage_engine": "wiredtiger", "harness": {"git_hash": "unknown", "client": {"git_hash": "d12f3728b725615cb62b89396efbd3c8c059524f", "version": "2.8.0-rc1-pre-", "name": "mongo shell"}, "name": "mongo-perf", "version": "unknown"}, "server_git_commit_date": "2014-11-11 16:44:18", "server_version": "2.8.0-rc0"});



 Comments   
Comment by Eric Milkie [ 17/Nov/14 ]

Over the last week, we've made some commits (post-RC0) that improve the handling of WriteConflicts when writing to the oplog. Can you try the reproducer using a later build?
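
For readers unfamiliar with the pattern, write conflicts under optimistic concurrency are commonly handled by retrying the conflicting operation rather than treating the conflict as fatal. The sketch below illustrates that general idea in Python only; it is not the server's implementation (the server is written in C++), and the WriteConflict class, retry_on_write_conflict helper, and flaky_oplog_write function are all hypothetical names invented for this illustration.

```python
class WriteConflict(Exception):
    """Hypothetical stand-in for a storage-engine write conflict."""


def retry_on_write_conflict(operation, max_attempts=5):
    """Call operation(), retrying when it raises WriteConflict.

    The conflict is re-raised once max_attempts calls have all failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except WriteConflict:
            if attempt == max_attempts:
                raise


# Simulated write that conflicts twice and then succeeds (illustrative only).
attempts = {"count": 0}


def flaky_oplog_write():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise WriteConflict("simulated conflict")
    return "written"


result = retry_on_write_conflict(flaky_oplog_write)
print(result, attempts["count"])  # prints: written 3
```

A real implementation would also need to roll back the failed unit of work before each retry; that step is omitted here.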

Comment by Daniel Pasette (Inactive) [ 17/Nov/14 ]

The mongo-perf command should be:
python benchrun.py -f testcases/mixed_small.js --mongo-repo-path ~/code/mongo --nodyno --threads 1 4 8 16 20

Comment by Alvin Richards (Inactive) [ 17/Nov/14 ]

This is what was last recorded in the server log

1 keyUpdates:0 numYields:0  reslen:80 612ms
2014-11-16T21:56:50.647-0500 I WRITES   [conn978] insert test0.Insert_JustID0 query: { _id: ObjectId('546963f103348d37c69224b6') } ninserted:1 keyUpdates:0 numYields:0  787ms
2014-11-16T21:56:50.647-0500 I QUERY    [conn978] command test0.$cmd command: insert { insert: "Insert_JustID0", documents: [ { _id: ObjectId('546963f103348d37c69224b6') } ] } ntoreturn:1 keyUpdates:0 numYields:0  reslen:80 787ms
2014-11-16T21:56:52.197-0500 F REPLSETS [conn970] Fatal DBException in logOp(): 112 WriteConflict
2014-11-16T21:56:52.200-0500 F REPLSETS [conn973] Fatal DBException in logOp(): 112 WriteConflict
2014-11-16T21:56:52.201-0500 F REPLSETS [conn982] Fatal DBException in logOp(): 112 WriteConflict
2014-11-16T21:56:52.207-0500 F -        [conn970] terminate() called.
 
 0xf6ffb9 0xf6fd08 0x7ffff750b946 0x7ffff750b973 0xc4a24c 0x99f8aa 0x99fc29 0x9a10b4 0x9a17c5 0x9a3919 0x9bdf84 0x9bee61 0x9bf932 0xbc0e3b 0xaa5483 0x7e82b0 0xf2c691 0x7ffff7bc7df3 0x7ffff6cca01d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"B6FFB9"},{"b":"400000","o":"B6FD08"},{"b":"7FFFF74AD000","o":"5E946"},{"b":"7FFFF74AD000","o":"5E973"},{"b":"400000","o":"84A24C"},{"b":"400000","o":"59F8AA"},{"b":"400000","o":"59FC29"},{"b":"400000","o":"5A10B4"},{"b":"400000","o":"5A17C5"},{"b":"400000","o":"5A3919"},{"b":"400000","o":"5BDF84"},{"b":"400000","o":"5BEE61"},{"b":"400000","o":"5BF932"},{"b":"400000","o":"7C0E3B"},{"b":"400000","o":"6A5483"},{"b":"400000","o":"3E82B0"},{"b":"400000","o":"B2C691"},{"b":"7FFFF7BC0000","o":"7DF3"},{"b":"7FFFF6BD4000","o":"F601D"}],"processInfo":{ "mongodbVersion" : "2.8.0-rc0", "gitVersion" : "b6c4e2491c1442b05a160acda0d78001f56a2ade", "uname" : { "sysname" : "Linux", "release" : "3.10.0-123.el7.x86_64", "version" : "#1 SMP Mon May 5 11:16:57 EDT 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000" }, { "b" : "7FFFF88FA000", "elfType" : 3 }, { "b" : "7FFFF7BC0000", "path" : "/lib64/libpthread.so.0", "elfType" : 3 }, { "b" : "7FFFF79B8000", "path" : "/lib64/librt.so.1", "elfType" : 3 }, { "b" : "7FFFF77B4000", "path" : "/lib64/libdl.so.2", "elfType" : 3 }, { "b" : "7FFFF74AD000", "path" : "/lib64/libstdc++.so.6", "elfType" : 3 }, { "b" : "7FFFF71AB000", "path" : "/lib64/libm.so.6", "elfType" : 3 }, { "b" : "7FFFF6F95000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3 }, { "b" : "7FFFF6BD4000", "path" : "/lib64/libc.so.6", "elfType" : 3 }, { "b" : "7FFFF7DDC000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3 } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf6ffb9]
 mongod(+0xB6FD08) [0xf6fd08]
 libstdc++.so.6(+0x5E946) [0x7ffff750b946]
 libstdc++.so.6(+0x5E973) [0x7ffff750b973]
 mongod(_ZN5mongo4repl5logOpEPNS_16OperationContextEPKcS4_RKNS_7BSONObjEPS5_Pbb+0x1DC) [0xc4a24c]
 mongod(_ZN5mongo18WriteBatchExecutor13execOneInsertEPNS0_16ExecInsertsStateEPPNS_16WriteErrorDetailE+0x3BA) [0x99f8aa]
 mongod(_ZN5mongo18WriteBatchExecutor11execInsertsERKNS_21BatchedCommandRequestEPSt6vectorIPNS_16WriteErrorDetailESaIS6_EE+0x239) [0x99fc29]
 mongod(_ZN5mongo18WriteBatchExecutor11bulkExecuteERKNS_21BatchedCommandRequestEPSt6vectorIPNS_19BatchedUpsertDetailESaIS6_EEPS4_IPNS_16WriteErrorDetailESaISB_EE+0x34) [0x9a10b4]
 mongod(_ZN5mongo18WriteBatchExecutor12executeBatchERKNS_21BatchedCommandRequestEPNS_22BatchedCommandResponseE+0x3A5) [0x9a17c5]
 mongod(_ZN5mongo8WriteCmd3runEPNS_16OperationContextERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x169) [0x9a3919]
 mongod(_ZN5mongo12_execCommandEPNS_16OperationContextEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x34) [0x9bdf84]
 mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_iPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xCA1) [0x9bee61]
 mongod(_ZN5mongo12_runCommandsEPNS_16OperationContextEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x222) [0x9bf932]
 mongod(_ZN5mongo11newRunQueryEPNS_16OperationContextERNS_7MessageERNS_12QueryMessageERNS_5CurOpES3_b+0x105B) [0xbc0e3b]
 mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortEb+0xBC3) [0xaa5483]
 mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xE0) [0x7e82b0]
 mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x421) [0xf2c691]
 libpthread.so.0(+0x7DF3) [0x7ffff7bc7df3]
 libc.so.6(clone+0x6D) [0x7ffff6cca01d]

Comment by Scott Hernandez (Inactive) [ 17/Nov/14 ]

This looks like just a shell error. Are there server logs indicating a problem there? Did the server error or fail in any way? If not, this seems unrelated to WiredTiger or replication and is simply a client issue.
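
One way to answer the question above is to filter the server log for lines at severity F (fatal). The following Python sketch does this for a handful of lines copied from the server log excerpt in this ticket. The assumed line layout (timestamp, one-letter severity, component, optional [context], message) is inferred from those excerpts rather than from a format specification, and the fatal_lines helper is a hypothetical name.

```python
import re

# Assumed layout: timestamp, severity letter, component, optional [context], message.
LOG_LINE = re.compile(r"^(\S+)\s+([A-Z])\s+(\S+)\s+(?:\[(\w+)\]\s+)?(.*)$")

# Lines copied from the server log excerpt in this ticket.
sample_log = """\
2014-11-16T21:56:50.647-0500 I WRITES   [conn978] insert test0.Insert_JustID0 query: { _id: ObjectId('546963f103348d37c69224b6') } ninserted:1 keyUpdates:0 numYields:0  787ms
2014-11-16T21:56:52.197-0500 F REPLSETS [conn970] Fatal DBException in logOp(): 112 WriteConflict
2014-11-16T21:56:52.200-0500 F REPLSETS [conn973] Fatal DBException in logOp(): 112 WriteConflict
2014-11-16T21:56:52.207-0500 F -        [conn970] terminate() called.
"""


def fatal_lines(log_text):
    """Return (timestamp, component, context, message) for each severity-F line."""
    results = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group(2) == "F":
            timestamp, _severity, component, context, message = match.groups()
            results.append((timestamp, component, context, message))
    return results


fatal = fatal_lines(sample_log)
for timestamp, component, context, message in fatal:
    print(timestamp, component, context, message)
```

On the sample above this reports three fatal lines: two from the REPLSETS component and the final terminate() call.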
