Type: Bug
Resolution: Cannot Reproduce
Priority: Critical - P2
Affects Version/s: 1.6.5
Component/s: Replication
Environment: Xen DomU guest running Debian Squeeze (kernel 2.6.32-5-xen-amd64)
MongoDB 1.6.5, downloaded from the website as a binary
mongod configured with replSet
Linux
mongod runs fine, but when I run
kill $(cat /mongo/db/mongod.lock)
it sometimes (roughly 1 time in 4) appears to cause a kernel panic.
In my testing, this only seems to occur when the mongod has been added to a replica set using the replSet option.
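For repeated testing, the shutdown step can be wrapped in a small helper. This is only a minimal sketch, assuming the lock file contains nothing but the mongod PID, as the kill command above implies. The function name stop_from_lockfile is hypothetical and not part of MongoDB.

```shell
#!/bin/sh
# Hypothetical helper: send SIGTERM to the PID stored in a lock file.
# Assumes the file holds only a PID, as mongod.lock does in this report.
stop_from_lockfile() {
    lockfile=$1

    # An empty or missing lock file means there is nothing to signal.
    [ -s "$lockfile" ] || { echo "no PID in $lockfile" >&2; return 1; }
    pid=$(cat "$lockfile")

    # Refuse anything that is not a plain positive integer.
    case $pid in
        ''|*[!0-9]*) echo "unexpected content in $lockfile" >&2; return 1 ;;
    esac

    # kill sends SIGTERM by default, the same signal used in the report.
    kill "$pid"
}
```

Usage would be `stop_from_lockfile /mongo/db/mongod.lock`, run once per restart cycle while watching the guest console for the panic.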
This is the stack trace:
[58625.873310] alignment check: 0000 1 SMP
[58625.873317] last sysfs file: /sys/devices/virtual/net/lo/operstate
[58625.873320] CPU 0
[58625.873323] Modules linked in: snd_pcm snd_timer snd soundcore snd_page_alloc pcspkr evdev xfs exportfs xen_netfront xen_blkfront
[58625.873336] Pid: 8539, comm: mongod Not tainted 2.6.32-5-xen-amd64 #1
[58625.873339] RIP: e030:[<ffffffff81270c0b>] [<ffffffff81270c0b>] eth_type_trans+0x3d/0xae
[58625.873347] RSP: e02b:ffff880001c93988 EFLAGS: 00050246
[58625.873350] RAX: ffff88002efd20fc RBX: ffff88002e3b12e8 RCX: ffff88002efd20ee
[58625.873354] RDX: 0000000000000042 RSI: 000000000000000e RDI: ffff88002e3b12e8
[58625.873357] RBP: ffff88002fc3e800 R08: 0000000000000000 R09: 0000000000000000
[58625.873361] R10: 000000000000000e R11: ffffffff8125fbaf R12: ffff88002e3a2080
[58625.873364] R13: ffff88002fc3e800 R14: ffff88002fdea980 R15: ffffffff81350270
[58625.873371] FS: 00007ff239953710(0000) GS:ffff8800031ac000(0000) knlGS:0000000000000000
[58625.873375] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[58625.873378] CR2: 000000000080a45c CR3: 0000000001001000 CR4: 0000000000002660
[58625.873382] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[58625.873385] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[58625.873389] Process mongod (pid: 8539, threadinfo ffff880001c92000, task ffff88002eab2350)
[58625.873392] Stack:
[58625.873394] 0000000000000000 ffff88002fc3e800 ffff88002e3b12e8 ffffffff812398d0
[58625.873399] <0> 0000000000000000 ffff88002e3b12e8 ffff88002e3a2080 ffffffff8125f9e4
[58625.873407] <0> ffffffff8100ecdf 0000000000000000 ffff88002fdea980 ffff88002e3a2080
[58625.873414] Call Trace:
[58625.873418] [<ffffffff812398d0>] ? loopback_xmit+0x36/0x7a
[58625.873422] [<ffffffff8125f9e4>] ? dev_hard_start_xmit+0x211/0x2db
[58625.873428] [<ffffffff8100ecdf>] ? xen_restore_fl_direct_end+0x0/0x1
[58625.873432] [<ffffffff8125fe8c>] ? dev_queue_xmit+0x2dd/0x38d
[58625.873437] [<ffffffff81287483>] ? ip_queue_xmit+0x311/0x386
[58625.873487] [<ffffffffa004744d>] ? xfs_log_release_iclog+0x10/0x38 [xfs]
[58625.873498] [<ffffffffa00515f5>] ? _xfs_trans_commit+0x25f/0x2d1 [xfs]
[58625.873502] [<ffffffff8100e63d>] ? xen_force_evtchn_callback+0x9/0xa
[58625.873507] [<ffffffff81297e33>] ? tcp_transmit_skb+0x648/0x687
[58625.873511] [<ffffffff8100ecf2>] ? check_events+0x12/0x20
[58625.873515] [<ffffffff8129a2b5>] ? tcp_write_xmit+0x874/0x96c
[58625.873518] [<ffffffff8129a3fa>] ? __tcp_push_pending_frames+0x22/0x53
[58625.873523] [<ffffffff8128d7fd>] ? tcp_close+0x176/0x3d0
[58625.873528] [<ffffffff812aa2f8>] ? inet_release+0x4e/0x54
[58625.873533] [<ffffffff81251121>] ? sock_release+0x19/0x66
[58625.873536] [<ffffffff81251190>] ? sock_close+0x22/0x26
[58625.873541] [<ffffffff810f09c9>] ? __fput+0x100/0x1af
[58625.873545] [<ffffffff810ede06>] ? filp_close+0x5b/0x62
[58625.873549] [<ffffffff810508a0>] ? put_files_struct+0x64/0xc1
[58625.873553] [<ffffffff8105215d>] ? do_exit+0x22e/0x6c6
[58625.873557] [<ffffffff81052165>] ? do_exit+0x236/0x6c6
[58625.873560] [<ffffffff8105266b>] ? do_group_exit+0x76/0x9d
[58625.873565] [<ffffffff8105eef7>] ? get_signal_to_deliver+0x310/0x339
[58625.873570] [<ffffffff8101104f>] ? do_notify_resume+0x87/0x73f
[58625.873573] [<ffffffff8100b444>] ? xen_write_msr_safe+0x76/0xb1
[58625.873577] [<ffffffff810106c4>] ? __switch_to+0x1ad/0x297
[58625.873582] [<ffffffff81049045>] ? finish_task_switch+0x44/0xaf
[58625.873586] [<ffffffff81011e0e>] ? int_signal+0x12/0x17
[58625.873588] Code: 87 d8 00 00 00 2b 87 d0 00 00 00 be 0e 00 00 00 89 87 c4 00 00 00 e8 68 48 fe ff 8b 8b c4 00 00 00 48 03 8b d0 00 00 00 f6 01 01 <48> 8b 11 74 20 48 33 95 40 02 00 00 8a 43 7d 48 c1 e2 10 75 08
[58625.873630] RIP [<ffffffff81270c0b>] eth_type_trans+0x3d/0xae
[58625.873634] RSP <ffff880001c93988>
[58625.873639] --[ end trace f73fe61a27c51fab ]--
[58625.873641] Kernel panic - not syncing: Fatal exception in interrupt
[58625.873645] Pid: 8539, comm: mongod Tainted: G D 2.6.32-5-xen-amd64 #1
[58625.873648] Call Trace:
[58625.873652] [<ffffffff8130ac81>] ? panic+0x86/0x143
[58625.873657] [<ffffffff8130cb3a>] ? _spin_unlock_irqrestore+0xd/0xe
[58625.873661] [<ffffffff8100ecdf>] ? xen_restore_fl_direct_end+0x0/0x1
[58625.873664] [<ffffffff8130cb3a>] ? _spin_unlock_irqrestore+0xd/0xe
[58625.873668] [<ffffffff8104f3af>] ? release_console_sem+0x17e/0x1af
[58625.873672] [<ffffffff8130d9d5>] ? oops_end+0xa7/0xb4
[58625.873676] [<ffffffff81013416>] ? do_alignment_check+0x88/0x92
[58625.873680] [<ffffffff8125fbaf>] ? dev_queue_xmit+0x0/0x38d
[58625.873685] [<ffffffff811f1976>] ? HYPERVISOR_event_channel_op+0x11/0x50
[58625.873695] [<ffffffffa004d6f9>] ? xfs_icsb_modify_counters+0x7b/0x1a0 [xfs]
[58625.873699] [<ffffffff81012a75>] ? alignment_check+0x25/0x30
[58625.873703] [<ffffffff8125fbaf>] ? dev_queue_xmit+0x0/0x38d
[58625.873706] [<ffffffff81270c0b>] ? eth_type_trans+0x3d/0xae
[58625.873710] [<ffffffff81270bfb>] ? eth_type_trans+0x2d/0xae
[58625.873713] [<ffffffff812398d0>] ? loopback_xmit+0x36/0x7a
[58625.873717] [<ffffffff8125f9e4>] ? dev_hard_start_xmit+0x211/0x2db
[58625.873721] [<ffffffff8100ecdf>] ? xen_restore_fl_direct_end+0x0/0x1
[58625.873724] [<ffffffff8125fe8c>] ? dev_queue_xmit+0x2dd/0x38d
[58625.873728] [<ffffffff81287483>] ? ip_queue_xmit+0x311/0x386
[58625.873738] [<ffffffffa004744d>] ? xfs_log_release_iclog+0x10/0x38 [xfs]
[58625.873747] [<ffffffffa00515f5>] ? _xfs_trans_commit+0x25f/0x2d1 [xfs]
[58625.873752] [<ffffffff8100e63d>] ? xen_force_evtchn_callback+0x9/0xa
[58625.873755] [<ffffffff81297e33>] ? tcp_transmit_skb+0x648/0x687
[58625.873759] [<ffffffff8100ecf2>] ? check_events+0x12/0x20
[58625.873762] [<ffffffff8129a2b5>] ? tcp_write_xmit+0x874/0x96c
[58625.873766] [<ffffffff8129a3fa>] ? __tcp_push_pending_frames+0x22/0x53
[58625.873770] [<ffffffff8128d7fd>] ? tcp_close+0x176/0x3d0
[58625.873773] [<ffffffff812aa2f8>] ? inet_release+0x4e/0x54
[58625.873777] [<ffffffff81251121>] ? sock_release+0x19/0x66
[58625.873780] [<ffffffff81251190>] ? sock_close+0x22/0x26
[58625.873784] [<ffffffff810f09c9>] ? __fput+0x100/0x1af
[58625.873787] [<ffffffff810ede06>] ? filp_close+0x5b/0x62
[58625.873791] [<ffffffff810508a0>] ? put_files_struct+0x64/0xc1
[58625.873794] [<ffffffff8105215d>] ? do_exit+0x22e/0x6c6
[58625.873797] [<ffffffff81052165>] ? do_exit+0x236/0x6c6
[58625.873801] [<ffffffff8105266b>] ? do_group_exit+0x76/0x9d
[58625.873804] [<ffffffff8105eef7>] ? get_signal_to_deliver+0x310/0x339
[58625.873808] [<ffffffff8101104f>] ? do_notify_resume+0x87/0x73f
[58625.873812] [<ffffffff8100b444>] ? xen_write_msr_safe+0x76/0xb1
[58625.873815] [<ffffffff810106c4>] ? __switch_to+0x1ad/0x297
[58625.873819] [<ffffffff81049045>] ? finish_task_switch+0x44/0xaf
[58625.873822] [<ffffffff81011e0e>] ? int_signal+0x12/0x17
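To make the trace easier to read, the symbol names can be pulled out of the Call Trace frames. A minimal sketch, assuming the console output has been saved to a file in the format shown above, where each frame looks like `[timestamp] [<address>] ? symbol+0xoffset/0xlength`. The function name trace_symbols is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print one symbol per line for every call-trace frame
# in a saved oops log, in the order the frames appear (innermost first).
# Register, RIP, and Code lines carry no "] ? " marker and are skipped.
trace_symbols() {
    sed -n 's/^.*\[<[0-9a-f]*>\] ? \([A-Za-z0-9_.]*\)+0x[0-9a-f]*\/0x[0-9a-f]*.*$/\1/p' "$1"
}
```

Run against the first trace above, the output begins with loopback_xmit and dev_hard_start_xmit and ends with int_signal. Frames marked with `?` are ones the kernel could not fully verify, so the list is a guide to the call path and not an exact record.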