Details
-
Bug
-
Resolution: Cannot Reproduce
-
Major - P3
-
None
-
1.3.3
-
None
-
None
-
AMD 2352
62G RAM
CentOS 5.4
Description
Slave database locks up. Pasting email thread so far.
On Sun, Mar 7, 2010 at 8:45 PM, Ianiv Schweber <ischweber@nowpublic.com> wrote:
iostat shows nothing much is happening, mostly 0s all the time.
top now shows
13084 mongodb 18 0 8656m 7.1g 6.8g T 0.0 11.3 460:31.78 mongod
I'm pretty sure status was S before running gdb. That scrolled out of screen's buffer so I can't double check anymore.
Would you like me to open a new ticket and paste the conversation so far?
Ianiv Schweber
Software Developer
NowPublic.com
Public Key: http://www.blogaholics.ca/ianivpubkey.asc
On 2010-03-07, at 5:26 PM, Eliot Horowitz wrote:
Just added you to commercial support on jira.
is anything happening in top? what about "iostat -x 2"
On Sun, Mar 7, 2010 at 8:16 PM, Ianiv Schweber <ischweber@nowpublic.com> wrote:
I don't have access to the commercial support section yet. I do have an account at jira.mongodb.org with username ianiv.
I can't even start the JS console. It prints this and then just sits there.
[root@mongo2 ~]# mongo scan_stats
MongoDB shell version: 1.3.3
url: scan_stats
connecting to: scan_stats
Likewise, the web console doesn't repsond.
ctrl-c on gdb does nothing. The output of the gdb session is
[root@mongo2 mongodb]# gdb bin/mongod --pid 13084
<snip gdb version info>
(no debugging symbols found)
Attaching to program: /usr/local/mongodb/bin/mongod, process 13084
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
[New Thread 0x2b3bb0eddc40 (LWP 13084)]
[New Thread 0x4425b940 (LWP 13091)]
[New Thread 0x4385a940 (LWP 13090)]
[New Thread 0x42e59940 (LWP 13089)]
Ianiv Schweber
Software Developer
NowPublic.com
Public Key: http://www.blogaholics.ca/ianivpubkey.asc
On 2010-03-07, at 4:49 PM, Eliot Horowitz wrote:
Can you connect with a shell and do db.currentOp()
Can you view the web console?
in gdb, what happens if you hit ctrl-c?
do you have access to the commercial support section of jira yet?
would be great if we could move discussion there so its easier to track.
On Sun, Mar 7, 2010 at 7:26 PM, Ianiv Schweber <ischweber@nowpublic.com> wrote:
I attached gdb to the running mongod but I cannot interrupt it to run the
commands.
[root@mongo2 ~]# mount
/dev/sda5 on / type ext3 (rw,noatime)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda3 on /tmp type ext3 (rw)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
[root@mongo2 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda5 807G 406G 359G 54% /
/dev/sda3 2.0G 36M 1.9G 2% /tmp
/dev/sda1 251M 22M 216M 10% /boot
tmpfs 32G 0 32G 0% /dev/shm
Ianiv Schweber
Software Developer
NowPublic.com
Public Key: http://www.blogaholics.ca/ianivpubkey.asc
On 2010-03-07, at 11:16 AM, Dwight Merriman wrote:
hi,
can you do the following with gdb:
$ gdb mongod --pid=<slave process mongod pid>
(gdb) bt
(gdb) info threads
then also can you run
mount
df -h
and email the above?
thanks
On Sun, Mar 7, 2010 at 1:28 PM, Ianiv Schweber <ischweber@nowpublic.com>
wrote:
Hi Dwight,
It looks like the slave locked up again, 1 hour 35 minutes ago. Last lines
in log:
Sun Mar 7 17:05:36 repl: applied 269 operations
Sun Mar 7 17:05:36 repl: end sync_pullOpLog syncedTo: Sun Mar 7 16:52:18
2010 4b93d9c2:26
Sun Mar 7 17:05:38 allocating new datafile
/usr/local/mongodb/data/scan_stats.8, filling with zeroes...
Just like before.
I'll keep it in this state for now in case it helps you figure out what is
going on. If not, let me know and I'll restart it.
Thanks,
Ianiv Schweber
Software Developer
NowPublic.com
Public Key: http://www.blogaholics.ca/ianivpubkey.asc
On 2010-03-04, at 1:26 PM, Narayan Newton wrote:
Ianiv will chime in here soon, but:
AMD 2352
62G RAM
CentOS 5.4
Mongo 1.3.3-64bit builds from mongodb.org
On Thu, Mar 4, 2010 at 1:18 PM, Dwight Merriman <dwight@10gen.com>
wrote:
ok.
what is your OS platform and rough versions of mongo?
On Thu, Mar 4, 2010 at 4:16 PM, Narayan Newton
<nnewton@tag1consulting.com>
wrote:
Nice to meet you Dwight,
We have had two issues recently, I'm currently investigating one (a
slave hung on filling a data file with 0s and appeared to deadlock)
and Ianiv has been working on the other, a corruption issue we have
had on multiple versions causing exceptions in the log.
We have "started over" with a new DB and the exceptions have not
recurred so far, however, we have kept the old data files around to
figure out what went wrong there. I've added Ianiv to the thread and
he can explain this issue in more detail.
Thanks!
-N