[SERVER-3913] MongoDB 2.0 crashes on Windows 7 x64
Created: 20/Sep/11  Updated: 11/Jul/16  Resolved: 23/Sep/11

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Testing Infrastructure
Affects Version/s: 2.0.0
Fix Version/s: 2.0.1, 2.1.0

Type: Bug
Priority: Critical - P2
Reporter: Alexey Borzenkov
Assignee: Sridhar Nanjundeswaran
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows 7 x64


Attachments: File SERVER3913.js    
Issue Links:
Duplicate
is duplicated by SERVER-4503 Behave better in filesmd5 command whe... Closed
Related
Operating System: Windows
Participants:

 Description   

I have a large application that uses pymongo. When I tried the new MongoDB 2.0, I found that the MongoDB server crashes during tests (mostly complex integration/lifecycle tests with high concurrency). The crash is consistent in that it always happens during the tests, but not always at the same test. This happens on my development Windows 7 x64 machine (with the 64-bit MongoDB server). Unfortunately, I haven't had time to test it on Linux (which is used in production), and these crashes have halted the upgrade for now.

I finally managed to look at it a little closer yesterday, and here's what gets written on the console at the time of the crash:

Mon Sep 19 16:56:00 [initandlisten] connection accepted from 127.0.0.1:55254 #6
Mon Sep 19 16:56:00 [conn6] dropDatabase app-tests-db
Mon Sep 19 16:56:01 [conn6] removeJournalFiles
Mon Sep 19 16:56:01 [conn6] command app-tests-db.$cmd command:

{ dropDatabase: 1 }

ntoreturn:1 reslen:70 339ms
Mon Sep 19 16:56:01 [conn5] build index app-tests-db.fs.chunks

{ _id: 1 }

Mon Sep 19 16:56:01 [conn5] build index done 0 records 0.001 secs
Mon Sep 19 16:56:01 [conn5] warning: best guess query plan requested, but scan and order are required for all plans query:

{ files_id: ObjectId('4e773be038d36e1210000094') }

order:

{ files_id: 1, n: 1 }

choices:

{ $natural: 1 }

unhandled windows exMon Sep 19 16:56:01 access violation
Mon Sep 19 16:56:18 [initandlisten] connection accepted from 127.0.0.1:55258 #7

The warning is GridFS-related, but I'm not sure how relevant it is. Unfortunately, the minidump doesn't really help: I don't have .pdb files for the server, and the only exported symbols are pcre-related, so the stack trace is pretty useless:

0:011> .ecxr
rax=0000000003bfde60 rbx=000000000044eba0 rcx=0000000000000000
rdx=00000000004b2640 rsi=0000000000000000 rdi=00000000004a48b0
rip=000000013f8e1871 rsp=0000000003bfdd70 rbp=0000000003bfde70
r8=00000000007a9240 r9=00000000004b2640 r10=0000000003bfdd78
r11=00000000004b2640 r12=0000000000000000 r13=ffffffffffffffff
r14=0000000000000010 r15=0000000003bfe0d0
iopl=0 nv up ei pl nz na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010206

*** WARNING: Unable to verify checksum for mongod.exe
*** ERROR: Symbol file could not be found. Defaulted to export symbols for mongod.exe -
mongod!pcrecpp::Arg::parse_ulonglong_cradix+0x74a71:
00000001`3f8e1871 488b01 mov rax,qword ptr [rcx] ds:00000000`00000000=????????????????
[...]
0:011> !analyze -v
[...]
STACK_TEXT:
00000000`03bfdd70 00000001`3f9c66d7 : 00000000`00000004 00000000`00000004 00000000`03bfdfb8 00000000`03bfe6e0 : mongod!pcrecpp::Arg::parse_ulonglong_cradix+0x74a71
00000000`03bfdee0 00000001`3f9cd397 : 00000000`00000001 00000000`00000001 00000000`03bfe6e0 00000000`03bfecb0 : mongod!pcrecpp::Arg::parse_ulonglong_cradix+0x1598d7
00000000`03bfe2e0 00000001`3f9cd913 : 00000000`004ad640 00000000`004b45a5 00000000`03bfe750 00000000`004b6fa0 : mongod!pcrecpp::Arg::parse_ulonglong_cradix+0x160597
00000000`03bfe5e0 00000001`3f9a669e : 00000000`00000204 00000000`004b3100 00000000`00000000 00000001`3fa615bf : mongod!pcrecpp::Arg::parse_ulonglong_cradix+0x160b13
00000000`03bfe800 00000001`3f9aa7f8 : 00000000`03bfecb0 00000000`00000000 00000000`004a95c0 00000000`03bfea98 : mongod!pcrecpp::Arg::parse_ulonglong_cradix+0x13989e
00000000`03bfe890 00000001`3f94f6d3 : 00000000`03bff250 ffffffff`fffffffe 000077f4`f8e4395f 00000000`007a0230 : mongod!pcrecpp::Arg::parse_ulonglong_cradix+0x13d9f8
00000000`03bff1d0 00000001`3f94fbb2 : 00000000`00000000 00000000`03bff769 00000000`03bff708 0000037f`00000003 : mongod!pcrecpp::Arg::parse_ulonglong_cradix+0xe28d3
00000000`03bff400 00000001`3fa20990 : 00000000`00000000 00000000`004b4570 00000000`004a3e20 00000000`00000000 : mongod!pcrecpp::Arg::parse_ulonglong_cradix+0xe2db2
00000000`03bff6d0 00000001`3f87e6e8 : 00000000`004a3e20 00000000`03bff848 00000000`004a3e20 00000000`00000000 : mongod!pcrecpp::Arg::parse_ulonglong_cradix+0x1b3b90
00000000`03bff7d0 00000001`3fa4d5d1 : 00000000`00362320 00000000`00362320 00000000`00000000 00000000`00000000 : mongod!pcrecpp::Arg::parse_ulonglong_cradix+0x118e8
00000000`03bff9c0 00000001`3fa6e5eb : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : mongod!pcrecpp::Arg::parse_ulonglong_cradix+0x1e07d1
00000000`03bffa00 00000001`3fa6e67f : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : mongod!pcrecpp::Arg::parse_ulonglong_cradix+0x2017eb
00000000`03bffa30 00000000`7757652d : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : mongod!pcrecpp::Arg::parse_ulonglong_cradix+0x20187f
00000000`03bffa60 00000000`77b1c521 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : kernel32!BaseThreadInitThunk+0xd
00000000`03bffa90 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x1d

Is there a way to get pdb files for the build, or maybe some other way to help you figure out what's crashing so hard?



 Comments   
Comment by auto [ 23/Sep/11 ]

Author: Eliot Horowitz (erh) <eliot@10gen.com>

Message: fix crash in filemd5 without correct index SERVER-3913
Branch: v2.0
https://github.com/mongodb/mongo/commit/2250ff90c53e486a21ee1f3f2a9de71fdd7743ca

Comment by auto [ 23/Sep/11 ]

Author: Eliot Horowitz (erh) <eliot@10gen.com>

Message: fix crash in filemd5 without correct index SERVER-3913
Branch: master
https://github.com/mongodb/mongo/commit/68773c1816e297500c0727ea8037e5377550d171

Comment by Sridhar Nanjundeswaran [ 22/Sep/11 ]

When the attached script is executed against 1.8.3 it throws the error "best guess plan requested, but scan and order required: query:

{ files_id: 1.0 }

order:

{ files_id: 1, n: 1 }

choices:

{ $natural: 1 }

". The database does not crash.
When run against 2.0 it crashes the server.
Note: If the index creation is uncommented, it no longer crashes in either version.
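
Since the crash only appears when the index is missing, a test suite could check for it before touching GridFS. The following is a minimal sketch, not taken from the attached script: the helper name `has_chunks_index` is hypothetical, and the input shape is assumed to match the dict returned by pymongo's `Collection.index_information()`. The key pattern is the one shown in the "order:" clause of the warning above.

```python
# Key pattern from the warning's "order:" clause: { files_id: 1, n: 1 }.
REQUIRED_KEY = [("files_id", 1), ("n", 1)]


def has_chunks_index(index_info):
    """Return True if any index in index_info has exactly REQUIRED_KEY.

    index_info is assumed to be shaped like the result of pymongo's
    Collection.index_information(): {name: {"key": [(field, direction), ...]}}.
    """
    for spec in index_info.values():
        # Convert each pair to a tuple so lists and tuples compare equally.
        key = [tuple(pair) for pair in spec.get("key", [])]
        if key == REQUIRED_KEY:
            return True
    return False


# Hypothetical state right after dropDatabase: only the _id index exists.
after_drop = {"_id_": {"key": [("_id", 1)]}}

# Hypothetical state once the compound index has been created.
with_index = {
    "_id_": {"key": [("_id", 1)]},
    "files_id_1_n_1": {"key": [("files_id", 1), ("n", 1)]},
}

print(has_chunks_index(after_drop))  # False
print(has_chunks_index(with_index))  # True
```

Against a live server, the input would presumably come from `db.fs.chunks.index_information()`, and a missing index could be created with `db.fs.chunks.create_index([("files_id", 1), ("n", 1)])`. Both calls are assumptions about how the reporter's setup would use pymongo, not something confirmed in this ticket.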

Comment by Sridhar Nanjundeswaran [ 20/Sep/11 ]

Would you be able to send us the test suite you are using to cause this? Also, what options are you using to start mongod in your local test environment? Finally, what version of pymongo are you using?
