[SERVER-4728] Repeated crashes during normal server operations. Created: 19/Jan/12  Updated: 11/Jul/16  Resolved: 21/Jan/12

Status: Closed
Project: Core Server
Component/s: Stability
Affects Version/s: 2.0.2
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Change.org
Assignee: Bernie Hackett
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Gentoo (EngineYard) on AWS. 3 replicas and an arbiter.


Attachments: Text File crash.txt     Text File mongodb.log     Text File mongodb.log    
Operating System: Linux
Participants:

 Description   

I'm keeping one of my sites alive with monit to restart mongod. I'm getting SEGV failures every few hours since upgrading to 2.0.2 from 2.0.1.

Short term, I need to know if it is safe to downgrade back to 2.0.1.



 Comments   
Comment by Bernie Hackett [ 21/Jan/12 ]

Those warning messages are harmless. Good to hear that the regular binaries are working well. We're in touch with EY about the chef recipe.

I'm gonna go ahead and close this ticket. Don't hesitate to reopen if this happens again.

Comment by Change.org [ 20/Jan/12 ]

The new binaries seem stable.

Granted, now I'm getting tons of these in my primary:

Fri Jan 20 11:00:31 [conn26818] command petition-online-production.$cmd command: { count: "petitions", query: { deleted: false, close_date: null, disabled: null }, fields: null } ntoreturn:1 reslen:48 382ms
Fri Jan 20 11:00:31 [conn26479] xxx.signatures warning: cursor loc null does not match byLoc position 16:5d75c04c !
Fri Jan 20 11:00:32 [conn26479] xxx.signatures warning: cursor loc null does not match byLoc position 16:5d75f638 !
Fri Jan 20 11:00:33 [conn26479] xxx.signatures warning: cursor loc null does not match byLoc position 16:5d761bc0 !
Fri Jan 20 11:00:35 [conn26479] xxx.signatures warning: cursor loc 16:5d7647a0 does not match byLoc position 16:5d7645e0 !
Fri Jan 20 11:00:37 [conn26479] xxx.signatures warning: cursor loc null does not match byLoc position 16:5d766a98 !
Fri Jan 20 11:00:37 [conn26479] xxx.signatures warning: cursor loc null does not match byLoc position 16:5d767184 !
Fri Jan 20 11:00:38 [conn26479] xxx.signatures warning: cursor loc null does not match byLoc position 16:5d768568 !
Fri Jan 20 11:00:39 [conn26479] xxx.signatures warning: cursor loc null does not match byLoc position 16:5d76909c !
Fri Jan 20 11:00:40 [conn26479] xxx.signatures warning: cursor loc null does not match byLoc position 16:5d769ec4 !

Possibly unrelated, but troubling.
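
To get a sense of how widespread these warnings are, the log can be tallied per connection and namespace. The following is a minimal sketch, not an official tool: the regular expression is inferred only from the warning lines quoted above, and the function name `tally_byloc_warnings` is hypothetical. Other mongod versions may format this message differently.

```python
import re
from collections import Counter

# Pattern inferred from the warning lines quoted in this comment (assumption:
# the format is stable within this log; it is not taken from MongoDB docs).
WARNING_RE = re.compile(
    r"\[(?P<conn>conn\d+)\]\s+(?P<ns>\S+)\s+warning: cursor loc (?P<loc>\S+) "
    r"does not match byLoc position (?P<pos>\S+)"
)


def tally_byloc_warnings(lines):
    """Count byLoc-mismatch warnings per (connection, namespace) pair."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            counts[(match.group("conn"), match.group("ns"))] += 1
    return counts


# Two lines copied from the log excerpt above, used as sample input.
sample = [
    "Fri Jan 20 11:00:31 [conn26479] xxx.signatures warning: cursor loc null "
    "does not match byLoc position 16:5d75c04c !",
    "Fri Jan 20 11:00:35 [conn26479] xxx.signatures warning: cursor loc "
    "16:5d7647a0 does not match byLoc position 16:5d7645e0 !",
]

for (conn, namespace), count in tally_byloc_warnings(sample).items():
    print(conn, namespace, count)
```

To run it against a real log, pass an open file object instead of `sample`, for example `tally_byloc_warnings(open("mongodb.log"))`.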

Comment by Change.org [ 20/Jan/12 ]

Pull request issued for EY: https://github.com/engineyard/ey-cloud-recipes/pull/28

Comment by Change.org [ 20/Jan/12 ]

Looks like the non-static build is working well. It has been running on all 3 servers without fail for several hours now. If it survives the night, we may be able to close this. I'll let you know.

I ended up with the static versions because I started with an EY chef recipe:
https://github.com/engineyard/ey-cloud-recipes/blob/master/cookbooks/mongodb/attributes/recipe.rb
I've just been bumping version numbers. I'll write up a pull request for them, but the static builds in that recipe may also be affecting other EY customers that 10gen has. Just so you know...

Comment by Bernie Hackett [ 20/Jan/12 ]

@kyle, let us know if the regular builds work correctly or if you still have the same issues.

Comment by Eliot Horowitz (Inactive) [ 20/Jan/12 ]

Is there a reason you're using the legacy static builds?
On some systems those can cause problems.

Comment by Eliot Horowitz (Inactive) [ 19/Jan/12 ]

Can you send more of the log?

Comment by Eliot Horowitz (Inactive) [ 19/Jan/12 ]

Yes, it is safe to downgrade to 2.0.1.

Generated at Thu Feb 08 03:06:48 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.