[SERVER-6806] [journal] warning assertion failure a <= 256*1024*1024 src/mongo/util/alignedbuilder.cpp 90 message in secondary servers logs Created: 20/Aug/12  Updated: 26/Feb/14  Resolved: 21/Aug/12

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 2.2.0-rc1
Fix Version/s: None

Type: Bug Priority: Minor - P4
Reporter: Vladimir Poluyaktov Assignee: Eric Milkie
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

All servers are EC2 hosts, Ubuntu 11.04 (GNU/Linux 2.6.38-13-virtual x86_64)
9 Shard hosts (m2.4xlarge), 3 Replica sets, 3 servers in each
3 config servers (m1.small)
2 mongos hosts (m1.medium)


Attachments: File mongodb.log-20120818.gz    
Issue Links:
Related
related to SERVER-6816 Improve journal data handling after m... Closed
Operating System: Linux
Participants:

 Description   

We found a lot of strange warning messages in the mongodb.log files on both of our secondary servers:

Sat Aug 11 00:18:30 [rsHealthPoll] warning assertion failure d.size() < 1024 src/mongo/util/concurrency/task.cpp 122

Tue Aug 14 22:35:52 [rsSync] local.oplog.rs warning assertion failure _intents.size() < 2000000 src/mongo/db/dur_commitjob.h 101

Fri Aug 17 17:38:46 [rsSync] local.oplog.rs warning assertion failure _intents.size() < 2000000 src/mongo/db/dur_commitjob.h 101

Fri Aug 17 17:38:53 [journal] warning assertion failure a <= 256*1024*1024 src/mongo/util/alignedbuilder.cpp 90

Here is the full stack trace from one such message:

Sat Aug 18 23:21:26 [conn17351] end connection 10.13.34.150:50228 (17 connections now open)
Sat Aug 18 23:21:26 [initandlisten] connection accepted from 10.13.34.150:44379 #17353 (18 connections now open)
Sat Aug 18 23:21:29 [journal] warning assertion failure a <= 256*1024*1024 src/mongo/util/alignedbuilder.cpp 90
0xb400d1 0x68127a 0x792f83 0x6d2635 0x6d2944 0x637bd0 0x6389f9 0x6390a4 0x5f18a9 0x7f6ecec3ed8c 0x7f6ecdfe0c2d
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xb400d1]
/usr/bin/mongod(_ZN5mongo9wassertedEPKcS1_j+0x11a) [0x68127a]
/usr/bin/mongod(_ZN5mongo14AlignedBuilder14growReallocateEj+0x63) [0x792f83]
/usr/bin/mongod() [0x6d2635]
/usr/bin/mongod(_ZN5mongo3dur13PREPLOGBUFFERERNS0_11JSectHeaderERNS_14AlignedBuilderE+0x214) [0x6d2944]
/usr/bin/mongod(_ZN5mongo3dur27groupCommitWithLimitedLocksEv+0xa0) [0x637bd0]
/usr/bin/mongod() [0x6389f9]
/usr/bin/mongod(_ZN5mongo3dur9durThreadEv+0x364) [0x6390a4]
/usr/bin/mongod() [0x5f18a9]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6d8c) [0x7f6ecec3ed8c]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f6ecdfe0c2d]
Sat Aug 18 23:21:31 [conn17352] end connection 10.8.233.42:53905 (17 connections now open)
Sat Aug 18 23:21:31 [initandlisten] connection accepted from 10.8.233.42:53907 #17354 (18 connections now open)

Full mongodb.log file attached to this issue.
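
For anyone triaging a similar log, the warning lines quoted in the description can be tallied by thread, asserted expression, and source location. The following is only a minimal sketch: the regular expression is inferred from the four sample lines in this ticket, the function name `tally_warnings` is hypothetical, and other mongod versions may format these lines differently.

```python
import re
from collections import Counter

# Pattern inferred from the sample lines in this ticket (an assumption,
# not a documented mongod log format).
WARNING_RE = re.compile(
    r"\[(?P<thread>[^\]]+)\]\s+"            # thread name, e.g. [journal]
    r"(?:(?P<ns>\S+)\s+)?"                  # optional namespace, e.g. local.oplog.rs
    r"warning assertion failure\s+"
    r"(?P<expr>.+?)\s+"                     # the asserted expression
    r"(?P<file>src/\S+)\s+(?P<line>\d+)\s*$"  # source file and line number
)


def tally_warnings(lines):
    """Count warning assertions per (thread, expression, file:line)."""
    counts = Counter()
    for text in lines:
        match = WARNING_RE.search(text)
        if match:
            location = f'{match["file"]}:{match["line"]}'
            counts[(match["thread"], match["expr"], location)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python tally_warnings.py < mongodb.log
    for key, count in tally_warnings(sys.stdin).most_common():
        print(count, *key)
```

Grouping by source location makes it easier to see whether the warnings come from one subsystem (here, the journal and replication sync threads) or from several.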



 Comments   
Comment by Eric Milkie [ 26/Feb/14 ]

Hi Mark.
I believe your problem is related to this ticket: SERVER-12876
We have a proposed code fix for the problem and it should be committed within a day or two. Thanks for reporting this!

Comment by Mark Callaghan [ 26/Feb/14 ]

I just got this crash from 2.5.5 while running compact on a collection that uses ~2G of space. Do you want a new report or details posted here?

Comment by Eric Milkie [ 21/Aug/12 ]

With the new multithreaded batch replication oplog application code, it's now possible to queue up a lot of work for the journaling subsystem. While this would be a bad thing on a primary node, it's less of a concern on a secondary, where nothing can be committed to the journal until an entire batch of operations is complete. You can safely ignore these messages, and they should be suppressed in a future release.
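
As a rough illustration of why these messages are warnings rather than failures, here is a toy model of a soft size limit: the check logs when a buffer grows past the bound in the logged expression (256*1024*1024 bytes, i.e. 256 MiB) but lets the work continue. This is a sketch under stated assumptions, not MongoDB's implementation; `ToyBuffer`, `grow_to`, and `SOFT_LIMIT_BYTES` are hypothetical names.

```python
import logging

# The bound from the logged expression "a <= 256*1024*1024": 268435456 bytes.
SOFT_LIMIT_BYTES = 256 * 1024 * 1024

log = logging.getLogger("toy_journal")


class ToyBuffer:
    """Hypothetical stand-in for a growable buffer with a soft size limit."""

    def __init__(self):
        self.capacity = 0
        self.warnings = 0

    def grow_to(self, wanted):
        # A warning assertion reports the violated condition and carries on;
        # it does not raise or abort the caller.
        if not wanted <= SOFT_LIMIT_BYTES:
            self.warnings += 1
            log.warning(
                "warning assertion failure a <= 256*1024*1024 (a=%d)", wanted
            )
        self.capacity = max(self.capacity, wanted)
        return self.capacity
```

In this model, a large batch queued on a secondary would trip the warning once the buffer passes the soft limit, while the batch itself still completes.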
