[SERVER-11034] Provide better message on wassert(d.size() < 1024) Created: 03/Oct/13  Updated: 12/Nov/14  Resolved: 17/Oct/14

Status: Closed
Project: Core Server
Component/s: Internal Code, Replication
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor - P4
Reporter: Scott Hernandez (Inactive) Assignee: Scott Hernandez (Inactive)
Resolution: Duplicate Votes: 2
Labels: elections
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-14561 Implement Heartbeat callback and sche... Closed
is duplicated by SERVER-12105 Make logging of "warning assertion fa... Closed
Related
is related to DOCS-928 Documentation around assertion failur... Closed
Backwards Compatibility: Fully Compatible
Participants:

 Description   

This warning is not very helpful when message processing backs up. It occurs in ReplSetHealthPollTask::up -> Manager::msgCheckNewState.

 [rsHealthPoll]   warning assertion failure d.size() < 1024 src/mongo/util/concurrency/task.cpp 122
0xaf8c41 0xabe07a 0xac681b 0x94dd6a 0x950553 0xac65ce 0xac1cfe 0xac3344 0xb3ec79 0x35062077e1 0x3505ae18ed 
 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xaf8c41]
 /usr/bin/mongod(_ZN5mongo9wassertedEPKcS1_j+0x11a) [0xabe07a]
 /usr/bin/mongod(_ZN5mongo4task6Server4sendEN5boost8functionIFvvEEE+0x19b) [0xac681b]
 /usr/bin/mongod(_ZN5mongo21ReplSetHealthPollTask2upERKNS_7BSONObjERNS_13HeartbeatInfoE+0xada) [0x94dd6a]
 /usr/bin/mongod(_ZN5mongo21ReplSetHealthPollTask6doWorkEv+0xd3) [0x950553]
 /usr/bin/mongod(_ZN5mongo4task4Task3runEv+0x1e) [0xac65ce]
 /usr/bin/mongod(_ZN5mongo13BackgroundJob7jobBodyEN5boost10shared_ptrINS0_9JobStatusEEE+0xbe) [0xac1cfe]
 /usr/bin/mongod(_ZN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf1IvN5mongo13BackgroundJobENS_10shared_ptrINS7_9JobStatusEEEEENS2_5list2INS2_5valueIPS7_EENSD_ISA_EEEEEEE3runEv+0x74) [0xac3344]
 /usr/bin/mongod() [0xb3ec79]
 /lib64/libpthread.so.0() [0x35062077e1]
 /lib64/libc.so.6(clone+0x6d) [0x3505ae18ed]
Wed Dec 19 14:26:58 [rsHealthPoll] rate limiting wassert
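
As an illustration of the kind of message this ticket asks for, here is a minimal sketch of a queue whose send() reports its name and current depth once the condition d.size() < 1024 would fail. It is not the actual task::Server code: BoundedTaskQueue, kWarnDepth, and the message wording are hypothetical names chosen for this example, and the warning is returned as a string rather than written to the server log.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical threshold; 1024 mirrors the value in the logged assertion.
constexpr std::size_t kWarnDepth = 1024;

// Hypothetical stand-in for a task server's queue of pending messages.
class BoundedTaskQueue {
public:
    explicit BoundedTaskQueue(std::string name) : _name(std::move(name)) {}

    // Enqueues the task, then checks the queue depth. Returns a descriptive
    // warning if the depth has reached kWarnDepth, otherwise an empty string.
    std::string send(std::function<void()> task) {
        assert(task && "task must be callable");
        std::lock_guard<std::mutex> lk(_mutex);
        _pending.push_back(std::move(task));
        if (_pending.size() < kWarnDepth) {
            return std::string();
        }
        // Say which queue is affected and how far behind it is, instead of
        // printing only the failed expression.
        std::ostringstream os;
        os << "task queue '" << _name << "' is backed up: " << _pending.size()
           << " messages pending (warning threshold " << kWarnDepth << ")";
        return os.str();
    }

    // Number of messages currently waiting to be processed.
    std::size_t size() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _pending.size();
    }

private:
    std::string _name;
    mutable std::mutex _mutex;
    std::deque<std::function<void()>> _pending;
};
```

With a queue constructed as BoundedTaskQueue q("rsHealthPoll"), send() returns an empty string until 1024 messages are pending; from then on each call returns a message naming the queue and its depth. A real implementation would also need rate limiting, as the "rate limiting wassert" line above suggests.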



 Comments   
Comment by Scott Hernandez (Inactive) [ 17/Oct/14 ]

Since the replication refactoring, there is no longer a queue and heartbeats no longer block on write operations, so this error is gone.

Comment by Jordan Appleson [ 18/Sep/14 ]

Is this issue related to the following? I'm unsure whether I should open a separate JIRA bug. I've been seeing this on our secondary nodes while resyncing a couple of hundred gigabytes.

2014-09-18T01:08:28.478+0100 [conn109] warning assertion failure d.size() < 1024 src/mongo/util/concurrency/task.cpp 123
2014-09-18T01:08:28.486+0100 [conn109] 0x11e6111 0x1187e49 0x116be5f 0x1175d4b 0xe2e448 0xa2889a 0xa29ce2 0xa2bea6 0xd5dd6d 0xb9fe62 0xba1440 0x770aef 0x119bf3e 0x7f8538da8df3 0x7f85380af01d
 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x11e6111]
 /usr/bin/mongod(_ZN5mongo10logContextEPKc+0x159) [0x1187e49]
 /usr/bin/mongod(_ZN5mongo9wassertedEPKcS1_j+0x17f) [0x116be5f]
 /usr/bin/mongod(_ZN5mongo4task6Server4sendEN5boost8functionIFvvEEE+0x19b) [0x1175d4b]
 /usr/bin/mongod(_ZN5mongo19CmdReplSetHeartbeat3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x1918) [0xe2e448]
 /usr/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x3a) [0xa2889a]
 /usr/bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x1042) [0xa29ce2]
 /usr/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x6c6) [0xa2bea6]
 /usr/bin/mongod(_ZN5mongo11newRunQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x22ed) [0xd5dd6d]
 /usr/bin/mongod() [0xb9fe62]
 /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x580) [0xba1440]
 /usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x9f) [0x770aef]
 /usr/bin/mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x4ee) [0x119bf3e]
 /lib64/libpthread.so.0(+0x7df3) [0x7f8538da8df3]
 /lib64/libc.so.6(clone+0x6d) [0x7f85380af01d]
2014-09-18T01:08:30.002+0100 [IndexRebuilder]           Index Build: 102290900/254526079        40%
2014-09-18T01:08:33.001+0100 [IndexRebuilder]           Index Build: 102830400/254526079        40%
2014-09-18T01:08:34.490+0100 [conn109] warning assertion failure d.size() < 1024 src/mongo/util/concurrency/task.cpp 123
2014-09-18T01:08:34.498+0100 [conn109] 0x11e6111 0x1187e49 0x116be5f 0x1175d4b 0xe2e448 0xa2889a 0xa29ce2 0xa2bea6 0xd5dd6d 0xb9fe62 0xba1440 0x770aef 0x119bf3e 0x7f8538da8df3 0x7f85380af01d
 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x11e6111]
 /usr/bin/mongod(_ZN5mongo10logContextEPKc+0x159) [0x1187e49]
 /usr/bin/mongod(_ZN5mongo9wassertedEPKcS1_j+0x17f) [0x116be5f]
 /usr/bin/mongod(_ZN5mongo4task6Server4sendEN5boost8functionIFvvEEE+0x19b) [0x1175d4b]
 /usr/bin/mongod(_ZN5mongo19CmdReplSetHeartbeat3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x1918) [0xe2e448]
 /usr/bin/mongod(_ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x3a) [0xa2889a]
 /usr/bin/mongod(_ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0x1042) [0xa29ce2]
 /usr/bin/mongod(_ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x6c6) [0xa2bea6]
 /usr/bin/mongod(_ZN5mongo11newRunQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_+0x22ed) [0xd5dd6d]
 /usr/bin/mongod() [0xb9fe62]
 /usr/bin/mongod(_ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x580) [0xba1440]
 /usr/bin/mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x9f) [0x770aef]
 /usr/bin/mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x4ee) [0x119bf3e]
 /lib64/libpthread.so.0(+0x7df3) [0x7f8538da8df3]
 /lib64/libc.so.6(clone+0x6d) [0x7f85380af01d]

Comment by Eric Milkie [ 14/Aug/14 ]

Rearchitecture is coming with the replication refactor; should solve this.

Comment by Eric Milkie [ 04/Oct/13 ]

Ideally, we'd rearchitect so this never happens.
