[SERVER-36474] Cannot initiate a replica set if free monitoring is disabled at command-line Created: 06/Aug/18  Updated: 29/Oct/23  Resolved: 27/Aug/18

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.0.0
Fix Version/s: 4.0.3, 4.1.3

Type: Bug Priority: Major - P3
Reporter: Mark Benvenuto Assignee: Mark Benvenuto
Resolution: Fixed Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Duplicate
is duplicated by SERVER-36578 Segmentation fault when disabling mon... Closed
is duplicated by SERVER-36823 SeqFault on ReplicaSet Vanilla Setup Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.0
Sprint: Platforms 2018-08-27
Participants:

 Description   

If a node is started with "--replSet free --enableFreeMonitoring=off" and rs.initiate() is then called, mongod crashes.



 Comments   
Comment by Githook User [ 29/Aug/18 ]

Author:

{'name': 'Mark Benvenuto', 'email': 'mark.benvenuto@mongodb.com', 'username': 'markbenvenuto'}

Message: SERVER-36474 Cannot initiate a replica set if free monitoring is disabled at command-line
Branch: v4.0
https://github.com/mongodb/mongo/commit/7cc66becc499651ac933ffbd57dd892c9fd96ffe

Comment by Githook User [ 27/Aug/18 ]

Author:

{'name': 'Mark Benvenuto', 'email': 'mark.benvenuto@mongodb.com', 'username': 'markbenvenuto'}

Message: SERVER-36474 Cannot initiate a replica set if free monitoring is disabled at command-line
Branch: master
https://github.com/mongodb/mongo/commit/402fa5cc4a5f74cd7e592d9273a94e7cf25446b6

Comment by Mitar [ 10/Aug/18 ]

Ben, I am glad that we are experiencing this issue at the same time. (I missed this issue and reported a duplicate: https://jira.mongodb.org/browse/SERVER-36578)

Comment by Ben Newman [ 09/Aug/18 ]

I'm running these commands on OSX/darwin, by the way.

On Linux, the same command gives

Error parsing command line: unrecognised option '--enableFreeMonitoring'
try '/home/ben/meteor/dev_bundle/mongodb/bin/mongod --help' for more information

This is despite it being the same version of mongod:

$ mongod --version
db version v4.0.0
git version: 3b07af3d4f471ae89e8186d33bbb1d5259597d51
allocator: tcmalloc
modules: none
build environment:
    distarch: x86_64
    target_arch: x86_64

Something tells me this free monitoring feature wasn't fully tested before it was included in the 4.0 release.

Looking forward to the next update.

Comment by Ben Newman [ 09/Aug/18 ]

I tried running the command in lldb, and it seems to be crashing on this line:

https://github.com/mongodb/mongo/blob/2c752e43b73692c70157226e1e62ae16fb2491ec/src/mongo/db/free_mon/free_mon_controller.cpp#L130

void FreeMonController::_enqueue(std::shared_ptr<FreeMonMessage> msg) {
    {
        stdx::lock_guard<stdx::mutex> lock(_mutex); // <== here
        invariant(_state == State::kStarted);
    }
 
    _processor->enqueue(std::move(msg));
}
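
One plausible reading of the crash, not confirmed anywhere in this ticket, is that the controller pointer is null when free monitoring is disabled, so taking the lock on _mutex dereferences a small offset from address zero (which would be consistent with the "Invalid access at address: 0x8" line in the log). The sketch below illustrates that failure mode and a null guard in the caller. Controller and notifyIfEnabled are hypothetical stand-ins written for illustration; they are not the real FreeMonController API and not necessarily how the fix was implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-in for a monitoring controller (not the real class).
class Controller {
public:
    void start() {
        std::lock_guard<std::mutex> lock(_mutex);
        _started = true;
    }

    // Analogue of the transition-to-primary hook: it just queues a message.
    void notifyOnTransitionToPrimary() {
        enqueue("OnTransitionToPrimary");
    }

    std::size_t pending() {
        std::lock_guard<std::mutex> lock(_mutex);
        return _queue.size();
    }

private:
    void enqueue(std::string msg) {
        // Locking _mutex is the first access through `this`; with a null
        // controller this would be the faulting instruction.
        std::lock_guard<std::mutex> lock(_mutex);
        assert(_started);  // stand-in for the invariant in the snippet above
        _queue.push(std::move(msg));
    }

    std::mutex _mutex;
    bool _started = false;
    std::queue<std::string> _queue;
};

// Hypothetical guarded caller: skips the notification when no controller
// exists (for example, because monitoring was disabled at startup).
// Returns true if the notification was delivered.
bool notifyIfEnabled(Controller* controller) {
    if (controller == nullptr) {
        return false;
    }
    controller->notifyOnTransitionToPrimary();
    return true;
}
```

With this guard, a caller holding a null pointer gets false back instead of a segmentation fault, while a started controller receives the message as before.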

 

 

Comment by Ben Newman [ 09/Aug/18 ]

Also seeing this behavior with Mongo 4.0.0:

mongod --bind_ip 127.0.0.1 --port 3001 --dbpath /Users/ben/dev/mongo4-test/.meteor/local/db --oplogSize 8 --replSet meteor --noauth

works as usual, but

mongod --bind_ip 127.0.0.1 --port 3001 --dbpath /Users/ben/dev/mongo4-test/.meteor/local/db --oplogSize 8 --replSet meteor --noauth --enableFreeMonitoring off

crashes with a segmentation fault:

2018-08-09T14:29:01.726-0400 F -        [rsSync-0] Invalid access at address: 0x8
2018-08-09T14:29:01.735-0400 F -        [rsSync-0] Got signal: 11 (Segmentation fault: 11).
 0x102c057f9 0x102c0528a 0x7fff79664f5a 0x7fff7951675a 0x1018a5f5d 0x1018a65d8 0x1017473cf 0x101746ddf 0x10176d63f 0x1017fb25d 0x1017fa8cb 0x1017f6c9d 0x1025ab109 0x1025b0cb6 0x10252af2d 0x10252a565 0x10252a1ed 0x10252bb57 0x7fff7966e661 0x7fff7966e50d 0x7fff7966dbf9
----- BEGIN BACKTRACE -----
{"backtrace":[...]}
mongod(_ZN5mongo15printStackTraceERNSt3__113basic_ostreamIcNS0_11char_traitsIcEEEE+0x39) [0x102c057f9]
mongod(_ZN5mongo12_GLOBAL__N_124abruptQuitWithAddrSignalEiP9__siginfoPv+0x12A) [0x102c0528a]
libsystem_platform.dylib(_sigtramp+0x1A) [0x7fff79664f5a]
libsystem_malloc.dylib(tiny_free_no_lock+0x23A) [0x7fff7951675a]
mongod(_ZN5mongo17FreeMonController8_enqueueENSt3__110shared_ptrINS_14FreeMonMessageEEE+0x2D) [0x1018a5f5d]
mongod(_ZN5mongo17FreeMonController27notifyOnTransitionToPrimaryEv+0x78) [0x1018a65d8]
mongod(_ZN5mongo4repl39ReplicationCoordinatorExternalStateImpl34_shardingOnTransitionToPrimaryHookEPNS_16OperationContextE+0x50F) [0x1017473cf]
mongod(_ZN5mongo4repl39ReplicationCoordinatorExternalStateImpl21onTransitionToPrimaryEPNS_16OperationContextEb+0x19F) [0x101746ddf]
mongod(_ZN5mongo4repl26ReplicationCoordinatorImpl19signalDrainCompleteEPNS_16OperationContextEx+0x14F) [0x10176d63f]
mongod(_ZN5mongo4repl8SyncTail17_oplogApplicationEPNS0_11OplogBufferEPNS0_22ReplicationCoordinatorEPNS1_14OpQueueBatcherE+0x93D) [0x1017fb25d]
mongod(_ZN5mongo4repl8SyncTail16oplogApplicationEPNS0_11OplogBufferEPNS0_22ReplicationCoordinatorE+0x9B) [0x1017fa8cb]
mongod(_ZNSt3__110__function6__funcIZN5mongo4repl12OplogApplier7startupEvE3$_0NS_9allocatorIS5_EEFvRKNS2_8executor12TaskExecutor12CallbackArgsEEEclESC_+0xAD) [0x1017f6c9d]
mongod(_ZN5mongo8executor22ThreadPoolTaskExecutor11runCallbackENSt3__110shared_ptrINS1_13CallbackStateEEE+0x159) [0x1025ab109]
mongod(_ZNSt3__110__function6__funcIZN5mongo8executor22ThreadPoolTaskExecutor23scheduleIntoPool_inlockEPNS_4listINS_10shared_ptrINS4_13CallbackStateEEENS_9allocatorIS8_EEEERKNS_15__list_iteratorIS8_PvEESH_NS_11unique_lockINS_5mutexEEEE3$_5NS9_ISL_EEFvvEEclEv+0x46) [0x1025b0cb6]
mongod(_ZN5mongo10ThreadPool10_doOneTaskEPNSt3__111unique_lockINS1_5mutexEEE+0x24D) [0x10252af2d]
mongod(_ZN5mongo10ThreadPool13_consumeTasksEv+0x75) [0x10252a565]
mongod(_ZN5mongo10ThreadPool17_workerThreadBodyEPS0_RKNSt3__112basic_stringIcNS2_11char_traitsIcEENS2_9allocatorIcEEEE+0x13D) [0x10252a1ed]
mongod(_ZNSt3__114__thread_proxyINS_5tupleIJZN5mongo10ThreadPool25_startWorkerThread_inlockEvE3$_2EEEEEPvS6_+0x67) [0x10252bb57]
libsystem_pthread.dylib(_pthread_body+0x154) [0x7fff7966e661]
libsystem_pthread.dylib(_pthread_body+0x0) [0x7fff7966e50d]
libsystem_pthread.dylib(thread_start+0xD) [0x7fff7966dbf9]
-----  END BACKTRACE  -----
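
The frames in the backtrace above are Itanium-ABI mangled names, which are hard to read directly. A minimal sketch of demangling them in C++ follows, assuming a GCC or Clang toolchain that ships <cxxabi.h>; the helper name demangle is made up for this example.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <memory>
#include <string>

// Returns the demangled form of an Itanium-ABI symbol name, or the input
// unchanged if it cannot be demangled.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // __cxa_demangle allocates the result with malloc, so release it with free.
    std::unique_ptr<char, decltype(&std::free)> buffer(
        abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status),
        &std::free);
    if (status != 0 || !buffer) {
        return mangled;
    }
    return std::string(buffer.get());
}
```

For example, passing the symbol from the FreeMonController frame, _ZN5mongo17FreeMonController27notifyOnTransitionToPrimaryEv, yields mongo::FreeMonController::notifyOnTransitionToPrimary(), which makes the call chain from the replication coordinator into the free monitoring controller easier to follow.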

 
