The recipient side of a chunk migration does not always confirm that it is running on a replica set primary before it writes to replicated collections. If the recipient node steps down from PRIMARY to SECONDARY at certain points during a migration, it can still attempt those writes. This is known to happen during step 1 of MigrateStatus::_go ("copy indexes"), but the entire method and its callees need to be audited.
2015-01-29T02:35:03.182-0500 [conn1905205] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [10.218.0.30:35589]
2015-01-29T02:35:03.161-0500 [conn1927590] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [10.90.0.150:57092]
something" } ninserted:0 keyUpdates:0 exception: Not primary while writing to XXX.WORK_ERROR_LOG code:10107 numYields:0 locks(micros) w:11 4507960ms
2015-01-29T02:35:03.424-0500 [conn1910349] command XXX.$cmd command: update { $msg: "query not recording (too large)" } ntoreturn:1 keyUpdates:0 numYields:0
2015-01-29T02:35:03.682-0500 [conn1906748] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [10.90.0.150:47380]
2exception: Not primary while updating XXX.WORK code:10107 numYields:0 locks(micros) w:2029 4288792ms
2015-01-29T02:35:03.688-0500 [conn1906382] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [10.218.0.30:36001]
2015-01-29T02:35:03.688-0500 [conn1928623] command XXX.$cmd command: update { $msg: "query not recording (too large)" } ntoreturn:1 keyUpdates:0 numYields:0 reslen:194 4542532ms
2015-01-29T02:35:03.688-0500 [conn1907169] command XXX.$cmd command: insert { $msg: "query not recording (too large)" } ntoreturn:1 keyUpdates:0 numYields:0 reslen:191 19015097ms
2015-01-29T02:35:03.520-0500 [conn1914086] command XXX.$cmd command: update { $msg: "query not recording (too large)" } ntoreturn:1 keyUpdates:0 numYields:0 reslen:194 17432731ms
2015-01-29T02:35:03.689-0500 [conn1928626] command XXX.$cmd command: insert { $msg: "query not recording (too large)" } ntoreturn:1 keyUpdates:0 numYields:0 reslen:191 4542245ms
2015-01-29T02:35:03.692-0500 [conn1923358] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [10.90.0.150:55250]
2015-01-29T02:35:03.692-0500 [conn1904900] command XXX.$cmd command: insert { $msg: "query not recording (too large)" } ntoreturn:1 keyUpdates:0 numYields:0 reslen:191 17324862ms
2015-01-29T02:35:06.214-0500 [migrateThread] SEVERE: Got signal: 6 (Aborted).
Backtrace:0x122d881 0x122cc5e 0x3f86c302d0 0x3f86c30265 0x3f86c31d10 0x11abc9a 0xea71ea 0xea286e 0x1023e10 0x1029888 0x1015a69 0x12724d9 0x3f87c0683d 0x3f86cd526d
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x122d881]
/usr/bin/mongod [0x122cc5e]
/lib64/libc.so.6 [0x3f86c302d0]
/lib64/libc.so.6(gsignal+0x35) [0x3f86c30265]
/lib64/libc.so.6(abort+0x110) [0x3f86c31d10]
/usr/bin/mongod(_ZN5mongo13fassertFailedEi+0x13a) [0x11abc9a]
/usr/bin/mongod [0xea71ea]
/usr/bin/mongod(_ZN5mongo5logOpEPKcS1_RKNS_7BSONObjEPS2_PbbPS3_+0xee) [0xea286e]
/usr/bin/mongod(_ZN5mongo13MigrateStatus3_goEv+0x9f0) [0x1023e10]
/usr/bin/mongod(_ZN5mongo13MigrateStatus2goEv+0x28) [0x1029888]
/usr/bin/mongod(_ZN5mongo13migrateThreadEv+0x59) [0x1015a69]
/usr/bin/mongod [0x12724d9]
/lib64/libpthread.so.0 [0x3f87c0683d]
/lib64/libc.so.6(clone+0x6d) [0x3f86cd526d]
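The backtrace shows MigrateStatus::_go reaching logOp and then fassertFailed on the migrate thread. As a rough illustration of the missing check described above, the following sketch re-confirms primary state before each migration step and aborts with an error instead of reaching the write path. It is not mongod code: ReplicaNode, MigrateStep, MigrateResult, and runMigration are hypothetical names invented for this example, and the in-memory oplog vector merely stands in for replicated writes.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are mongod's real API.
enum class MemberState { Primary, Secondary };

struct ReplicaNode {
    std::atomic<MemberState> state{MemberState::Primary};
    std::vector<std::string> oplog;  // stands in for replicated writes

    bool isPrimary() const { return state.load() == MemberState::Primary; }
};

enum class MigrateResult { Done, AbortedNotPrimary };

// One migration step: a label plus the replicated write it performs.
struct MigrateStep {
    std::string name;
    std::function<void(ReplicaNode&)> write;
};

// Runs each step only after re-confirming primary state. A stepdown between
// steps ends the migration with an error result rather than letting the
// write path run on a non-primary node.
MigrateResult runMigration(ReplicaNode& node, const std::vector<MigrateStep>& steps) {
    for (const auto& step : steps) {
        assert(step.write && "every step must carry a write action");
        if (!node.isPrimary()) {
            return MigrateResult::AbortedNotPrimary;
        }
        step.write(node);
    }
    return MigrateResult::Done;
}
```

In this sketch, a node that becomes SECONDARY after its "copy indexes" step has its next step skipped, and the caller receives AbortedNotPrimary. A check by itself is presumably still racy in a real server unless the check and the write happen under a lock that excludes stepdown; the sketch deliberately ignores that.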
- is related to: SERVER-17150 logOp fassert when dropping collection on stepped-down primary (Closed)