[SERVER-17163] Fatal error "logOp but not primary" in MigrateStatus::go Created: 03/Feb/15  Updated: 26/Feb/15  Resolved: 11/Feb/15

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.6.1
Fix Version/s: 2.6.8

Type: Bug Priority: Major - P3
Reporter: Alex Lerner Assignee: Benety Goh
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-17150 logOp fassert when dropping collectio... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

The recipient side of a chunk migration sometimes fails to confirm that it is running on a replica set primary before writing to replicated collections. This can happen if the recipient node steps down from PRIMARY to SECONDARY at certain key points during the migration. It is known to happen at least during step 1 of MigrateStatus::_go, "copy indexes," but the entire method and its callees need to be audited.
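To illustrate the kind of check the description says is missing, here is a minimal sketch that re-checks primary status before each write step of a migration. Every name in it (MemberState, MigrationRecipient, NotPrimaryError, addStep, run) is hypothetical and invented for this example; it is not MongoDB's code and not necessarily how the fix was implemented.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the node's replica set state (not MongoDB's API).
enum class MemberState { Primary, Secondary };

// Hypothetical error raised when a step would run on a non-primary node.
struct NotPrimaryError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical recipient-side migration driver.
class MigrationRecipient {
public:
    explicit MigrationRecipient(const std::atomic<MemberState>& state)
        : _state(state) {}

    // Registers a named step, for example "copy indexes".
    void addStep(std::string name, std::function<void()> work) {
        assert(work && "a step needs a callable body");
        _steps.push_back({std::move(name), std::move(work)});
    }

    // Runs the steps in order and re-checks primary status before each one.
    // Returns the names of the completed steps; throws NotPrimaryError if the
    // node is no longer primary when a step is about to start.
    std::vector<std::string> run() {
        std::vector<std::string> done;
        for (auto& step : _steps) {
            ensurePrimary(step.name);
            step.work();
            done.push_back(step.name);
        }
        return done;
    }

private:
    struct Step {
        std::string name;
        std::function<void()> work;
    };

    void ensurePrimary(const std::string& stepName) const {
        if (_state.load() != MemberState::Primary) {
            throw NotPrimaryError("not primary before step: " + stepName);
        }
    }

    const std::atomic<MemberState>& _state;
    std::vector<Step> _steps;
};
```

With this shape, a step-down that happens during "copy indexes" stops the migration with a recoverable error before the next step writes anything. The sketch does not close the window between the check and the write inside a single step; a real implementation would presumably need to perform the check under the same lock that protects the write.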

2015-01-29T02:35:03.182-0500 [conn1905205] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [10.218.0.30:35589] 
2015-01-29T02:35:03.161-0500 [conn1927590] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [10.90.0.150:57092] 
something" } ninserted:0 keyUpdates:0 exception: Not primary while writing to XXX.WORK_ERROR_LOG code:10107 numYields:0 locks(micros) w:11 4507960ms
2015-01-29T02:35:03.424-0500 [conn1910349] command XXX.$cmd command: update { $msg: "query not recording (too large)" } ntoreturn:1 keyUpdates:0 numYields:0  
2015-01-29T02:35:03.682-0500 [conn1906748] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [10.90.0.150:47380] 
 
2exception: Not primary while updating XXX.WORK code:10107 numYields:0 locks(micros) w:2029 4288792ms
2015-01-29T02:35:03.688-0500 [conn1906382] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [10.218.0.30:36001] 
2015-01-29T02:35:03.688-0500 [conn1928623] command XXX.$cmd command: update { $msg: "query not recording (too large)" } ntoreturn:1 keyUpdates:0 numYields:0  reslen:194 4542532ms
2015-01-29T02:35:03.688-0500 [conn1907169] command XXX.$cmd command: insert { $msg: "query not recording (too large)" } ntoreturn:1 keyUpdates:0 numYields:0  reslen:191 19015097ms
2015-01-29T02:35:03.520-0500 [conn1914086] command XXX.$cmd command: update { $msg: "query not recording (too large)" } ntoreturn:1 keyUpdates:0 numYields:0  reslen:194 17432731ms
2015-01-29T02:35:03.689-0500 [conn1928626] command XXX.$cmd command: insert { $msg: "query not recording (too large)" } ntoreturn:1 keyUpdates:0 numYields:0  reslen:191 4542245ms
 
2015-01-29T02:35:03.692-0500 [conn1923358] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [10.90.0.150:55250] 
2015-01-29T02:35:03.692-0500 [conn1904900] command XXX.$cmd command: insert { $msg: "query not recording (too large)" } ntoreturn:1 keyUpdates:0 numYields:0  reslen:191 17324862ms
2015-01-29T02:35:06.214-0500 [migrateThread] SEVERE: Got signal: 6 (Aborted).
 
Backtrace:0x122d881 0x122cc5e 0x3f86c302d0 0x3f86c30265 0x3f86c31d10 0x11abc9a 0xea71ea 0xea286e 0x1023e10 0x1029888 0x1015a69 0x12724d9 0x3f87c0683d 0x3f86cd526d 
 /usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0x122d881]
 /usr/bin/mongod [0x122cc5e]
 /lib64/libc.so.6 [0x3f86c302d0]
 /lib64/libc.so.6(gsignal+0x35) [0x3f86c30265]
 /lib64/libc.so.6(abort+0x110) [0x3f86c31d10]
 /usr/bin/mongod(_ZN5mongo13fassertFailedEi+0x13a) [0x11abc9a]
 /usr/bin/mongod [0xea71ea]
 /usr/bin/mongod(_ZN5mongo5logOpEPKcS1_RKNS_7BSONObjEPS2_PbbPS3_+0xee) [0xea286e]
 /usr/bin/mongod(_ZN5mongo13MigrateStatus3_goEv+0x9f0) [0x1023e10]
 /usr/bin/mongod(_ZN5mongo13MigrateStatus2goEv+0x28) [0x1029888]
 /usr/bin/mongod(_ZN5mongo13migrateThreadEv+0x59) [0x1015a69]
 /usr/bin/mongod [0x12724d9]
 /lib64/libpthread.so.0 [0x3f87c0683d]
 /lib64/libc.so.6(clone+0x6d) [0x3f86cd526d]



 Comments   
Comment by Githook User [ 11/Feb/15 ]

Author: Benety Goh (benety) <benety@mongodb.com>

Message: SERVER-17163 cancel chunk migration on the primary when the node is stepped down
Branch: v2.6
https://github.com/mongodb/mongo/commit/acf726499561a6ed34486bd4efcbf615405f93f8
