[SERVER-18521] replica in STARTUP2 state cannot be stopped Created: 18/May/15  Updated: 09/Jun/15  Resolved: 09/Jun/15

Status: Closed
Project: Core Server
Component/s: Admin
Affects Version/s: 3.0.3
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Alex Kotenko Assignee: Ramon Fernandez Marina
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-17464 Check for shutdown state more regular... Closed
Related
related to SERVER-4766 Make initial sync restartable per col... Closed
Operating System: ALL
Participants:

 Description   

Once it has entered STARTUP2, a server cannot be stopped cleanly; it can only be killed via `kill -9 <pid>`. It will probably stop gracefully once it is through its STARTUP2 phase and into RECOVERING or SECONDARY, but STARTUP2 may be quite a lengthy process (days) if large databases are to be replicated.
Also, once restarted after an unclean shutdown, the secondary wipes out all half-replicated files and starts over again from scratch.
This makes initial replication essentially one gigantic atomic operation: extremely lengthy, unmanageable and unreliable.
This behaviour should be either documented or (ideally) fixed, making initial replication an interruptible process.
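
A minimal sketch of the sequence involved (mongo shell against the affected secondary; the 60-second timeout is illustrative):

  // check the member's current replica set state (5 = STARTUP2)
  rs.status().myState
  // request a clean shutdown; while in STARTUP2 this call may block for a very long time
  db.getSiblingDB("admin").shutdownServer({ timeoutSecs: 60 })
  // last resort from the OS shell once the clean shutdown never returns: kill -9 <pid>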



 Comments   
Comment by Scott Hernandez (Inactive) [ 09/Jun/15 ]

Our current replication system ensures complete replication of all collections and databases with a single serial order of writes via the oplog for the whole replica set. This means that a failure in any collection can stop replication. In addition, a fatal failure in the storage system of one replica should only cause that single instance, not the whole replica set, to fail. The initial sync process when bringing up or adding a node can unfortunately expose data corruption in the source replica; this can be very problematic when there are only a small number of replicas.

SERVER-17464 targets improving the initial sync shutdown responsiveness, and SERVER-4766 would make initial sync a restartable process. Please feel free to comment and vote on those tickets, or watch them to stay aware of any updates and progress.

I am closing this as a duplicate since the issues are covered elsewhere.
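
For illustration only, the single serial stream of writes described above is visible in the local.oplog.rs collection on any member (a sketch; the exact output shape varies by version):

  // most recent oplog entry on this member
  db.getSiblingDB("local").oplog.rs.find().sort({ $natural: -1 }).limit(1).pretty()
  // each entry carries ts (its position in the stream), op (i/u/d/c/n), ns, and the
  // document or update in o -- the same { ts, h, v, op, ns, o } shape quoted in the
  // crash log later in this ticket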

Comment by Alex Kotenko [ 19/May/15 ]

Not worth the hassle. I'm not that worried about the file corruption as such; I'm much more bothered by the fact that a single corrupted file is not handled in isolation, affecting only certain collections, but instead causes the whole system to crash, roll back and start from scratch. That poses a serious problem for Mongo as a database system.

Comment by Scott Hernandez (Inactive) [ 19/May/15 ]

It looks like some form of data corruption in the files. If you want to file a new server issue we can take a look, but it would require access to the files and the history of the replica, its hardware, storage and so on, to get an idea of what happened when. For corruption like this it is sometimes very hard to find the event in the OS/filesystem/server where the bad data was introduced, since you will not notice until the database process accesses the data. This is the same thing I've seen on many file servers, where infrequently accessed files are corrupted but it isn't noticed until backup/archival disaster-recovery testing.
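
One way to surface this kind of latent corruption before (re)syncing is to run a full validation on the suspect collection on the sync source (a sketch, reusing the namespace from the crash log below; a full validate reads every document and index entry, so it is slow on large collections):

  // run on the sync source; full: true forces a complete scan of the collection
  db.getSiblingDB("statsAPI2").runCommand({ validate: "basic.campaign_statistics.campaign_1120", full: true })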

Comment by Alex Kotenko [ 19/May/15 ]

Ok, I'll get that reproduced and logged tomorrow.

Btw, my secondary MongoDB 3.0.3 instance has just crashed again (4th time) with an identical exception, see below:

 0xf51949 0xef1671 0xed693f 0xed69ec 0x834adf 0x915f72 0x9132ab 0xa0fcb5 0xa3bd4c 0xbd26a4 0xbd2a54 0xbd308d 0xb23a21 0xc3cf6a 0xc944be 0xc96ca5 0xee848b 0xf9f9b4 0x7f4c36fd1182 0x7f4c35a9947d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"B51949"},{"b":"400000","o":"AF1671"},{"b":"400000","o":"AD693F"},{"b":"400000","o":"AD69EC"},{"b":"400000","o":"434ADF"},{"b":"400000","o":"515F72"},{"b":"400000","o":"5132AB"},{"b":"400000","o":"60FCB5"},{"b":"400000","o":"63BD4C"},{"b":"400000","o":"7D26A4"},{"b":"400000","o":"7D2A54"},{"b":"400000","o":"7D308D"},{"b":"400000","o":"723A21"},{"b":"400000","o":"83CF6A"},{"b":"400000","o":"8944BE"},{"b":"400000","o":"896CA5"},{"b":"400000","o":"AE848B"},{"b":"400000","o":"B9F9B4"},{"b":"7F4C36FC9000","o":"8182"},{"b":"7F4C3599F000","o":"FA47D"}],"processInfo":{ "mongodbVersion" : "3.0.3", "gitVersion" : "b40106b36eecd1b4407eb1ad1af6bc60593c6105", "uname" : { "sysname" : "Linux", "release" : "3.16.0-37-generic", "version" : "#51~14.04.1-Ubuntu SMP Wed May 6 15:23:14 UTC 2015", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "F56F80CB96B4DBFC070BEB0ADAC7D6B274BFC6B1" }, { "b" : "7FFC4F865000", "elfType" : 3, "buildId" : "EFA4172E98FE6C90166BC69E025FF58D9E646CC1" }, { "b" : "7F4C36FC9000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "9318E8AF0BFBE444731BB0461202EF57F7C39542" }, { "b" : "7F4C36D6B000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "FF43D0947510134A8A494063A3C1CF3CEBB27791" }, { "b" : "7F4C36990000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "B927879B878D90DD9FF4B15B00E7799AA8E0272F" }, { "b" : "7F4C36788000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "92FCF41EFE012D6186E31A59AD05BDBB487769AB" }, { "b" : "7F4C36584000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "C1AE4CB7195D337A77A3C689051DABAA3980CA0C" }, { "b" : "7F4C36280000", "path" : "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", "elfType" : 3, "buildId" : "19EFDDAB11B3BF5C71570078C59F91CF6592CE9E" }, { "b" : "7F4C35F7A000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "1D76B71E905CB867B27CEF230FCB20F01A3178F5" }, { "b" : "7F4C35D64000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "8D0AA71411580EE6C08809695C3984769F25725B" }, { "b" : "7F4C3599F000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "30C94DC66A1FE95180C3D68D2B89E576D5AE213C" }, { "b" : "7F4C371E7000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "9F00581AB3C73E3AEA35995A0C50D24D59A01D47" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf51949]
 mongod(_ZN5mongo10logContextEPKc+0xE1) [0xef1671]
 mongod(_ZN5mongo11msgassertedEiPKc+0xAF) [0xed693f]
 mongod(+0xAD69EC) [0xed69ec]
 mongod(_ZNK5mongo7BSONObj14_assertInvalidEv+0x3DF) [0x834adf]
 mongod(_ZN5mongo10RecordData13releaseToBsonEv+0x52) [0x915f72]
 mongod(_ZNK5mongo10Collection6docForEPNS_16OperationContextERKNS_8RecordIdE+0x2B) [0x9132ab]
 mongod(_ZN5mongo11IDHackStage4workEPm+0x185) [0xa0fcb5]
 mongod(_ZN5mongo11UpdateStage4workEPm+0x7C) [0xa3bd4c]
 mongod(_ZN5mongo12PlanExecutor18getNextSnapshottedEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0xA4) [0xbd26a4]
 mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x34) [0xbd2a54]
 mongod(_ZN5mongo12PlanExecutor11executePlanEv+0x3D) [0xbd308d]
 mongod(_ZN5mongo6updateEPNS_16OperationContextEPNS_8DatabaseERKNS_13UpdateRequestEPNS_7OpDebugE+0x111) [0xb23a21]
 mongod(_ZN5mongo4repl21applyOperation_inlockEPNS_16OperationContextEPNS_8DatabaseERKNS_7BSONObjEbb+0x12CA) [0xc3cf6a]
 mongod(_ZN5mongo4repl8SyncTail9syncApplyEPNS_16OperationContextERKNS_7BSONObjEb+0x2EE) [0xc944be]
 mongod(_ZN5mongo4repl14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x65) [0xc96ca5]
 mongod(_ZN5mongo10threadpool6Worker4loopERKSs+0x2FB) [0xee848b]
 mongod(+0xB9F9B4) [0xf9f9b4]
 libpthread.so.0(+0x8182) [0x7f4c36fd1182]
 libc.so.6(clone+0x6D) [0x7f4c35a9947d]
-----  END BACKTRACE  -----
2015-05-19T15:06:02.507+0000 E REPL     [repl writer worker 6] writer worker caught exception:  :: caused by :: 10334 BSONObj size: 65990382 (0x3EEEEEE) is invalid. Size must be between 0 and 16793600(16MB) First element: id: ?type=95 on: { ts: Timestamp 1432021029000|352, h: 4718410809781437365, v: 2, op: "i", ns: "statsAPI2.basic.campaign_statistics.campaign_1120", o: { _id: { date: "2015-05-18", banner: "1", country: "NO", offer: 11645.0, source: 5990.0, click: 1.0 }, value: { clicks: 2.0, redirects: 2.0, cost: 0.0, source_clicks: 2.909090909090909, source_impressions: 584.363636, source_real_impressions: 0.0, source_cost: 0.011636, source_real_cost: 0.0, conversions: 0.0, revenue: 0.0, source_ipc: 292.181818, source_real_ipc: 0.0, source_cpc: 0.005818, source_real_cpc: 0.0 } } }
2015-05-19T15:06:02.507+0000 I -        [repl writer worker 6] Fatal Assertion 16360
2015-05-19T15:06:02.507+0000 I -        [repl writer worker 6] 
 
***aborting after fassert() failure

This happened multiple times during my current migration process. I handled it by just dropping the problematic collections on the Primary, though that's far from ideal, especially considering that these crashes mean a full rollback and a replication restart from scratch (3TB, over 30 hours). I also wonder how this got into the old DB in the first place, since it has the same max BSON object size limit. I'm migrating from 2.6.7 to 3.0.3.
Not sure if this is of interest.
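
If the oversized length is a real document rather than a corrupted length prefix, a rough way to look for offending documents on the source before resyncing is a full scan with Object.bsonsize() (a sketch; slow, and the namespace is the one from the log above):

  // flag any document whose serialized size exceeds the 16MB BSON limit
  var maxBytes = 16 * 1024 * 1024;
  db.getSiblingDB("statsAPI2")
    .getCollection("basic.campaign_statistics.campaign_1120")
    .find()
    .forEach(function (doc) {
      if (Object.bsonsize(doc) > maxBytes) { printjson(doc._id); }
    });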

Comment by Scott Hernandez (Inactive) [ 19/May/15 ]

The default log level should be fine. The big parts of initial sync are logged by default.

Comment by Alex Kotenko [ 18/May/15 ]

Shall I set the general verbosity level to maximum before reproducing this, or do you need only some particular areas?
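
For reference, a sketch of how verbosity could be raised for just the replication component in 3.0, should it be requested (the level 2 here is illustrative):

  // per-component verbosity instead of the global level
  db.setLogLevel(2, "replication")
  // equivalent via setParameter
  db.adminCommand({ setParameter: 1, logComponentVerbosity: { replication: { verbosity: 2 } } })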

Comment by Scott Hernandez (Inactive) [ 18/May/15 ]

Alex, I understand it can be frustrating when things are not responsive, and we are working across the board to make improvements. To help us understand what happened, can you provide an example or logs, please? Please include the logs from when you issue the shutdown, so we can see what is happening at that time and after. A list of events and the commands run would help as well.

It is expected that a failed initial sync must restart from the beginning and discard partial data, due to how replication and replica sources work.

There are many places during initial sync with long-running operations which a clean shutdown will wait for. In particular, the collection data cloning or index building may wait for each db/collection to finish before shutting down.

I can assure you that during initial sync (in STARTUP2) there are many places where a shutdown will abort the process and exit, but we are working to make shutdown more responsive while keeping our initial sync behavior consistent between versions.
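
As a possible interim workaround (a sketch, not something confirmed in this ticket), the shutdown command can be issued explicitly rather than resorting to kill -9; whether it returns promptly during STARTUP2 is exactly what SERVER-17464 tracks:

  // force and timeoutSecs are standard options of the shutdown command; the value 10 is illustrative
  db.getSiblingDB("admin").runCommand({ shutdown: 1, force: true, timeoutSecs: 10 })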

Generated at Thu Feb 08 03:47:57 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.