[SERVER-36874] Fatal Assertion 40526 while migrating chunks Created: 26/Aug/18  Updated: 20/Mar/19  Resolved: 31/Oct/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.6.5, 3.6.7
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Elan Kugelmass Assignee: Kelsey Schubert
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 16.04.4 LTS (GNU/Linux 4.4.0-1065-aws x86_64)


Attachments: Text File mongo_log.txt    
Issue Links:
Duplicate
duplicates SERVER-35323 sessionId matching ignores userId par... Closed
duplicates SERVER-37735 [3.6] Ensure the full logical session... Closed
Operating System: ALL
Participants:

 Description   

We have a four shard cluster undergoing balancing. Each shard is a three member replica set.

Right after a migration occurs, the to-shard's primary occasionally crashes. On 3.6.7, we estimate that this happens in 1 out of 5,000 migrations. Once a single crash happens, the next migration to the same shard (with a different replica set member as primary) is likely to lead to a crash as well.

We experienced this problem on 3.6.5 as well, but because SERVER-35658 prevented us from balancing at all, we didn't think much of the issue.

Log from the to-shard primary:

2018-08-26T19:34:32.238+0000 I SHARDING [migrateThread] Starting receiving end of migration of chunk <REDACTED> for collection <REDACTED> from rs0/<REDACTED> at epoch 5b244c7c70330ec95919bfff with session id rs0_rs2_5b8300c7f321a73b73bed889
2018-08-26T19:34:32.238+0000 I NETWORK  [migrateThread] Successfully connected to rs0/<REDACTED> (1 connections now open to rs0/<REDACTED> with a 0 second timeout)
2018-08-26T19:34:32.241+0000 I SHARDING [migrateThread] Scheduling deletion of any documents in <REDACTED> range <REDACTED> before migrating in a chunk covering the range
2018-08-26T19:34:32.242+0000 I SHARDING [Collection Range Deleter] No documents remain to delete in <REDACTED> range <REDACTED>
2018-08-26T19:34:32.242+0000 I SHARDING [Collection Range Deleter] Waiting for majority replication of local deletions in <REDACTED> range <REDACTED>
2018-08-26T19:34:32.242+0000 I SHARDING [Collection Range Deleter] Finished deleting documents in <REDACTED> range <REDACTED>
2018-08-26T19:34:32.242+0000 I SHARDING [migrateThread] Finished deleting <REDACTED> range <REDACTED>
2018-08-26T19:34:32.553+0000 I COMMAND  [sessionCatalogMigration-rs0_rs2_5b8300c7f321a73b73bed889] command local.oplog.rs command: find { find: "oplog.rs", filter: { ts: Timestamp(1535228547, 1972), t: 23 }, ntoreturn: 1, singleBatch: true, oplogReplay: true, $db: "local" } planSummary: COLLSCAN keysExamined:0 docsExamined:1 cursorExhausted:1 numYields:1 nreturned:1 reslen:1860 locks:{ Global: { acquireCount: { r: 6 } }, Database: { acquireCount: { r: 3 } }, Collection: { acquireCount: { r: 1 } }, oplog: { acquireCount: { r: 2 } } } protocol:op_msg 112ms
2018-08-26T19:34:32.619+0000 F STORAGE  [sessionCatalogMigration-rs0_rs2_5b8300c7f321a73b73bed889] Statement id 0 from transaction [ { id: UUID("dbcd2345-9052-4e7d-b12d-518b577191d6"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }:350208 ] was committed once with opTime { ts: Timestamp(1535310545, 1153), t: 23 } and a second time with opTime { ts: Timestamp(1534883468, 882), t: 20 }. This indicates possible data corruption or server bug and the process will be terminated.
2018-08-26T19:34:32.619+0000 F -        [sessionCatalogMigration-rs0_rs2_5b8300c7f321a73b73bed889] Fatal Assertion 40526 at src/mongo/db/session.cpp 67
2018-08-26T19:34:32.619+0000 F -        [sessionCatalogMigration-rs0_rs2_5b8300c7f321a73b73bed889] 
 
***aborting after fassert() failure
 
2018-08-26T19:34:32.636+0000 F -        [sessionCatalogMigration-rs0_rs2_5b8300c7f321a73b73bed889] Got signal: 6 (Aborted).
 
 0x5634687c23b1 0x5634687c15c9 0x5634687c1aad 0x7f639ebba390 0x7f639e814428 0x7f639e81602a 0x563466f1d1ae 0x5634674d661a 0x5634674dad9e 0x5634674dcbda 0x5634675cf9bd 0x5634688d1610 0x7f639ebb06ba 0x7f639e8e641d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"563466589000","o":"22393B1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"563466589000","o":"22385C9"},{"b":"563466589000","o":"2238AAD"},{"b":"7F639EBA9000","o":"11390"},{"b":"7F639E7DF000","o":"35428","s":"gsignal"},{"b":"7F639E7DF000","o":"3702A","s":"abort"},{"b":"563466589000","o":"9941AE","s":"_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj"},{"b":"563466589000","o":"F4D61A"},{"b":"563466589000","o":"F51D9E","s":"_ZN5mongo7Session26refreshFromStorageIfNeededEPNS_16OperationContextE"},{"b":"563466589000","o":"F53BDA","s":"_ZN5mongo14SessionCatalog18getOrCreateSessionEPNS_16OperationContextERKNS_16LogicalSessionIdE"},{"b":"563466589000","o":"10469BD","s":"_ZN5mongo34SessionCatalogMigrationDestination31_retrieveSessionStateFromSourceEPNS_14ServiceContextE"},{"b":"563466589000","o":"2348610"},{"b":"7F639EBA9000","o":"76BA"},{"b":"7F639E7DF000","o":"10741D","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.6.7", "gitVersion" : "2628472127e9f1826e02c665c1d93880a204075e", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.4.0-1065-aws", "version" : "#75-Ubuntu SMP Fri Aug 10 11:14:32 UTC 2018", "machine" : "x86_64" }, "somap" : [ { "b" : "563466589000", "elfType" : 3, "buildId" : "5D78F445F57AA961C35C97316819BF42C1939FFF" }, { "b" : "7FFE739E2000", "elfType" : 3, "buildId" : "98B173804DDFA7204D8EC8829DB1D865B54DCD24" }, { "b" : "7F639FD9E000", "path" : "/lib/x86_64-linux-gnu/libresolv.so.2", "elfType" : 3, "buildId" : "6EF73266978476EF9F2FD2CF31E57F4597CB74F8" }, { "b" : "7F639F95A000", "path" : "/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "250E875F74377DFC74DE48BF80CCB237BB4EFF1D" }, { "b" : "7F639F6F1000", "path" : "/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "513282AC7EB386E2C0133FD9E1B6B8A0F38B047D" }, { "b" : "7F639F4ED000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "8CC8D0D119B142D839800BFF71FB71E73AEA7BD4" }, { "b" : "7F639F2E5000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "89C34D7A182387D76D5CDA1F7718F5D58824DFB3" }, { "b" : "7F639EFDC000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "DFB85DE42DAFFD09640C8FE377D572DE3E168920" }, { "b" : "7F639EDC6000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "68220AE2C65D65C1B6AAA12FA6765A6EC2F5F434" }, { "b" : "7F639EBA9000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "CE17E023542265FC11D9BC8F534BB4F070493D30" }, { "b" : "7F639E7DF000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "B5381A457906D279073822A5CEB24C4BFEF94DDB" }, { "b" : "7F639FFB9000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "5D7B6259552275A3C17BD4C3FD05F5A6BF40CAA5" } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x5634687c23b1]
 mongod(+0x22385C9) [0x5634687c15c9]
 mongod(+0x2238AAD) [0x5634687c1aad]
 libpthread.so.0(+0x11390) [0x7f639ebba390]
 libc.so.6(gsignal+0x38) [0x7f639e814428]
 libc.so.6(abort+0x16A) [0x7f639e81602a]
 mongod(_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj+0x0) [0x563466f1d1ae]
 mongod(+0xF4D61A) [0x5634674d661a]
 mongod(_ZN5mongo7Session26refreshFromStorageIfNeededEPNS_16OperationContextE+0x12FE) [0x5634674dad9e]
 mongod(_ZN5mongo14SessionCatalog18getOrCreateSessionEPNS_16OperationContextERKNS_16LogicalSessionIdE+0xDA) [0x5634674dcbda]
 mongod(_ZN5mongo34SessionCatalogMigrationDestination31_retrieveSessionStateFromSourceEPNS_14ServiceContextE+0x100D) [0x5634675cf9bd]
 mongod(+0x2348610) [0x5634688d1610]
 libpthread.so.0(+0x76BA) [0x7f639ebb06ba]
 libc.so.6(clone+0x6D) [0x7f639e8e641d]
-----  END BACKTRACE  -----
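
The opTimes in the fatal log line identify the two oplog entries that were recorded for the same statement id. A minimal sketch of how one might pull them up for inspection on the to-shard primary (timestamps copied from the log above; the entries may already have rolled off the oplog):

use local
// Entries that fassert 40526 reported for statement id 0 of the session's transaction
db.oplog.rs.find({ ts: Timestamp(1535310545, 1153) }).pretty()
db.oplog.rs.find({ ts: Timestamp(1534883468, 882) }).pretty()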



 Comments   
Comment by Kelsey Schubert [ 20/Mar/19 ]

Hi zhicheng,

I've looked at the logs you've provided. It appears that you're encountering SERVER-39847, which will be included in the next release of MongoDB. There is currently a release candidate available for download, MongoDB 4.0.7-rc1, and I expect the GA version to be available within a week. Until you're able to upgrade, I would suggest disabling the balancer.

Going forward, it's generally best to open a new ticket so we can track issues appropriately.

Kind regards,
Kelsey

Comment by Zhicheng Long [ 20/Mar/19 ]

Hi kelsey.schubert ,

Appreciate all the efforts on resolving this issue.

We are currently using MongoDB 4.0.5, but it seems the same issue still occurs. I've attached the log message below. Is it due to the same cause or another bug?

mongo_log.txt

 

Comment by Kelsey Schubert [ 31/Oct/18 ]

Hi epkugelmass,

Thanks for the additional information. I'm reopening this ticket as we've identified the root cause, and the fixes will be in the next release of MongoDB (3.6.9 and 4.0.4). If you would like to re-enable retryable writes, please be sure to upgrade first to take advantage of these fixes.

For others encountering this issue, I would recommend first disabling the balancer as a workaround, since this issue requires both retryable writes and active migrations to trigger.
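
For reference, a minimal sketch of that workaround using the standard shell helpers, run from a mongos:

// Stop the balancer until you have upgraded to a fixed version
sh.stopBalancer()
sh.getBalancerState()   // should now return false
// After upgrading (3.6.9 / 4.0.4 or later), re-enable it
sh.startBalancer()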

Kind regards,
Kelsey

Comment by Elan Kugelmass [ 26/Oct/18 ]

Hi Kelsey,

We turned off retryable writes and this problem stopped occurring. It's a reasonable workaround for our use case, so we're going to stop digging at the actual problem.
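
For anyone else applying this workaround: retryable writes are opt-in per client in 3.6, so turning them off amounts to removing retryWrites=true from (or setting retryWrites=false in) the application's connection string. A hypothetical URI, for illustration only:

mongodb://mongos1.example.net:27017,mongos2.example.net:27017/mydb?retryWrites=false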

Elan

Comment by Kelsey Schubert [ 25/Oct/18 ]

Hi epkugelmass,

We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Regards,
Kelsey

Comment by Nick Brewer [ 17/Sep/18 ]

epkugelmass Sounds good, thanks.

-Nick

Comment by Elan Kugelmass [ 17/Sep/18 ]

Hi Nick,

I'm taking this ticket back over. I need some time to generate the data you've requested. Should be no more than a couple of days.

Elan

Comment by Nick Brewer [ 12/Sep/18 ]

vkreddy Unfortunately we can't match these oplog entries up with the occurrence that was described, as those oplog entries have since been removed. However if you've experienced the issue on this node since then, we may be able to match it up to a time in the oplog data that we have available; in this case it would also be useful to see the mongod log from that time.

Thanks,
-Nick

Comment by Karthik Reddy Vadde [ 10/Sep/18 ]

Sorry about the delay. Thanks for the commands to retrieve the oplog. I have uploaded the oplog and diagnostics data to the secure portal. Please let me know if you need more information.

Comment by Nick Brewer [ 30/Aug/18 ]

vkreddy I apologize; exporting a view in this way isn't going to work on a sharded cluster. However, there's another way we can sanitize the oplog information before exporting it:
From the local database on the mongod, use aggregation to create a new collection from the oplog.rs collection, with the desired field removed:

db.oplog.rs.aggregate([ { $project: {"o": 0}}, {$out: "redacted"}])

Move the collection from the "local" database to a new database (note that this needs to be run on the mongod, from the "admin" database):

db.runCommand({renameCollection: 'local.redacted', to: 'test.redacted'})

Then run mongodump against the mongos, specifying something like:

mongodump -h mongos-address:27017 -d test -c redacted -o redacted_oplog_dump

Thanks,
-Nick

Comment by Karthik Reddy Vadde [ 28/Aug/18 ]

Nick, 

I am Elan's colleague and I am working with him on sharding mongo. I created the view and tried to get the dump as suggested, using the following command:

mongodump -h <hostname_of_toshard_primary>:27018 -d local -c redactedView -o oplog_dump --viewsAsCollections

However, I ran into the following error:

2018-08-28T17:56:27.969+0000 writing local.redactedView to

2018-08-28T17:56:27.969+0000 Failed: error reading from db: Command on view must be executed by mongos.

We are not sure how to proceed. Appreciate your help on this.

Comment by Nick Brewer [ 27/Aug/18 ]

epkugelmass You can create a view that redacts the "o" field:

use local
 
db.createView(
    "redactedView",
    "oplog.rs",
    [ { $project: { "o": 0 } } ]
)

This creates a new redactedView view with the specified field removed - you can then perform the mongodump against this view instead.

The diagnostic.data directory does not contain database/collection fields or details, it merely collects performance information that can be used by our internal tooling to identify issues.

-Nick

Comment by Elan Kugelmass [ 27/Aug/18 ]

Hi nick.brewer, I'd be happy to get you that data. Do you have any suggestions on a way to redact pieces of the oplog? I see an option in mongodump to add a filter/where clause, but not one to do a projection. I need to remove "o":

{my employer's data}

before I can share the file.

Can you also tell me what's collected in diagnostic.data?

Quick update on the issue: This is now happening around once an hour (might correspond to a cyclical increase in application activity for us). We've turned off balancing as a result. We're considering turning off retryable writes and then seeing what happens.

Comment by Nick Brewer [ 27/Aug/18 ]

epkugelmass We'd like to see the oplog data for the crashing to-shard. If you could run the following command and supply us with the output file:

mongodump -h <hostname>:<port> -d local -c oplog.rs -o oplog_dump

Additionally, if you could archive (tar or zip) the dbpath/diagnostic.data directory for this mongod, along with the log file from the time this behavior occurs, and upload them as well.

You can upload this information to our secure portal. Information shared there is only available to MongoDB employees, and is automatically removed after a period of time.

Thanks,
Nick
