[SERVER-35222] Crash on the config server at expired session cleanup Created: 25/May/18 Updated: 29/Oct/23 Resolved: 28/Aug/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.6.0, 4.0.0 |
| Fix Version/s: | 3.6.9, 4.0.3, 4.1.3 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | PARK-MinSoo [X] | Assignee: | Randolph Tan |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||
| Backport Requested: | v4.0, v3.6 | ||||||||||||||||||||
| Sprint: | Sharding 2018-06-18, Sharding 2018-07-16, Sharding 2018-07-30, Sharding 2018-08-13, Sharding 2018-09-10 | ||||||||||||||||||||
| Linked BF Score: | 52 | ||||||||||||||||||||
| Description |
|
Hello. Our MongoDB server was in operation when it crashed with the stack dump below. Please take a look. Thank you.
mongod version: 3.6.0, OS: CentOS Linux 7.3.1611 |
| Comments |
| Comment by Park YoungSoo [ 05/Apr/19 ] |
|
Hi, did the fix make it into the versions listed above (3.6.9, 4.0.3, 4.1.3)?
Do you have any explanation of the cause of the bug?
Thank you. |
| Comment by Githook User [ 19/Sep/18 ] |
|
Author: {'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'} Message: (cherry picked from commit ce0602665adb7ec7d241dd77e585f7907e405e84) |
| Comment by Githook User [ 13/Sep/18 ] |
|
Author: {'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'} Message: (cherry picked from commit ce0602665adb7ec7d241dd77e585f7907e405e84) |
| Comment by Githook User [ 28/Aug/18 ] |
|
Author: {'name': 'Randolph Tan', 'email': 'randolph@10gen.com', 'username': 'renctan'}Message: |
| Comment by Kaloian Manassiev [ 28/Jun/18 ] |
|
It doesn't seem right that the reaper code should be using separate code paths for whether it is going against itself versus against a shard. jack.mulrow/renctan, can you guys please figure out if there is a cleaner way to solve this? |
| Comment by Misha Tyulenev [ 25/May/18 ] |
|
Hey kaloian.manassiev. ShardRegistry keeps shards built by ShardFactory according to their ConnectionString type. If the type is local, it builds a ShardLocal, which has no targeter or ReplicaSetMonitor, so the invariant that was hit is expected. This all boils down to the initialization code, which for config servers creates the corresponding shards as local. |
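The factory behaviour described in this comment can be illustrated with a minimal C++ sketch. It is not the server's code: `ConnectionType`, `Targeter`, `LocalShard`, `RemoteShard`, and `makeShard` are hypothetical stand-ins, and a thrown `std::logic_error` stands in for the invariant so that the failure can be observed instead of aborting the process.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical, simplified stand-ins; these are NOT the MongoDB server classes.
enum class ConnectionType { Local, ReplicaSet };

struct Targeter {
    std::string setName;  // name of the replica set this targeter would resolve
};

class Shard {
public:
    virtual ~Shard() = default;
    virtual bool isLocal() const = 0;
    virtual const Targeter& getTargeter() const = 0;
};

// A shard reached over the network: it owns a targeter.
class RemoteShard : public Shard {
public:
    explicit RemoteShard(std::string setName) : _targeter{std::move(setName)} {}
    bool isLocal() const override { return false; }
    const Targeter& getTargeter() const override { return _targeter; }

private:
    Targeter _targeter;
};

// A shard that represents the node itself: there is nothing to target.
class LocalShard : public Shard {
public:
    bool isLocal() const override { return true; }
    const Targeter& getTargeter() const override {
        // Assumption: an exception replaces the invariant used by the real server.
        throw std::logic_error("getTargeter() called on a local shard");
    }
};

// Picks the shard implementation from the connection type alone.
std::unique_ptr<Shard> makeShard(ConnectionType type, const std::string& setName) {
    if (type == ConnectionType::Local) {
        return std::make_unique<LocalShard>();
    }
    return std::make_unique<RemoteShard>(setName);
}
```

In this sketch, any caller that asks for a targeter without first checking `isLocal()` fails on a shard built from a local connection type, which mirrors the situation described above.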
| Comment by Kaloian Manassiev [ 25/May/18 ] |
|
Hi misha.tyulenev, From the call stack I think this might be happening when there is a session created against the config server (e.g., customer write with a session against the config/admin databases). Then these sessions expire and when the reaper goes to clean them up, it ends up using the write commands code, which does targeting by calling Shard::getTargeter and this is not allowed to be called on the config server. -Kal. |
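The sequence suspected in this comment can be modelled with a short C++ sketch. Everything in it is an assumption made for illustration: `Session`, `Node`, `findExpired`, `removeViaTargetedWrite`, and `reapExpired` are hypothetical names, the expiry rule is a plain idle-time comparison, and an exception stands in for the server invariant. It only shows why the failure would appear once an expired session actually exists on the config server.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified model; names and types are illustrative only.
struct Session {
    std::string id;
    long lastUseSecs;  // time of last use, in seconds
};

struct Node {
    bool isConfigServer;
};

// Returns the ids of sessions that have been idle for longer than ttlSecs.
std::vector<std::string> findExpired(const std::vector<Session>& sessions,
                                     long nowSecs,
                                     long ttlSecs) {
    std::vector<std::string> expired;
    for (const auto& s : sessions) {
        if (nowSecs - s.lastUseSecs > ttlSecs) {
            expired.push_back(s.id);
        }
    }
    return expired;
}

// Stand-in for a write path that always resolves a target before writing.
// Assumption: no write is issued when there is nothing to remove.
std::size_t removeViaTargetedWrite(const Node& node, const std::vector<std::string>& ids) {
    if (ids.empty()) {
        return 0;
    }
    if (node.isConfigServer) {
        // Assumption: an exception replaces the invariant used by the real server.
        throw std::logic_error("targeting is not available on the config server");
    }
    return ids.size();
}

// The reaper: find expired sessions, then remove them through the write path.
std::size_t reapExpired(const Node& node,
                        const std::vector<Session>& sessions,
                        long nowSecs,
                        long ttlSecs) {
    return removeViaTargetedWrite(node, findExpired(sessions, nowSecs, ttlSecs));
}
```

Under these assumptions, a config server with no expired sessions reaps without incident, and the first expired session sends the reaper down the targeted write path, where it fails.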
| Comment by Kaloian Manassiev [ 25/May/18 ] |
|
Hi TheCoin, Thank you very much for your report! In your comment above you mention that you might have a backup of the cluster from before the crash. Would it be possible to upload the part which contains the config server's data? It is not a big deal if you don't, but it might help us diagnose this issue faster. Also, does your application by any chance write to the config or admin databases of the cluster? Thank you in advance. -Kal. |
| Comment by PARK-MinSoo [X] [ 25/May/18 ] |
|
Unfortunately, the entire log file was not retained, so I will upload the log from around the time of the crash, as requested.
Also, the mongos log message <Refresh for collection config.system.sessions took 0 and found the collection is not sharded> kept appearing.
The entry for config.system.sessions was missing from config.collections, so I ran the following to re-shard config.system.sessions:
db.chunks.remove({_id:"config.system.sessions-_id_MinKey"})
db.runCommand({shardcollection:"config.system.sessions",key:{_id:"hashed"}})
After that, the 'not sharded' log message no longer appeared.
Thank you for your help, Park-Minsoo |
| Comment by Kelsey Schubert [ 25/May/18 ] |
|
Hi TheCoin, So we can continue to investigate, would you please upload the complete log file of the affected node? I'd like to see what was happening before the mongod crashed. Thank you for your help, |