-
Type:
Bug
-
Resolution: Duplicate
-
Priority:
Major - P3
-
None
-
Affects Version/s: 4.0.1
-
Component/s: Sharding
-
None
-
Sharding
-
ALL
-
None
-
None
-
None
-
None
-
None
-
None
-
None
In a sharded cluster, a long running query can cause a shard to refresh the routing table history multiple times. If the sharded cluster is very large, this routing table history can take up a large amount of space and eventually lead to an OOM.
Here is a snapshot of call stacks that show 6.5 GB being used solely to update the routing table history.
Here is the balancer information:
balancer:
Currently enabled: yes
Currently running: yes
Collections with active migrations:
buildlogs.logs started at Tue Jan 08 2019 21:57:37 GMT+0000 (UTC)
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
2990 : Success
1 : Failed with error 'aborted', from logkeeperdb-shard_26 to logkeeperdb-shard_24
1 : Failed with error 'aborted', from logkeeperdb-shard_26 to logkeeperdb-shard_21
2 : Failed with error 'aborted', from logkeeperdb-shard_14 to logkeeperdb-rs0
1 : Failed with error 'aborted', from logkeeperdb-shard_9 to logkeeperdb-shard_18
1 : Failed with error 'aborted', from logkeeperdb-shard_15 to logkeeperdb-shard_21
1 : Failed with error 'aborted', from logkeeperdb-shard_15 to logkeeperdb-shard_17
1 : Failed with error 'aborted', from logkeeperdb-shard_15 to logkeeperdb-shard_22
1 : Failed with error 'aborted', from logkeeperdb-shard_17 to logkeeperdb-shard_12
1 : Failed with error 'aborted', from logkeeperdb-shard_14 to logkeeperdb-shard_4
1 : Failed with error 'aborted', from logkeeperdb-shard_22 to logkeeperdb-shard_13
1 : Failed with error 'aborted', from logkeeperdb-shard_13 to logkeeperdb-shard_8
1 : Failed with error 'aborted', from logkeeperdb-shard_17 to logkeeperdb-shard_20
mongos> db.chunks.find({ns: "buildlogs.logs"}).count()
1303476
- duplicates
-
SERVER-36443 Long-running queries should not cause a build-up of unused ChunkManager objects
-
- Closed
-