Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: None
Component/s: Sharding
Labels: None
Assigned Teams: Sharding
Operating System: ALL
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

mongos can get into a hung state in a sharded cluster with a replica set shard. If the shard primary continues to accept connections but does not respond to any other requests, failover occurs and a new primary is elected normally, but sharded queries via mongos block and never return.

This failure mode has been observed on EC2.
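
The condition described above (a peer that still accepts TCP connections but never answers) can be illustrated outside MongoDB. The following is a minimal sketch, not mongos code: `start_silent_server` and `query_with_timeout` are hypothetical helper names, and the client-side read timeout is an assumption added only to make the stall observable instead of blocking forever.

```python
import socket
import threading


def start_silent_server():
    """Listen on an ephemeral localhost port; accept connections but never reply."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(5)
    held = []  # keep accepted sockets referenced so the connections stay open

    def accept_loop():
        while True:
            try:
                conn, _ = listener.accept()
            except OSError:
                return  # listener was closed
            held.append(conn)  # accepted, but nothing is ever sent back

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener, held


def query_with_timeout(port, timeout_s):
    """Send a request and wait for a reply; return 'reply', 'closed' or 'timeout'."""
    with socket.create_connection(("127.0.0.1", port), timeout=timeout_s) as client:
        client.sendall(b"ping")
        try:
            data = client.recv(1024)  # would block indefinitely without a timeout
        except socket.timeout:
            return "timeout"
        return "reply" if data else "closed"


if __name__ == "__main__":
    listener, _held = start_silent_server()
    port = listener.getsockname()[1]
    # The connection succeeds, but the read never completes.
    print(query_with_timeout(port, 0.5))
```

The connect step succeeds, so a client that only checks whether it can connect sees a healthy peer; only a bounded wait on the reply reveals the stall. Without that bound, the `recv` call simply blocks, which is analogous to the hang reported here.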

To reproduce locally:
1) Run the attached script (sets up a sharded cluster with a sharded collection)
2) Run mongo localhost:31000 to connect to the mongos
3) > use foo
4) > db.bar.find().itcount()
5) Stop all data transfer on established connections to the primary via iptables:
sudo /sbin/iptables -A INPUT -i lo -p tcp -m tcp --dport <mongod primary port> -m conntrack --ctstate ESTABLISHED -j DROP
6) (wait for failover)
7) > db.bar.find().itcount()


hang_setup.js
0.3 kB
Oct 18 2011 03:01:07 PM UTC

is related to

SERVER-4661 Mongos doesn't detect primary change if old primary lost network connectivity

Closed

related to

SERVER-7862 Connection timeouts in mongos

Closed

Assignee: [DO NOT USE] Backlog - Sharding Team
Reporter: Greg Studer (Inactive)
Participants: [DO NOT USE] Backlog - Sharding Team, Eric Milkie, Greg Studer, Ratika Gandhi
Votes: 4
Watchers: 8

Created: Oct 18 2011 03:01:07 PM UTC
Updated: Dec 06 2022 05:39:56 AM UTC
Resolved: May 31 2019 06:26:40 PM UTC
