Priority: Major - P3
Affects Version/s: None
Fix Version/s: None
We are running a MongoDB instance on an Azure VM with default settings. We have noticed that Azure tends to close a socket connection that has been inactive for several minutes. When we try to initial sync from the MongoDB instance on the Azure VM to another replica set member, the sync always fails: the connection is dropped whenever there is no network traffic for several minutes (e.g., while the instance that is starting up is building an index), and initial sync then starts all over again.
A sample log snippet:
[building index here...]
2017-08-17T19:06:30.632+0800 I NETWORK [rsSync] Socket recv() errno:10053 An established connection was aborted by the software in your host machine. [***.***.***.***:*****]
2017-08-17T19:06:30.632+0800 I NETWORK [rsSync] SocketException: remote: (NONE):0 error: 9001 socket exception [RECV_ERROR] server [***.***.***.***:*****]
2017-08-17T19:06:30.632+0800 I NETWORK [rsSync] DBClientCursor::init call() failed
2017-08-17T19:06:30.640+0800 E REPL [rsSync] 13386 socket error for mapping query
2017-08-17T19:06:30.640+0800 E REPL [rsSync] initial sync attempt failed, 9 attempts remaining
2017-08-17T19:06:35.641+0800 I REPL [rsSync] initial sync pending
2017-08-17T19:06:35.643+0800 I REPL [ReplicationExecutor] syncing from: <HOSTNAME>:*****
2017-08-17T19:06:36.454+0800 I REPL [rsSync] initial sync drop all databases
2017-08-17T19:06:36.454+0800 I STORAGE [rsSync] dropAllDatabasesExceptLocal 14
2017-08-17T19:06:43.928+0800 I REPL [rsSync] initial sync clone all databases
For a MongoDB client, this can be resolved by setting MaxConnectionIdleTime, but there seems to be no way to configure the same for replica set members. As a result, Azure users who do not tweak OS settings will find it hard to sync data from an Azure VM to another replica set member.
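For reference, the OS settings in question are the TCP keepalive timers, which make an otherwise idle connection send periodic probes so that it is not treated as inactive. The sketch below shows the same mechanism at the level of a single socket, using only Python's standard `socket` module. It does not configure mongod in any way; the function name `enable_keepalive` and the timer values are illustrative assumptions, and the idle time would need to be shorter than whatever idle timeout the network applies.

```python
import socket


def enable_keepalive(sock, idle_s=120, interval_s=30, probes=5):
    """Enable TCP keepalive on sock so an idle connection still sends probes.

    idle_s, interval_s and probes are illustrative values, not recommendations.
    """
    # Portable switch: ask the OS to send keepalive probes on this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    if hasattr(socket, "SIO_KEEPALIVE_VALS"):
        # Windows: (on/off, idle time in ms, probe interval in ms).
        sock.ioctl(socket.SIO_KEEPALIVE_VALS,
                   (1, idle_s * 1000, interval_s * 1000))
    else:
        # Linux exposes per-socket timers; other platforms may lack some,
        # so each option is applied only if this platform defines it.
        for name, value in (("TCP_KEEPIDLE", idle_s),
                            ("TCP_KEEPINTVL", interval_s),
                            ("TCP_KEEPCNT", probes)):
            opt = getattr(socket, name, None)
            if opt is not None:
                sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keepalive enabled:",
          s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

An application can only set these options on sockets it owns, which is why the workaround for a replica set member currently has to be applied system-wide at the OS level.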
Can we have an option to either specify a maximum connection idle time for replica sets, or make initial sync not fail completely on a single connection failure?