Details
-
Bug
-
Resolution: Done
-
Major - P3
-
None
-
2.5.0
-
None
-
Sharding
-
ALL
Description
Note - the correct behavior here may need more discussion.
If a write fails on the first (or second) config server while the version is being written in the critical section of a migrate, mongod can throw an exception that is caught by the critical section recovery code (see DBClientInterface::findN; it looks like SCC::findOne does not expect this to happen). Writes to the subsequent servers are then not performed, potentially resulting in inconsistency between the servers.
The check afterwards (which reads back the version written and shuts down the server if it differs) does not catch the problem if the write to the failing server actually went through despite the reported error. This happens in the case of a config server timeout, for example.
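To make the failure mode concrete, here is a minimal, self-contained C++ simulation. It is only a sketch and is not mongod code: FakeConfigServer, writeVersionToAll, checkSeesExpected and allAgree are hypothetical names invented for illustration. It also assumes that the post-write check reads the version back from the first server, and it models a timeout as a write that is applied before the error is raised.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a single config server (not a MongoDB class).
struct FakeConfigServer {
    int version = 1;
    bool timeoutAfterApply = false;  // assumption: write is applied, reply is lost

    void writeVersion(int v) {
        version = v;  // the write goes through...
        if (timeoutAfterApply) {
            throw std::runtime_error("socket timeout");  // ...but the caller sees a failure
        }
    }
};

// Writes to each server in order and stops at the first exception,
// mirroring the behavior described in this ticket.
bool writeVersionToAll(std::vector<FakeConfigServer>& servers, int v) {
    try {
        for (auto& s : servers) {
            s.writeVersion(v);
        }
        return true;
    } catch (const std::exception&) {
        return false;  // subsequent servers are never written
    }
}

// Post-write check, modeled (as an assumption) as a read from the first server.
bool checkSeesExpected(const std::vector<FakeConfigServer>& servers, int expected) {
    return servers.front().version == expected;
}

// True only if every server holds the same version.
bool allAgree(const std::vector<FakeConfigServer>& servers) {
    for (const auto& s : servers) {
        if (s.version != servers.front().version) return false;
    }
    return true;
}
```

With three servers at version 1 and timeoutAfterApply set on the first, writeVersionToAll(servers, 2) returns false, checkSeesExpected(servers, 2) still returns true, and allAgree(servers) returns false: the check passes although the second and third servers were never updated.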