- Type: Bug
- Resolution: Fixed
- Priority: Major - P3
- Affects Version/s: 3.5.10
- Component/s: Sharding
- Fully Compatible
- Sharding 2018-07-30
At the end of moveChunk, when the orphaned range is being deleted, a deletion failure is reported differently depending on when it is noticed. If the deletion fails immediately, a "warning" is written to the log and moveChunk returns normally. If it fails while moveChunk is waiting for the deletion to complete, moveChunk throws and catches an exception and puts "Severe error occurred while running moveChunk command" in the response to the moveChunk command. Finally, if it fails after moveChunk has returned, the failure may be reported only in the background as a normal log entry, and probably again later, on a manual cleanupOrphans command or when a chunk is moved into the vacated range.
(The code in question has moved to db/s/migration_source_manager.cpp.) Probably what should change is the case where deletion fails immediately: it should be reported as a failure of the command, similarly to the waitForDelete case, even though the chunk has moved successfully, but wrapped in a message explaining that the move succeeded and that the failure concerns only cleanup. The waitForDelete case should report a later deletion failure the same way, and not (as now) as a failure of the whole operation.
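A minimal C++ sketch of the proposed reporting, for illustration only. SimpleStatus, MoveChunkReply, and replyAfterCommittedMove are hypothetical names invented here; they are not the types or functions in db/s/migration_source_manager.cpp, and the message wording is an assumption. The idea is that both the immediate-failure path and the waitForDelete path would go through the same wrapping once the chunk has moved.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for a status value (not the server's Status class).
struct SimpleStatus {
    bool ok;
    std::string reason;  // empty when ok is true
};

// Hypothetical reply shape, used only in this sketch.
struct MoveChunkReply {
    bool commandOk;      // what the caller sees as success or failure of the command
    bool chunkMoved;     // whether the migration itself succeeded
    std::string errmsg;  // empty when commandOk is true
};

// Builds the reply after the chunk has already moved successfully. The same
// wrapping applies whether the cleanup failure was noticed immediately or
// while waiting for the deletion to complete.
MoveChunkReply replyAfterCommittedMove(const SimpleStatus& cleanup) {
    if (cleanup.ok) {
        return {true, true, ""};
    }
    // A failed cleanup is expected to carry a reason to report.
    assert(!cleanup.reason.empty());
    // Assumed wording: state that the move succeeded and only cleanup failed.
    return {false, true,
            "Chunk move succeeded, but cleanup of the orphaned range failed: " +
                cleanup.reason};
}
```

With this shape, a caller can tell a cleanup-only failure (commandOk false, chunkMoved true) apart from a failure of the whole operation.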