Let's take a look at what happens in the following scenario:
- We start a refresh. We assume that this refresh is able to find the metadata associated with the collection.
- A drop collection is executed while the 1st refresh is running, triggering a 2nd refresh. This 2nd refresh goes through the getCollectionRoutingInfoWithRefresh path.
Let's take a look at the following interleaving:
- the 1st refresh has seen the collection on the config server; let's assume that it is still executing something before reaching this statement.
- then, the 2nd refresh advances the time in store, increasing the forced refresh static atomic by two. However, even though the atomic was increased by two, the new time in store has a forced refresh value of only +1 over the original. The idea is that all previously created ComparableChunkVersions will be older than the newly created one, while all subsequently created ComparableChunkVersions will be newer. To simplify, we will assume that the original value of the atomic was 4: after creating the ComparableChunkVersion the atomic is 6, and internally the new object holds a 5.
- then, the 2nd refresh finds out that there is a refresh ongoing, so it adds itself to the waiter list, with minTimeInStore = 5.
- then 1st refresh resumes its execution, creating a new ComparableChunkVersion with forced refresh value = 6. Note that this refresh didn't see the drop collection.
- Finally, the 1st refresh finishes and fulfills the promises of all pending refreshes whose minTimeInStore is older than the new time. We therefore end up fulfilling the 2nd refresh, because its minTimeInStore = 5 is older than the new time (6). Thus, the 2nd refresh, which was triggered by the drop operation, still returns the old collection.
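The interleaving above can be replayed in a small single-threaded model. This is only a sketch: the names `forcedRefreshCounter`, `Version`, `advanceTimeInStore`, `makeVersionAtCompletion` and `fulfillsWaiter` are hypothetical, and ComparableChunkVersion is reduced to its forced refresh value alone.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the forced refresh static atomic.
std::atomic<std::uint64_t> forcedRefreshCounter{4};

// Simplified stand-in for ComparableChunkVersion: forced refresh value only.
struct Version {
    std::uint64_t forcedRefresh;
};

// Models "advance the time in store": the atomic grows by two,
// while the returned version holds the original value + 1.
Version advanceTimeInStore() {
    const std::uint64_t before = forcedRefreshCounter.fetch_add(2);
    return Version{before + 1};
}

// Models the 1st refresh tagging its result with whatever the atomic
// holds at the moment the version is created (after the loader query).
Version makeVersionAtCompletion() {
    return Version{forcedRefreshCounter.load()};
}

// A waiter is fulfilled when its minTimeInStore is older than the new time.
bool fulfillsWaiter(const Version& minTimeInStore, const Version& newTime) {
    return minTimeInStore.forcedRefresh < newTime.forcedRefresh;
}

// Replays the steps in order. Returns true when the 2nd refresh is
// fulfilled with the (stale) result of the 1st refresh.
bool replayInterleaving() {
    forcedRefreshCounter.store(4);

    // 1st refresh: has already read the collection (pre-drop) and is paused.

    // Drop happens; the 2nd refresh advances the time in store and waits.
    const Version minTimeInStore = advanceTimeInStore();
    assert(minTimeInStore.forcedRefresh == 5);

    // 1st refresh resumes and creates its version from the current atomic.
    const Version newTime = makeVersionAtCompletion();
    assert(newTime.forcedRefresh == 6);

    return fulfillsWaiter(minTimeInStore, newTime);
}
```

In this model `replayInterleaving()` returns true: the waiter with minTimeInStore = 5 is handed data that was read before the drop.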
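The proposed solution from tommaso.tocci is to capture the values of the atomics associated with the ComparableChunkVersion before querying the loader, and to use those captured values to create the new ComparableChunkVersion. In the scenario above, the 1st refresh would then produce a version with forced refresh value 4 instead of 6, which is older than the 2nd refresh's minTimeInStore = 5, so the 2nd refresh would not be fulfilled with the stale result.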
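A minimal sketch of the proposed ordering, using the same kind of simplified single-threaded model. All names here (`forcedRefreshCounter`, `Version`, `advanceTimeInStore`, `fulfillsWaiter`, `replayWithEarlyCapture`) are hypothetical, and only the forced refresh value is modelled, not the other atomics the real ComparableChunkVersion may use.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the forced refresh static atomic.
std::atomic<std::uint64_t> forcedRefreshCounter{4};

// Simplified stand-in for ComparableChunkVersion: forced refresh value only.
struct Version {
    std::uint64_t forcedRefresh;
};

// Models "advance the time in store": the atomic grows by two,
// while the returned version holds the original value + 1.
Version advanceTimeInStore() {
    const std::uint64_t before = forcedRefreshCounter.fetch_add(2);
    return Version{before + 1};
}

// A waiter is fulfilled when its minTimeInStore is older than the new time.
bool fulfillsWaiter(const Version& minTimeInStore, const Version& newTime) {
    return minTimeInStore.forcedRefresh < newTime.forcedRefresh;
}

// Same interleaving, but the 1st refresh captures the atomic BEFORE it
// queries the loader. Returns true if the waiter would still be fulfilled.
bool replayWithEarlyCapture() {
    forcedRefreshCounter.store(4);

    // 1st refresh: capture first, then query the loader (sees pre-drop data).
    const std::uint64_t captured = forcedRefreshCounter.load();

    // Drop happens; the 2nd refresh advances the time in store and waits.
    const Version minTimeInStore = advanceTimeInStore();
    assert(minTimeInStore.forcedRefresh == 5);

    // 1st refresh resumes and builds its version from the captured value.
    const Version newTime{captured};

    return fulfillsWaiter(minTimeInStore, newTime);
}
```

Here `replayWithEarlyCapture()` returns false: the result of the 1st refresh carries 4, which is not newer than the waiter's 5, so in this model the 2nd refresh stays pending until a later lookup produces a newer version.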