Consider improving critical section timeout logic

    • Cluster Scalability

      The current coordinator logic treats the following steps as part of the critical section:

      1. Tell the donor to block writes.
      2. Wait for all recipients to transition to strict consistency.
      3. Tell all participants to commit.
      4. The donor-only participants drop the collection.
      5. The donor/recipient participants rename the collection to the target collection.
      6. All participants do some cleanup to remove/drop documents/collections that are related to resharding.
      7. The coordinator waits for cleanup to finish.
      8. The coordinator does its own cleanup.
      9. The coordinator cancels the critical section timeout.

      In reality, the critical section is already over the moment the recipients finish renaming the collection. We could have the participants report to the coordinator that they are "done" earlier, so that the coordinator ends the critical section before it starts its own cleanup, rather than leaving the timeout armed through steps 6 to 8.
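
A minimal sketch of the proposed ordering, written in Python purely for illustration. Everything here is an assumption made for the example: the `Coordinator` class, the `on_critical_work_done` and `on_cleanup_done` callbacks, and the shard names are hypothetical and do not correspond to the actual resharding coordinator code. The idea it tries to show is that the timeout is cancelled as soon as every participant has reported its drop/rename as finished, and that cleanup happens afterwards, outside the timed window.

```python
from dataclasses import dataclass, field


@dataclass
class Coordinator:
    """Hypothetical model of the coordinator; not the real implementation."""

    participants: frozenset
    critical_done: set = field(default_factory=set)
    cleaned_up: set = field(default_factory=set)
    timeout_active: bool = True  # critical section timeout is armed at step 1
    log: list = field(default_factory=list)

    def on_critical_work_done(self, participant: str) -> None:
        # Participant reports that its drop (donor-only) or rename
        # (donor/recipient) has finished, i.e. steps 4 and 5.
        self.critical_done.add(participant)
        if self.timeout_active and self.critical_done == self.participants:
            # Proposed change: end the critical section here instead of
            # waiting for participant and coordinator cleanup.
            self.timeout_active = False
            self.log.append("cancel critical section timeout")

    def on_cleanup_done(self, participant: str) -> None:
        # Participant reports that its resharding cleanup (step 6) finished.
        self.cleaned_up.add(participant)
        if self.cleaned_up == self.participants:
            # Coordinator cleanup (step 8) now runs with no timeout armed.
            self.log.append("coordinator cleanup")


coordinator = Coordinator(participants=frozenset({"shardA", "shardB"}))
coordinator.on_critical_work_done("shardA")
coordinator.on_critical_work_done("shardB")  # last report cancels the timeout
coordinator.on_cleanup_done("shardA")
coordinator.on_cleanup_done("shardB")
print(coordinator.log)
```

In this model a slow cleanup can no longer trip the critical section timeout, because the timeout is cancelled before any cleanup is reported. Whether the real participants can cheaply send a separate "done" signal ahead of cleanup is the open question of this ticket.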

            Assignee:
            Unassigned
            Reporter:
            Randolph Tan