  1. Core Server
  2. SERVER-39355

Collection drops can block the server for long periods

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Done
    • Affects Version/s: 3.4.14
    • Fix Version/s: None
    • Component/s: Storage
    • Labels:
      None
    • Operating System:
      ALL
    • Sprint:
      Storage Engines 2019-02-25
    • Story Points:
      1

      Description

      Hi, sorry but we've just had another occurrence today (still running 3.4.13), so there's still an issue here. We've modified our collection-dropping code to sleep 10 sec between each deletion (to give mongo some time to recover after the "short" global lock and not kill the platform), but unfortunately this wasn't enough and it killed the global performance.

      After investigation I found that this was caused by some collection deletions. I tried to upload the diagnostic.data, but the portal specified earlier doesn't accept files any more. I can upload it if you provide another portal.

      Here is the log from the drop queries: mongo_drop_log.txt. It shows that the drops are spaced by 10 sec (plus the drop duration) and that each drop takes A LOT of time (all these collections were empty or had 5 records at most). They had some indexes though, which are not shown here but probably had to be destroyed at the same time. I don't know if it's a checkpoint global lock issue again, but it's definitely still not possible to drop collections in a big 3.4.13 mongo without killing it. For the record we have ~40k namespaces; this has not changed much since the db.stats I reported above.
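
      For illustration, a throttled drop loop of the kind described above could look like the following sketch. It is not the code actually used on the affected platform. It assumes a database handle that exposes drop_collection(name), as a PyMongo Database object does; the function name, the pause_seconds parameter and the injectable sleep argument are hypothetical and exist only for this example.

```python
import time


def drop_collections_throttled(db, names, pause_seconds=10.0, sleep=time.sleep):
    """Drop each named collection, pausing between consecutive drops.

    `db` is assumed to expose drop_collection(name), as a PyMongo
    Database does. Returns a list of (name, seconds_taken) tuples so
    that slow drops can be spotted afterwards.
    """
    timings = []
    for index, name in enumerate(names):
        if index > 0:
            # Pause between drops to give the server time to recover.
            sleep(pause_seconds)
        started = time.monotonic()
        db.drop_collection(name)
        timings.append((name, time.monotonic() - started))
    return timings
```

      Recording the duration of each drop alongside the pause makes it easy to compare the client-side timings with the server log.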

      And before you say this is probably fixed in a more recent version, we'll need better proof than last time considering the high risk of upgrading...


              People

              • Votes:
                0
                Watchers:
                9
