Java Driver / JAVA-1125

Improve GridFS.remove(query) method

Details

    • Type: Improvement
    • Resolution: Won't Fix
    • Priority: Major - P3
    • None
    • 2.12.0
    • GridFS
    • None

    Description

      remove(query) on GridFS is currently performed by:

      • first issuing a query on the bucket's files collection
      • then, for each matching file, removing it and removing its associated chunks by files_id

      First, I think this kind of linked removal should be handled entirely by the server rather than by the client, as this is a server feature.
      Moreover, the current implementation can result in thousands of separate requests, and doesn't ensure consistency anyway.

      Thus the method's efficiency could be improved by performing at most two requests: one on the "files" collection using the query, and the other on the "chunks" collection using an $in clause on the previously selected files_id values.
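
      The difference in request count between the two strategies can be sketched with a small in-memory model. This is only an illustration under stated assumptions: the class GridFsRemoveSketch and everything in it are hypothetical stand-ins, not the driver's API, and each simulated call counts as one round trip. The sketch also counts the initial id selection as a request, so the batched variant reports three round trips (one select plus the two removes described above).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical in-memory stand-in for a GridFS bucket (not the driver's API).
class GridFsRemoveSketch {
    // Minimal stand-in for a document in the "files" collection.
    static final class FileDoc {
        final int id;
        FileDoc(int id) { this.id = id; }
    }

    // Minimal stand-in for a document in the "chunks" collection.
    static final class ChunkDoc {
        final int filesId;
        ChunkDoc(int filesId) { this.filesId = filesId; }
    }

    final List<FileDoc> files = new ArrayList<>();
    final List<ChunkDoc> chunks = new ArrayList<>();
    int requests = 0; // number of simulated round trips

    // Fill the bucket with fileCount files of chunksPerFile chunks each.
    GridFsRemoveSketch(int fileCount, int chunksPerFile) {
        for (int id = 0; id < fileCount; id++) {
            files.add(new FileDoc(id));
            for (int n = 0; n < chunksPerFile; n++) {
                chunks.add(new ChunkDoc(id));
            }
        }
    }

    // Simulated select on "files": one request, returns the matching ids.
    List<Integer> findIds(Predicate<FileDoc> query) {
        requests++;
        List<Integer> ids = new ArrayList<>();
        for (FileDoc f : files) {
            if (query.test(f)) ids.add(f.id);
        }
        return ids;
    }

    // Simulated remove on "files": one request.
    void removeFiles(Predicate<FileDoc> query) {
        requests++;
        files.removeIf(query);
    }

    // Simulated remove on "chunks": one request.
    void removeChunks(Predicate<ChunkDoc> query) {
        requests++;
        chunks.removeIf(query);
    }

    // Per-file pattern: one select, then two removes for every matching file.
    void removePerFile(Predicate<FileDoc> query) {
        for (int id : findIds(query)) {
            removeFiles(f -> f.id == id);
            removeChunks(c -> c.filesId == id);
        }
    }

    // Batched pattern: one select, then one remove per collection.
    void removeBatched(Predicate<FileDoc> query) {
        Set<Integer> ids = new HashSet<>(findIds(query));
        removeChunks(c -> ids.contains(c.filesId)); // models an $in on files_id
        removeFiles(query);
    }

    public static void main(String[] args) {
        Predicate<FileDoc> evenIds = f -> f.id % 2 == 0;

        GridFsRemoveSketch perFile = new GridFsRemoveSketch(1000, 4);
        perFile.removePerFile(evenIds);

        GridFsRemoveSketch batched = new GridFsRemoveSketch(1000, 4);
        batched.removeBatched(evenIds);

        System.out.println("per-file requests: " + perFile.requests);
        System.out.println("batched requests:  " + batched.requests);
    }
}
```

      With 500 matching files, the per-file variant issues 1 + 2 × 500 = 1001 simulated requests while the batched variant issues 3, and both leave the same documents behind. Against a real server, a very large id list might have to be split into several batches, which this sketch does not model.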

      Tests on a bucket of 50K files showed that the time to remove 31K files dropped from 385000 ms to 825 ms (roughly a 467x improvement).

      PR #171 has been opened on GitHub: https://github.com/mongodb/mongo-java-driver/pull/171

          People

            Assignee: Unassigned
            Reporter: finalspy PETIT Yann
            Votes: 1
            Watchers: 3
