  1. Spark Connector
  2. SPARK-64

Sampling then projecting in the MongoSamplePartitioner is slow

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: Performance
    • Labels: None

    Description

      For example, with the MovieLens dataset (~1 million documents), the order of the two pipeline stages makes a large difference:

      Pipeline: sample, then project _id: 76120 ms
      Pipeline: project _id, then sample: 1124 ms
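
      The two stage orderings compared above can be sketched as plain aggregation pipelines. This is only an illustration, not the connector's actual partitioner code: the helper names and the sample size are hypothetical, and the only MongoDB features assumed are the standard `$sample` and `$project` stages.

```python
from typing import Any, Dict, List

# An aggregation pipeline is an ordered list of stage documents.
Pipeline = List[Dict[str, Any]]


def sample_then_project(sample_size: int) -> Pipeline:
    """Slow ordering from the report: sample full documents, then keep only _id."""
    return [
        {"$sample": {"size": sample_size}},
        {"$project": {"_id": 1}},
    ]


def project_then_sample(sample_size: int) -> Pipeline:
    """Fast ordering from the report: reduce documents to _id first, then sample."""
    return [
        {"$project": {"_id": 1}},
        {"$sample": {"size": sample_size}},
    ]


if __name__ == "__main__":
    # Hypothetical sample size, chosen only for illustration.
    for build in (sample_then_project, project_then_sample):
        stages = [next(iter(stage)) for stage in build(100)]
        print(build.__name__, "->", stages)
```

      Either list could be passed to a driver's aggregate call (for example PyMongo's `collection.aggregate(pipeline)`) to time the two orderings against a real collection; the timings quoted above come from the original report, not from this sketch.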

            People

              Assignee: ross@mongodb.com Ross Lawley
              Reporter: ross@mongodb.com Ross Lawley