Read partitioner does not fall back to singlePartitioner when the collection is empty


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Unknown
    • Fix Version/s: 10.6.1, 11.0.1
    • Affects Version/s: 10.6.0, 11.0.0
    • Component/s: Partitioners
    • Java Drivers
    • Not Needed

       

      com.mongodb.spark.sql.connector.exceptions.MongoSparkException: Partitioning failed. Command execution failed on MongoDB server with error 28747 (Location28747): 'size argument to $sample must be a positive integer' on server localhost:27017. The full response is {"ok": 0.0, "errmsg": "size argument to $sample must be a positive integer", "code": 28747, "codeName": "Location28747"}

      We're having an issue where the read partitioner (we tested AutoBucket and Sample) does not fall back to the single partitioner when we try to read an empty collection.
      While debugging, we found that the following if statement evaluates to false: 

       if (numDocumentsPerPartition == 0 || numDocumentsPerPartition >= count)

       (https://github.com/mongodb/mongo-spark/blob/main/src/main/java/com/mongodb/spark/sql/connector/read/partitioner/AutoBucketPartitioner.java#L179C5-L179C78)

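      In our case, count = 0 and numDocumentsPerPartition is NaN. Every comparison with NaN is false, so neither side of the condition holds and the fallback is skipped.
      Is there a reason why this check does not include a condition like count == 0?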
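
      The behaviour can be reproduced in isolation. The following is a minimal sketch, not the connector's code: the class and method names are hypothetical, the first method only mirrors the condition quoted above, and the second shows one possible shape of the suggested count == 0 guard. How numDocumentsPerPartition becomes NaN inside the partitioner is not shown here; the value is passed in directly.

```java
// Hypothetical stand-alone reproduction; names are illustrative only.
final class EmptyCollectionCheck {

    // Mirrors the condition quoted from AutoBucketPartitioner.
    static boolean originalCheck(double numDocumentsPerPartition, long count) {
        return numDocumentsPerPartition == 0 || numDocumentsPerPartition >= count;
    }

    // Same condition with an explicit empty-collection guard (an assumption,
    // following the count == 0 suggestion in this report).
    static boolean guardedCheck(double numDocumentsPerPartition, long count) {
        return count == 0 || originalCheck(numDocumentsPerPartition, count);
    }

    public static void main(String[] args) {
        double nan = Double.NaN;
        long emptyCount = 0L;

        // NaN == 0 and NaN >= 0 are both false, so no fallback is selected.
        System.out.println(originalCheck(nan, emptyCount)); // prints false

        // The guard short-circuits before any NaN comparison takes place.
        System.out.println(guardedCheck(nan, emptyCount));  // prints true
    }
}
```

      With the guard in place, an empty collection selects the single-partition path before any NaN comparison is made; results for non-empty collections are unchanged.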

       

       

            Assignee: Ross Lawley
            Reporter: Fabien LE BEC
            Almas Abdrazak
            Votes: 0
            Watchers: 3