Uploaded image for project: 'Spark Connector'
  1. Spark Connector
  2. SPARK-217

ObjectId UDF does not cause filter pushdown

    • Type: Icon: Improvement Improvement
    • Resolution: Won't Fix
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None

      If I use Mongo Spark builtin UDF to use ObjectId in a where clause, the condition does not get pushed down to RDD filters:

      Example query: 

      SELECT * FROM customers WHERE _id = ObjectId('4f398939e4b0d1716f5a7e2f')

      If I check it with explain:

      EXPLAIN SELECT * FROM customers WHERE _id = ObjectId('4f398939e4b0d1716f5a7e2f')

      I see following along the lines:

      PushedFilters: [IsNotNull(_id)]

      This is because the UDF does not implement foldable and therefore cannot be statically calculated and pushed to the PushedFilters. ScalaUDF extends Expression which has the method foldable which explains it a bit.

            Assignee:
            Unassigned Unassigned
            Reporter:
            hkroger Hannu Kröger
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated:
              Resolved: