[SERVER-53709] MongoDB Time-Spatial Query is slower with 2dsphere Index Created: 12/Jan/21  Updated: 15/Jan/21  Resolved: 15/Jan/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.4.3
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Stavros Koureas Assignee: Edwin Zhou
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: HTML File QUERY     HTML File QUERY2     HTML File QUERY3    
Issue Links:
Duplicate
duplicates SERVER-13065 Consider a collection scan even if in... Backlog
Participants:

 Description   

Recently, I started to investigate the performance of MongoDB with AIS Data. I used a collection with 19m documents with proper field types as described in the definition. I also created a new geoloc field with type: (Point) from coordinates (lon,lat) in this same collection.

The query under investigation is:

db.nari_dynamic.explain("executionStats").aggregate(
  [
    {
      $match: {
        geoloc: {
          $geoWithin: {
            $geometry: {
              type: "Polygon",
              coordinates: [
                [
                  [-5.0, 45.0],
                  [+0.0, 45.0],
                  [+0.0, 50.0],
                  [-5.0, 50.0],
                  [-5.0, 45.0],
                ],
              ],
            },
          },
        },
      },
    },
 
    {
      $group: {
        _id: "$sourcemmsi",
        PointCount: { $sum: 1 },
 
        MinDatePoint: { $min: { date: "$t3" } },
        MaxDatePoint: { $max: { date: "$t3" } },
      },
    },
    { $sort: { _id: 1 } },
    { $limit: 100 },
    { $project: { _id: 1, PointCount: 1, MinDatePoint: 1, MaxDatePoint: 1 } },
  ],
  { explain: true }
);

During investigation and testing I found the following:

{{}}

  1. Without any index: 94s
  2. With geoloc-2dsphere index: 280s

Of course, I understand that is more complex as the query has a grouping function, but the idea is that usually, we will get something quicker and not slower with the index unless the index causes a different sorting inside the engine like near does.

{{}}



 Comments   
Comment by Edwin Zhou [ 15/Jan/21 ]

Hi koureasstavros@gmail.com,

Thanks for your ticket on slow queries. After some investigation, it appears that the issue lies in the non-selective query. With a 2dsphere index specified, the query returns nearly all of the documents––18.3 million, or 95% of the documents are returned. The both plans will scan through 18.3 million documents, but the indexed plan incurs the additional cost of scanning through another 18.3 million keys.

executionStats COLLSCAN

{
  "queryPlanner": {
    "plannerVersion": 1,
    "namespace": "test.nari_dynamic",
    "indexFilterSet": false,
    "parsedQuery": {
      "geoloc": {
        "$geoWithin": {
          "$geometry": {
            "type": "Polygon",
            "coordinates": [
              [
                [-5, 45],
                [0, 45],
                [0, 50],
                [-5, 50],
                [-5, 45]
              ]
            ]
          }
        }
      }
    },
    "optimizedPipeline": true,
    "winningPlan": {
      "stage": "COLLSCAN",
      "filter": {
        "geoloc": {
          "$geoWithin": {
            "$geometry": {
              "type": "Polygon",
              "coordinates": [
                [
                  [-5, 45],
                  [0, 45],
                  [0, 50],
                  [-5, 50],
                  [-5, 45]
                ]
              ]
            }
          }
        }
      },
      "direction": "forward"
    },
    "rejectedPlans": []
  },
  "executionStats": {
    "executionSuccess": true,
    "nReturned": 18254550,
    "executionTimeMillis": 36339,
    "totalKeysExamined": 0,
    "totalDocsExamined": 19035630,
    "executionStages": {
      "stage": "COLLSCAN",
      "filter": {
        "geoloc": {
          "$geoWithin": {
            "$geometry": {
              "type": "Polygon",
              "coordinates": [
                [
                  [-5, 45],
                  [0, 45],
                  [0, 50],
                  [-5, 50],
                  [-5, 45]
                ]
              ]
            }
          }
        }
      },
      "nReturned": 18254550,
      "executionTimeMillisEstimate": 3961,
      "works": 19035632,
      "advanced": 18254550,
      "needTime": 781081,
      "needYield": 0,
      "saveState": 19035,
      "restoreState": 19035,
      "isEOF": 1,
      "direction": "forward",
      "docsExamined": 19035630
    }
  },
  "serverInfo": {
    "host": "Edwins-MBP.cable.rcn.com",
    "port": 27017,
    "version": "4.4.1",
    "gitVersion": "ad91a93a5a31e175f5cbf8c69561e788bbc55ce1"
  },
  "ok": 1
}

executionStats IXSCAN

{
  "queryPlanner": {
    "plannerVersion": 1,
    "namespace": "test.nari_dynamic",
    "indexFilterSet": false,
    "parsedQuery": {
      "geoloc": {
        "$geoWithin": {
          "$geometry": {
            "type": "Polygon",
            "coordinates": [
              [
                [-5, 45],
                [0, 45],
                [0, 50],
                [-5, 50],
                [-5, 45]
              ]
            ]
          }
        }
      }
    },
    "optimizedPipeline": true,
    "winningPlan": {
      "stage": "FETCH",
      "filter": {
        "geoloc": {
          "$geoWithin": {
            "$geometry": {
              "type": "Polygon",
              "coordinates": [
                [
                  [-5, 45],
                  [0, 45],
                  [0, 50],
                  [-5, 50],
                  [-5, 45]
                ]
              ]
            }
          }
        }
      },
      "inputStage": {
        "stage": "IXSCAN",
        "keyPattern": {
          "geoloc": "2dsphere"
        },
        "indexName": "geoloc_2dsphere",
        "isMultiKey": false,
        "multiKeyPaths": {
          "geoloc": []
        },
        "isUnique": false,
        "isSparse": false,
        "isPartial": false,
        "indexVersion": 2,
        "direction": "forward",
        "indexBounds": {
          "geoloc": [
            "[936748722493063168, 936748722493063168]",
            "[954763121002545152, 954763121002545152]",
            "[959266720629915648, 959266720629915648]",
            "[960392620536758272, 960392620536758272]",
            "[960674095513468928, 960674095513468928]",
            "[960744464257646592, 960744464257646592]",
            "[960762056443691008, 960762056443691008]",
            "[960766454490202112, 960766454490202112]",
            "[960767554001829888, 960767554001829888]",
            "[960767828879736832, 960767828879736832]",
            "[960767897599213568, 960767897599213568]",
            "[960767914779082752, 960767914779082752]",
            "[960767919074050048, 960767919074050048]",
            "[960767920147791872, 960767920147791872]",
            "[960767920416227328, 960767920416227328]",
            "[960767920483336192, 960767920483336192]",
            "[960767920500113408, 960767920500113408]",
            "[960767920504307712, 960767920504307712]",
            "[960767920505356288, 960767920505356288]",
            "[960767920505618432, 960767920505618432]",
            "[960767920505683968, 960767920505683968]",
            "[960767920505683969, 960767920505716735]",
            "[1345075088707977217, 1345075088708009983]",
            "[1345075088708009984, 1345075088708009984]",
            "[1345075088708075520, 1345075088708075520]",
            "[1345075088708337664, 1345075088708337664]",
            "[1345075088709386240, 1345075088709386240]",
            "[1345075088713580544, 1345075088713580544]",
            "[1345075088730357760, 1345075088730357760]",
            "[1345075088797466624, 1345075088797466624]",
            "[1345075089065902080, 1345075089065902080]",
            "[1345075090139643904, 1345075090139643904]",
            "[1345075094434611200, 1345075094434611200]",
            "[1345075111614480384, 1345075111614480384]",
            "[1345075180333957120, 1345075180333957120]",
            "[1345075455211864064, 1345075455211864064]",
            "[1345076554723491840, 1345076554723491840]",
            "[1345080952770002944, 1345080952770002944]",
            "[1345098544956047360, 1345098544956047360]",
            "[1345168913700225024, 1345168913700225024]",
            "[1345450388676935680, 1345450388676935680]",
            "[1346576288583778304, 1346576288583778304]",
            "[1351079888211148800, 1351079888211148800]",
            "[1369094286720630784, 1369094286720630784]",
            "[5116089176692883456, 5116089176692883456]",
            "[5170132372221329408, 5170132372221329408]",
            "[5179139571476070401, 5179702521429491711]",
            "[5179702521429491713, 5180265471382913023]",
            "[5180265471382913024, 5180265471382913024]",
            "[5183643171103440896, 5183643171103440896]",
            "[5187020870823968768, 5187020870823968768]",
            "[5187020870823968769, 5187583820777390079]",
            "[5187583820777390081, 5188146770730811391]",
            "[5188146770730811393, 5197153969985552383]",
            "[5206161169240293376, 5206161169240293376]",
            "[5218264593238851584, 5218264593238851584]",
            "[5218264593238851585, 5218405330727206911]",
            "[5218546068215562240, 5218546068215562240]",
            "[5218546068215562241, 5219109018168983551]",
            "[5219671968122404864, 5219671968122404864]",
            "[5220234918075826177, 5220797868029247487]",
            "[5220797868029247488, 5220797868029247488]",
            "[5220938605517602817, 5221079343005958143]",
            "[5221079343005958144, 5221079343005958144]",
            "[5260204364768739328, 5260204364768739328]"
          ]
        }
      }
    },
    "rejectedPlans": []
  },
  "executionStats": {
    "executionSuccess": true,
    "nReturned": 18254550,
    "executionTimeMillis": 188864,
    "totalKeysExamined": 18854121,
    "totalDocsExamined": 18854117,
    "executionStages": {
      "stage": "FETCH",
      "filter": {
        "geoloc": {
          "$geoWithin": {
            "$geometry": {
              "type": "Polygon",
              "coordinates": [
                [
                  [-5, 45],
                  [0, 45],
                  [0, 50],
                  [-5, 50],
                  [-5, 45]
                ]
              ]
            }
          }
        }
      },
      "nReturned": 18254550,
      "executionTimeMillisEstimate": 118035,
      "works": 18854122,
      "advanced": 18254550,
      "needTime": 599571,
      "needYield": 0,
      "saveState": 20854,
      "restoreState": 20854,
      "isEOF": 1,
      "docsExamined": 18854117,
      "alreadyHasObj": 0,
      "inputStage": {
        "stage": "IXSCAN",
        "nReturned": 18854117,
        "executionTimeMillisEstimate": 7374,
        "works": 18854122,
        "advanced": 18854117,
        "needTime": 4,
        "needYield": 0,
        "saveState": 20854,
        "restoreState": 20854,
        "isEOF": 1,
        "keyPattern": {
          "geoloc": "2dsphere"
        },
        "indexName": "geoloc_2dsphere",
        "isMultiKey": false,
        "multiKeyPaths": {
          "geoloc": []
        },
        "isUnique": false,
        "isSparse": false,
        "isPartial": false,
        "indexVersion": 2,
        "direction": "forward",
        "indexBounds": {
          "geoloc": [
            "[936748722493063168, 936748722493063168]",
            "[954763121002545152, 954763121002545152]",
            "[959266720629915648, 959266720629915648]",
            "[960392620536758272, 960392620536758272]",
            "[960674095513468928, 960674095513468928]",
            "[960744464257646592, 960744464257646592]",
            "[960762056443691008, 960762056443691008]",
            "[960766454490202112, 960766454490202112]",
            "[960767554001829888, 960767554001829888]",
            "[960767828879736832, 960767828879736832]",
            "[960767897599213568, 960767897599213568]",
            "[960767914779082752, 960767914779082752]",
            "[960767919074050048, 960767919074050048]",
            "[960767920147791872, 960767920147791872]",
            "[960767920416227328, 960767920416227328]",
            "[960767920483336192, 960767920483336192]",
            "[960767920500113408, 960767920500113408]",
            "[960767920504307712, 960767920504307712]",
            "[960767920505356288, 960767920505356288]",
            "[960767920505618432, 960767920505618432]",
            "[960767920505683968, 960767920505683968]",
            "[960767920505683969, 960767920505716735]",
            "[1345075088707977217, 1345075088708009983]",
            "[1345075088708009984, 1345075088708009984]",
            "[1345075088708075520, 1345075088708075520]",
            "[1345075088708337664, 1345075088708337664]",
            "[1345075088709386240, 1345075088709386240]",
            "[1345075088713580544, 1345075088713580544]",
            "[1345075088730357760, 1345075088730357760]",
            "[1345075088797466624, 1345075088797466624]",
            "[1345075089065902080, 1345075089065902080]",
            "[1345075090139643904, 1345075090139643904]",
            "[1345075094434611200, 1345075094434611200]",
            "[1345075111614480384, 1345075111614480384]",
            "[1345075180333957120, 1345075180333957120]",
            "[1345075455211864064, 1345075455211864064]",
            "[1345076554723491840, 1345076554723491840]",
            "[1345080952770002944, 1345080952770002944]",
            "[1345098544956047360, 1345098544956047360]",
            "[1345168913700225024, 1345168913700225024]",
            "[1345450388676935680, 1345450388676935680]",
            "[1346576288583778304, 1346576288583778304]",
            "[1351079888211148800, 1351079888211148800]",
            "[1369094286720630784, 1369094286720630784]",
            "[5116089176692883456, 5116089176692883456]",
            "[5170132372221329408, 5170132372221329408]",
            "[5179139571476070401, 5179702521429491711]",
            "[5179702521429491713, 5180265471382913023]",
            "[5180265471382913024, 5180265471382913024]",
            "[5183643171103440896, 5183643171103440896]",
            "[5187020870823968768, 5187020870823968768]",
            "[5187020870823968769, 5187583820777390079]",
            "[5187583820777390081, 5188146770730811391]",
            "[5188146770730811393, 5197153969985552383]",
            "[5206161169240293376, 5206161169240293376]",
            "[5218264593238851584, 5218264593238851584]",
            "[5218264593238851585, 5218405330727206911]",
            "[5218546068215562240, 5218546068215562240]",
            "[5218546068215562241, 5219109018168983551]",
            "[5219671968122404864, 5219671968122404864]",
            "[5220234918075826177, 5220797868029247487]",
            "[5220797868029247488, 5220797868029247488]",
            "[5220938605517602817, 5221079343005958143]",
            "[5221079343005958144, 5221079343005958144]",
            "[5260204364768739328, 5260204364768739328]"
          ]
        },
        "keysExamined": 18854121,
        "seeks": 5,
        "dupsTested": 0,
        "dupsDropped": 0
      }
    }
  },
  "serverInfo": {
    "host": "Edwins-MBP.cable.rcn.com",
    "port": 27017,
    "version": "4.4.1",
    "gitVersion": "ad91a93a5a31e175f5cbf8c69561e788bbc55ce1"
  },
  "ok": 1
}

However, we agree that the planning system should be able to recognize when a collection scan is faster than performing an index scan, and should select the faster plan. I'll close this as a duplicate of SERVER-13065 which details this improvement.

Best,
Edwin

Comment by Stavros Koureas [ 14/Jan/21 ]

In addition to the original issue, to simplify more the things., also this query has the same behavior (only match):

([
  {
    $match: {
      geoloc: {
        $geoWithin: {
          $geometry: {
            type: "Polygon",
            coordinates: [
              [
                [-5.0, 45.0],
                [+0.0, 45.0],
                [+0.0, 50.0],
                [-5.0, 50.0],
                [-5.0, 45.0],
              ],
            ],
          },
        },
      },
    },
  },
]);

Comment by Stavros Koureas [ 12/Jan/21 ]

The mongodb version is:

mongod -version 4.4.1

 

The used dataset is here:

https://zenodo.org/record/1167595#.YAB_jegzaUk

 

The pack file is:

[P1] AIS Data.zip

 

The csv used for the collection is:

nari_dynamic.csv

Comment by Stavros Koureas [ 12/Jan/21 ]

In addition to the original issue, to simplify things., also this query has the same behavior (no time dependency):

([
  {
    $match: {
      geoloc: {
        $geoWithin: {
          $geometry: {
            type: "Polygon",
            coordinates: [
              [
                [-5.0, 45.0],
                [+0.0, 45.0],
                [+0.0, 50.0],
                [-5.0, 50.0],
                [-5.0, 45.0],
              ],
            ],
          },
        },
      },
    },
  },
  { $group: { _id: "$sourcemmsi" } },
  { $limit: 100 },
  { $project: { _id: 1 } },
]);

Generated at Thu Feb 08 05:31:39 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.