  1. Core Server
  2. SERVER-85576

$push accumulator memory usage

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Atlas Streams

      Restoring from a large state checkpoint fails, apparently because of this error:

       

      (ExceededMemoryLimit) $push used too much memory and cannot spill to disk. Memory limit: 104857600 bytes
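
      For scale, the limit in the error message converts to a round number of MiB. A quick check (the variable names are illustrative only):

```javascript
// Convert the limit reported in the error message into MiB.
const limitBytes = 104857600;      // value copied from the error message
const bytesPerMiB = 1024 * 1024;   // 1 MiB = 1,048,576 bytes
const limitMiB = limitBytes / bytesPerMiB;
console.log(limitMiB); // 100
```

      So the $push limit being hit is 100 MiB, far smaller than the ~5.8GB checkpoint.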

      The full logs can be obtained from this Splunk query:

      link title

       

      This stream processor (SP) first created a largish checkpoint (~5.8GB) and was then stopped. When it was later started again, the start failed with the error above. The pipeline that the SP runs is:

       

      const pipeline = [
        {
          $source: {
            connectionName: "Cluster0",
            db: "mk-testdb",
            coll: "inputColl",
            timeField: { $toDate: "$fullDocument.ts" }
          }
        },
        { $replaceRoot: { newRoot: "$fullDocument" } },
        {
          $project: {
            value: { $range: [1, "$idx"] },
            ts: "$ts"
          }
        },
        { $unwind: "$value" },
        {
          $addFields: {
            "customerId": { $mod: ["$value", 50] },
            "max": "$value",
            "idarray0": ["$_id", "$_id", "$_id", "$_$id", "$_id", "$_id"],
            "idarray1": ["$_id", "$_id", "$_id", "$_$id", "$_id", "$_id"],
            "idarray2": ["$_id", "$_id", "$_id", "$_$id", "$_id", "$_id"],
            "idarray3": ["$_id", "$_id", "$_id", "$_$id", "$_id", "$_id"]
          }
        },
        {
          $tumblingWindow: {
            interval: { size: NumberInt(3), unit: "hour" },
            allowedLateness: { size: NumberInt(0), unit: "second" },
            pipeline: [{
              $group: {
                _id: "$customerId",
                // $push keeps every full document in the window for each customerId
                customerDocs: { $push: "$$ROOT" },
                max: { $max: "$max" }
              }
            }]
          }
        },
        { $project: { customerId: "$_id", max: "$max" } },
        {
          $merge: {
            into: { connectionName: "Cluster0", db: "mk-testdb", coll: "outputColl" }
          }
        }
      ];
      

       

      Notably, the failure did not occur when the checkpoint was taken, only when restoring from it.
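
      To illustrate why `$push: "$$ROOT"` grows state quickly, here is a rough model in plain JavaScript. It is only a sketch: `PushAccumulator`, `groupAndPush` and `makeDocs` are hypothetical names, document size is approximated by JSON length, and applying the limit per group is an assumption. It is not how the server accounts for memory, and it does not explain why the checkpoint path and the restore path behave differently.

```javascript
// Illustrative model only: NOT the server's implementation or accounting.
class PushAccumulator {
  constructor(limitBytes) {
    this.limitBytes = limitBytes; // assumed per-accumulator limit
    this.usedBytes = 0;
    this.values = [];
  }

  add(doc) {
    // Assumption: approximate a document's footprint by its JSON length.
    this.usedBytes += JSON.stringify(doc).length;
    if (this.usedBytes > this.limitBytes) {
      throw new Error(
        `$push used too much memory. Memory limit: ${this.limitBytes} bytes`
      );
    }
    this.values.push(doc);
  }
}

// Group documents by customerId and push each whole document,
// mirroring the shape of {$group: {_id: "$customerId", customerDocs: {$push: "$$ROOT"}}}.
function groupAndPush(docs, limitBytes) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.customerId)) {
      groups.set(doc.customerId, new PushAccumulator(limitBytes));
    }
    groups.get(doc.customerId).add(doc);
  }
  return groups;
}

// Build documents loosely shaped like the $addFields output (values 1..n-1).
function makeDocs(n) {
  const docs = [];
  for (let value = 1; value < n; value++) {
    docs.push({
      value,
      customerId: value % 50,
      max: value,
      idarray0: Array(6).fill("id"),
    });
  }
  return docs;
}

const small = groupAndPush(makeDocs(101), 1024 * 1024);
console.log(small.size); // 50
```

      In this model every input document is retained until the window closes, so state grows linearly with the number of documents per customerId; calling `groupAndPush(makeDocs(101), 100)` with a tiny limit throws as soon as a group holds a second document.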

            Assignee: Unassigned
            Reporter: Mayuresh Kulkarni (mayuresh.kulkarni@mongodb.com)
            Votes: 0
            Watchers: 2
