Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-26534

Text search uses excessive memory

    • Query Integration
    • ALL

      As in SERVER-18926 and SERVER-18961 create a collection containing 50 M documents totaling about 4 GB in size with a full-text index, then do a simple search for a single word on the full-text index that returns all the documents.

      Total memory allocated excluding WT cache is roughly the size of the collection. The top four allocating stacks, accounting for most of the excess:

      heapProfile stack44: { 0: "tc_malloc", 1: "mongo::mongoMalloc", 2: "mongo::BSONObj::copy", 3: "mongo::BSONObj::getOwned", 4: "mongo::WorkingSetMember::makeObjOwnedIfNeeded", 5: "mongo::TextOrStage::addTerm", 6: "mongo::TextOrStage::readFromChildren", 7: "mongo::TextOrStage::work", 8: "mongo::TextMatchStage::work", 9: "mongo::TextStage::work", 10: "mongo::PlanExecutor::getNextImpl", 11: "mongo::PlanExecutor::getNext", 12: "mongo::FindCmd::run", 13: "mongo::Command::run", 14: "mongo::Command::execCommand", 15: "mongo::runCommands", 16: "mongo::assembleResponse", 17: "mongo::MyMessageHandler::process", 18: "mongo::PortMessageServer::handleIncomingMsg", 19: "0x7f89ec7466aa", 20: "clone" }
      heapProfile stack41: { 0: "tc_new", 1: "mongo::WorkingSet::allocate", 2: "mongo::IndexScan::work", 3: "mongo::TextOrStage::readFromChildren", 4: "mongo::TextOrStage::work", 5: "mongo::TextMatchStage::work", 6: "mongo::TextStage::work", 7: "mongo::PlanExecutor::getNextImpl", 8: "mongo::PlanExecutor::getNext", 9: "mongo::FindCmd::run", 10: "mongo::Command::run", 11: "mongo::Command::execCommand", 12: "mongo::runCommands", 13: "mongo::assembleResponse", 14: "mongo::MyMessageHandler::process", 15: "mongo::PortMessageServer::handleIncomingMsg", 16: "0x7f89ec7466aa", 17: "clone" }
      heapProfile stack46: { 0: "tc_new", 1: "void std::vector<mongo::IndexKeyDatum, std::allocator<mongo::IndexKeyDatum> >::_M_emplace_back_aux<mongo::IndexKeyDatum>", 2: "mongo::IndexScan::work", 3: "mongo::TextOrStage::readFromChildren", 4: "mongo::TextOrStage::work", 5: "mongo::TextMatchStage::work", 6: "mongo::TextStage::work", 7: "mongo::PlanExecutor::getNextImpl", 8: "mongo::PlanExecutor::getNext", 9: "mongo::FindCmd::run", 10: "mongo::Command::run", 11: "mongo::Command::execCommand", 12: "mongo::runCommands", 13: "mongo::assembleResponse", 14: "mongo::MyMessageHandler::process", 15: "mongo::PortMessageServer::handleIncomingMsg", 16: "0x7f89ec7466aa", 17: "clone" }
      heapProfile stack45: { 0: "tc_new", 1: "mongo::TextOrStage::addTerm", 2: "mongo::TextOrStage::readFromChildren", 3: "mongo::TextOrStage::work", 4: "mongo::TextMatchStage::work", 5: "mongo::TextStage::work", 6: "mongo::PlanExecutor::getNextImpl", 7: "mongo::PlanExecutor::getNext", 8: "mongo::FindCmd::run", 9: "mongo::Command::run", 10: "mongo::Command::execCommand", 11: "mongo::runCommands", 12: "mongo::assembleResponse", 13: "mongo::MyMessageHandler::process", 14: "mongo::PortMessageServer::handleIncomingMsg", 15: "0x7f89ec7466aa", 16: "clone" }
      

      By experiment it appears that the amount of memory used is proportional (possibly roughly equal in size) to the number of documents returned.

        1. Text_search.png
          40 kB
          Linda Qin
        2. text.png
          167 kB
          Bruce Lucas

            Assignee:
            backlog-query-integration [DO NOT USE] Backlog - Query Integration
            Reporter:
            bruce.lucas@mongodb.com Bruce Lucas (Inactive)
            Votes:
            11 Vote for this issue
            Watchers:
            41 Start watching this issue

              Created:
              Updated: