Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-10595

Replicating many array operations is excessively slow

    • Fully Compatible
    • ALL
    • Hide

      Clone https://github.com/nelhage/mongod-tests/ and run push.py

      Observe that long after the 10,000 $push ops have been applied, both secondaries are spinning CPU and are at 100% lock percentage. On my machine, with mongo 2.4.6, it takes around 90s for the secondaries to catch up.

      Show
      Clone https://github.com/nelhage/mongod-tests/ and run push.py Observe that long after the 10,000 $push ops have been applied, both secondaries are spinning CPU and are at 100% lock percentage. On my machine, with mongo 2.4.6, it takes around 90s for the secondaries to catch up.

      The code to apply operations to arrays is excessively inefficient in several ways. In addition to causing performance problems, there are many ops that are fast to apply on the primary, but extremely slow on secondaries, which can result in replicating lagging massively and secondaries becoming useless (spinning at 100% CPU and write lock).

      There are at least two causes for slowness:

      • BSONArrayIteratorSorted performs an unnecessary expensive sort on the array in order to walk it in order.
      • ModSetState::createNewFromMods calls compareDottedFieldNames for every element of the array, even though we already know the keys to be integers in order.

      For the second issue, the issue is exacerbated by the number of allocations involved in compareDottedFieldNames:

      • update_internal allocates two std::string's to pass in, since it has char*'s. One of the std::string's is a concatenation of "root" and "e.fieldName()", even though we know that m->second->m->fieldName must already begin with "root", so we're doing an unnecessary concatenation and re-compare of the prefix.
      • compareDottedFieldNames calles std::string::substr(), which must allocate memory for the substring, to pass to LexNumCmp, even though LexNumCmp accepts a StringData, which could be constructed in O(1) for the relevant component.

      Changing BSONArrayIteratorSorted to just backend to a BSONObjectIterator would fix the unneeded sort.

      Changing compareDottedFieldNames to accept and use StringData would fix most of the allocations, and changing createNewFromMods to compare e.fieldName() and m->second->m->fieldName + root.size() (instead of doing the concatenation) would fix the rest.

            Assignee:
            andrew.morrow@mongodb.com Andrew Morrow (Inactive)
            Reporter:
            nelhage Nelson Elhage
            Votes:
            2 Vote for this issue
            Watchers:
            10 Start watching this issue

              Created:
              Updated:
              Resolved: