SERVER-83449

fixDocumentForInsert iterates through each document up to four times

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Query Execution

      fixDocumentForInsert iterates through each document being inserted up to four times:
      1. First, to validate the nesting depth.
      2. Then to validate the document (checking whether any Timestamps need fixing, whether _id is present more than once, etc.).
      3. If _id is not the first element in the BSON, we fetch the _id element, which under the hood iterates through the BSON document again.
      4. Finally, we iterate through the BSON document once more to copy its elements into the new BSON document.

      We should need at most two passes to do this (maybe even fewer if we are clever), and some of these steps are easy to avoid. For example, step (3) above can be avoided by remembering where the _id field is during step (2).
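
      A minimal sketch of the two-pass idea, assuming a simplified document model: Field, Doc, validateAndFindId and copyWithIdFirst are hypothetical names for illustration only, not the server's BSONObj/BSONElement API. The validation pass records the position of _id, so the copy pass does not need a separate lookup.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a BSON element; not the server's BSONElement.
struct Field {
    std::string name;
    std::string value;
};

// Hypothetical stand-in for a BSON document: an ordered list of fields.
using Doc = std::vector<Field>;

// Pass 1: validate the document and remember where _id lives.
// Throws if _id appears more than once; returns std::nullopt if it is absent.
std::optional<std::size_t> validateAndFindId(const Doc& doc) {
    std::optional<std::size_t> idIndex;
    for (std::size_t i = 0; i < doc.size(); ++i) {
        if (doc[i].name == "_id") {
            if (idIndex) {
                throw std::invalid_argument("duplicate _id field");
            }
            idIndex = i;
        }
    }
    return idIndex;
}

// Pass 2: copy the fields into a new document with _id first,
// reusing the index recorded in pass 1 instead of searching again.
Doc copyWithIdFirst(const Doc& doc, std::optional<std::size_t> idIndex) {
    Doc out;
    out.reserve(doc.size());
    if (idIndex) {
        out.push_back(doc[*idIndex]);
    }
    for (std::size_t i = 0; i < doc.size(); ++i) {
        if (idIndex && i == *idIndex) {
            continue;  // already copied to the front
        }
        out.push_back(doc[i]);
    }
    return out;
}
```

      In this sketch, other per-element checks (depth, Timestamps) would presumably be folded into the first loop as well; whether that is practical for nested BSON in the real code is not something the sketch demonstrates.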

      I also think we can generate UUIDs (b.appendOID()) and reserve optimes to fill in timestamps in batches instead of one at a time. The attached flamegraphs show that this takes a considerable amount of time.
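
      To illustrate the batching idea only, here is a rough sketch that assumes identifiers come from a shared monotonic counter. SequenceAllocator and assignBatch are hypothetical names; this does not model how the server actually generates OIDs or reserves optimes. The point is that reserving a block for the whole batch touches the shared state once instead of once per document.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical shared allocator backed by a single atomic counter.
class SequenceAllocator {
public:
    // Reserves `count` consecutive values with one atomic operation
    // and returns the first value of the reserved block.
    std::uint64_t reserveBlock(std::size_t count) {
        return next_.fetch_add(count, std::memory_order_relaxed);
    }

private:
    std::atomic<std::uint64_t> next_{1};
};

// Assigns one value per document in the batch from a single reservation,
// rather than calling the allocator once per document.
std::vector<std::uint64_t> assignBatch(SequenceAllocator& alloc, std::size_t numDocs) {
    std::vector<std::uint64_t> ids(numDocs);
    const std::uint64_t first = alloc.reserveBlock(numDocs);
    for (std::size_t i = 0; i < numDocs; ++i) {
        ids[i] = first + i;
    }
    return ids;
}
```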

      See comments for more info.

        1. perf-test_phase-0000_flamegraph.svg
          440 kB
          Vishnu Kaushik
        2. perf-test_phase-0000_flamegraph-1.svg
          425 kB
          Vishnu Kaushik

            Assignee: Unassigned
            Reporter: Vishnu Kaushik (vishnu.kaushik@mongodb.com)
            Votes: 0
            Watchers: 11