  1. Core Server
  2. SERVER-46979

Improve changeStream performance relative to oplog queries

    • Server Triage
    • Fully Compatible

      I ran a test comparing direct oplog reads with a change stream, and the change stream performs noticeably worse than reading the oplog directly. Here is my experiment:

      The fetcher program runs on hostB and MongoDB runs on hostA; the ping latency between hostA and hostB is about 0.2~0.3 ms. There was no CPU, memory, I/O, or network bottleneck during the experiment.
      The source MongoDB holds 5 million oplog entries with a total size of 5.5 GB:

      • Change stream: 180 seconds, nearly 30,000 qps
      • oplog: 60 seconds, a little over 80,000 qps
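
      The throughput figures follow from the entry count and the elapsed times. A quick check, reading the entry count as 5 million (the "w" shorthand stands for 10,000); the function name is illustrative only:

```python
# Figures taken from the measurements reported above.
TOTAL_ENTRIES = 5_000_000  # 5 million oplog entries


def throughput(entries: int, seconds: float) -> float:
    """Average number of entries fetched per second."""
    return entries / seconds


change_stream_qps = throughput(TOTAL_ENTRIES, 180)  # change stream run
oplog_qps = throughput(TOTAL_ENTRIES, 60)           # direct oplog run
slowdown = oplog_qps / change_stream_qps

print(f"change stream: {change_stream_qps:,.0f} entries/s")
print(f"oplog:         {oplog_qps:,.0f} entries/s")
print(f"direct oplog reads are about {slowdown:.1f}x faster")
```

      This gives roughly 27,800 and 83,300 entries per second, a factor of about 3.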

      While fetching from the change stream, monitoring shows the CPU running at about 60%. The gap seems larger than I expected. Does this roughly match your previous test results?
      As far as I know, for a replica set the change stream is split into two parts. The first is the $cursor stage that applies a $match to the oplog, which can be seen in the output of the aggregate explain command. The second is a transformation stage that performs these steps:

      • unmarshal the oplog BSON
      • allocate new memory and transform the parsed oplog entry into a change stream event
      • marshal the change stream event into BSON
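
      The three steps above can be sketched roughly as follows. This is only an illustration of the per-entry work, not the server's implementation: JSON stands in for BSON so the example needs no extra packages, only insert entries are handled, and the function names are hypothetical.

```python
import json


def transform(oplog_entry: dict) -> dict:
    """Build a new, simplified change-stream-style event from an insert oplog entry."""
    if oplog_entry["op"] != "i":
        raise ValueError("this sketch only handles insert entries")
    db, coll = oplog_entry["ns"].split(".", 1)
    doc = oplog_entry["o"]
    # A fresh dict is allocated for every event, mirroring the extra allocation step.
    return {
        "operationType": "insert",
        "ns": {"db": db, "coll": coll},
        "documentKey": {"_id": doc["_id"]},
        "fullDocument": doc,
    }


def process(raw: bytes) -> bytes:
    entry = json.loads(raw)             # step 1: unmarshal (JSON standing in for BSON)
    event = transform(entry)            # step 2: allocate and transform
    return json.dumps(event).encode()   # step 3: marshal the event again


raw = json.dumps({"op": "i", "ns": "test.users", "o": {"_id": 1, "name": "a"}}).encode()
event = json.loads(process(raw))
print(event["operationType"], event["ns"])
```

      A direct oplog read returns the stored entry as-is, so it skips all three steps; that is the extra cost being discussed here.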

      So I believe the main reason for the performance gap is the transformation step. Please let me know if I am wrong.

      In my view, oplog fetching and transforming run on only one thread, so adding threads could improve performance. This is a tradeoff, though: using too many threads for the transform may hurt overall MongoDB server performance.
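
      A minimal sketch of the multi-threaded idea, assuming each entry can be transformed independently and that output order must match input order. The transform function and the worker count are hypothetical stand-ins. It shows the structure only: in CPython the GIL limits the speedup for CPU-bound work, whereas a server-side implementation would use native threads.

```python
from concurrent.futures import ThreadPoolExecutor


def transform(entry: dict) -> dict:
    """Hypothetical stand-in for the per-entry oplog-to-event transform."""
    return {"operationType": entry["op"], "seq": entry["seq"]}


def transform_batch(entries: list, workers: int) -> list:
    """Transform a batch on a bounded pool of worker threads.

    Executor.map yields results in input order, so events keep oplog order
    even when workers finish out of order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transform, entries))


batch = [{"op": "i", "seq": n} for n in range(1000)]
events = transform_batch(batch, workers=4)
print(len(events), events[0], events[-1])
```

      Capping the pool size with max_workers is one way to express the tradeoff: more workers take more CPU away from the rest of the server.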

            backlog-server-triage [HELP ONLY] Backlog - Triage Team
            cvinllen@gmail.com vinllen chen