Type: New Feature
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: None
Replication
Reduce the impact of the many serial writes by carefully merging them before applying them. Consider the following series of writes:
db.coll.update( { _id : 'a' }, { $set : { foo : 1, bar : 2 } } );
db.coll.update( { _id : 'a' }, { $set : { bar : 3, baz : 4 } } );
db.coll.update( { _id : 'a' }, { $set : { baz : 5 } } );
Each update contributes something to the final result, but no single update contains all of the necessary information, so we can't simply omit the first two and apply the last one. However, by iterating through them in order in memory, we can produce a single update that leaves the document in the same state as applying all 3 originals, while doing the work of only one write:
db.coll.update(
{ _id : 'a' },
{ $set : { foo : 1, bar : 3, baz : 5 } }
);
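The in-memory merge described above can be sketched roughly as follows. This is only an illustration, not the server's implementation: `mergeSetUpdates` and the `{ filter, update }` shape are hypothetical names chosen for the example, and the sketch assumes every update is a non-upsert `$set` keyed on `_id` alone, with no overlapping dotted paths (such as `a` and `a.b`) across the merged updates.

```javascript
// Hypothetical helper (not a MongoDB API): folds an ordered list of
// $set-only updates into at most one update per _id.
// Assumptions: filters match on _id only, no upserts, and no overlapping
// dotted paths between the updates being merged.
function mergeSetUpdates(updates) {
  const merged = new Map(); // _id -> accumulated $set fields

  for (const { filter, update } of updates) {
    const operators = Object.keys(update);
    if (operators.length !== 1 || operators[0] !== "$set") {
      throw new Error("this sketch only handles $set-only updates");
    }
    // Later writes to the same field overwrite earlier ones, which matches
    // the result of applying the updates one at a time, in order.
    const fields = merged.get(filter._id) ?? {};
    Object.assign(fields, update.$set);
    merged.set(filter._id, fields);
  }

  return [...merged].map(([_id, fields]) => ({
    filter: { _id },
    update: { $set: fields },
  }));
}

// The three writes from the example above.
const writes = [
  { filter: { _id: "a" }, update: { $set: { foo: 1, bar: 2 } } },
  { filter: { _id: "a" }, update: { $set: { bar: 3, baz: 4 } } },
  { filter: { _id: "a" }, update: { $set: { baz: 5 } } },
];

console.log(JSON.stringify(mergeSetUpdates(writes)));
// [{"filter":{"_id":"a"},"update":{"$set":{"foo":1,"bar":3,"baz":5}}}]
```

Other update operators (`$inc`, `$unset`, `$push`, and so on) would each need their own merge rule; the sketch rejects them rather than guessing.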
Merging doesn't work if point-in-time (PIT) reads are required, because it discards the intermediate history. However, only 0.2% of operations and < 5% of Atlas clusters use snapshot read concern. Furthermore, customers running a large initial sync do not need PIT reads at all while the node is syncing.
For a more thorough discussion of the idea, see this doc.
A simple POC shows speedups of 14-28% when running 100k w:majority updates to as many as 10k documents:
docSize (bytes) | nDocs | speedup |
1024 | 1 | 17.06% |
1024 | 10 | 17.09% |
1024 | 100 | 15.76% |
1024 | 1000 | 13.90% |
1024 | 10000 | 14.69% |
16384 | 1 | 27.28% |
16384 | 10 | 27.62% |
16384 | 100 | 21.17% |
16384 | 1000 | 17.39% |
16384 | 10000 | 26.71% |
related to: SERVER-110867 Spread index updates across threads (Closed)