[SERVER-1272] mongo::BufBuilder::grow fails to be inlined, makes appending slow Created: 21/Jun/10  Updated: 12/Jul/16  Resolved: 20/Jul/10

Status: Closed
Project: Core Server
Component/s: Internal Client
Affects Version/s: 1.5.3
Fix Version/s: 1.5.6

Type: Improvement Priority: Major - P3
Reporter: Andrew Morrow (Inactive) Assignee: Alberto Lerner
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 10.04 x86_64, gcc-4.4.3


Participants:

 Description   

When profiling a program making heavy use of the BSON library, I noticed that a large percentage of time was spent in mongo::BufBuilder::grow. That was odd since I had pre-configured BufBuilder objects with a generous chunk of memory, and I confirmed that realloc was not being called, so the buffer was never actually growing.

It turns out that the compiler was not inlining mongo::BufBuilder::grow. The function was emitted out-of-line and was being called through the PLT to do so much as append an integer to an already allocated buffer.

I was able to improve throughput a good bit by partitioning BufBuilder::grow into a hot inlined function do to the space check, and an explicitly non-inlined cold function to handle reallocating on overflow:

/* returns the pre-grow write position */
inline char* grow(int by) {
int oldlen = l;
l += by;
if ( l > size )

{ grow_reallocate(); }

return data + oldlen;
}

void grow_reallocate() _attribute_((noinline))

{ int a = size * 2; if ( a == 0 ) a = 512; if ( l > a ) a = l + 16 * 1024; if( a > 64 * 1024 * 1024 ) msgasserted(10000, "BufBuilder grow() > 64MB"); data = (char *) realloc(data, a); size= a; }

After this change, the available space check was inlined during BufBuilder::append calls, and grow_reallocate was emitted out of line (and never called) This seemed to buy me about a 20% improvement in throughput while constructing complex BSON documents.

The _attribute((noinline)) is necessary here, since I don't want grow_reallocate inlined into 'grow'. Ideally, grow_reallocate would be at .cc scope, not in the header at all, but perhaps you want to keep the BSON library 'header only'. If that is the case, then maybe think about macroizing __attribute_((noinline)), and then marking cold functions, or functions that make expensive library calls like 'realloc', as non-inline-able, as there is little or no benefit to inlining them. Similarly, splitting in-header functions that may be complicated enough that the compiler could reasonably choose not to inline into an inlined hot/fast-path and a non-inlined slow/cold-path could allow the compiler to more aggressively inline common cases.



 Comments   
Comment by auto [ 20/Jul/10 ]

Author:

{'login': 'alerner', 'name': 'Alberto Lerner', 'email': 'alerner@10gen.com'}

Message: SERVER-1272 Inline hot portion of BufBuild::grow()
http://github.com/mongodb/mongo/commit/9c33f219fb8bdc414089dff8c4dcbc8d65d0ee8f

Comment by auto [ 06/Jul/10 ]

Author:

{'login': 'alerner', 'name': 'Alberto Lerner', 'email': 'alerner@10gen.com'}

Message: SERVER-1272 Separate slow portion of hot function
http://github.com/mongodb/mongo/commit/62a6e478d619c6ce27cadec5d533ea207d9e0790

Comment by Alberto Lerner [ 06/Jul/10 ]

There's an important set of functionality that does not require linking libmongoclient and we would like to maintain the possibility of pulling all BSON out in an independent module. Let me pull the hot/cold portions apart and do some testing to see how consistently (or not) the inlining is happening. I'll be posting the commits to this item.

By the way, excellent profiling work – not only no this item, but in the whole BSON series items.

Comment by Andrew Morrow (Inactive) [ 29/Jun/10 ]

RE my comment about 'header-only', its clearly not the case that you can use all of the BSONObj methods without linking libmongoclient: BSONObj::jsonString, for instance, is not defined in the headers anywhere. So maybe rather than messing about with _attribute_((noinline)) it would be better to just do the basic hot/cold partitioning and move the cold functions out-of-line into libmongoclient.

Generated at Thu Feb 08 02:56:34 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.