While profiling mongod under VTune, running the Queries.IntNonIdFindOn mongo-perf test with 8 threads, I found that roughly 6% of the FindCmd run time is spent in Explain::getPlanSummary, generating a plan summary string. Of this 6%, two thirds is spent in StringBuilderImpl::appendDoubleNice(), converting the index direction (in this case the number 1) from double to string.
We should look to optimize this path, either by optimizing StringBuilder or by adding custom handling of the index direction in getPlanSummary.
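
As a rough illustration of the second option, the sketch below short-circuits the two common index directions before falling back to generic double formatting. This is not the actual getPlanSummary code: appendIndexDirection is a hypothetical helper, std::string stands in for StringBuilder, and the snprintf fallback is only a placeholder for whatever appendDoubleNice() does.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not in the MongoDB source): append an index direction
// to `out`, avoiding the generic double-to-string path for 1 and -1.
inline void appendIndexDirection(std::string& out, double direction) {
    if (direction == 1.0) {
        out += '1';  // ascending index, the overwhelmingly common case
        return;
    }
    if (direction == -1.0) {
        out += "-1";  // descending index
        return;
    }
    // Fallback for any other value: generic formatting. This stands in for
    // the existing slow path and is an assumption, not the real code.
    char buf[32];
    int n = std::snprintf(buf, sizeof(buf), "%.15g", direction);
    if (n > 0) {
        std::size_t len = static_cast<std::size_t>(n);
        if (len >= sizeof(buf)) {
            len = sizeof(buf) - 1;  // output was truncated by snprintf
        }
        out.append(buf, len);
    }
}
```

With this shape, the hot case costs one comparison and a single-character append, and only unusual key pattern values pay for the full conversion.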
It is worth noting that changing the createIndex call in "Queries.IntNonIdFindOn" to pass NumberInt(1) rather than the default double reduces the cost by 50%. The remaining 50% is spent in StringBuilderImpl::operator<<(int), which could be optimized for small integers as was done in our itoa() implementation.
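
One possible shape for such a small-integer fast path is a precomputed table of strings, sketched below. The names kSmallIntLimit, smallIntTable and appendInt are hypothetical, std::string again stands in for StringBuilder, and the table approach is an assumption about what a fast path could look like rather than a description of the existing itoa() code.

```cpp
#include <array>
#include <string>

// Hypothetical cutoff: values in [0, kSmallIntLimit) use the lookup table.
constexpr int kSmallIntLimit = 100;

// Builds the table once, on first use, and reuses it afterwards.
inline const std::array<std::string, kSmallIntLimit>& smallIntTable() {
    static const std::array<std::string, kSmallIntLimit> table = [] {
        std::array<std::string, kSmallIntLimit> t;
        for (int i = 0; i < kSmallIntLimit; ++i) {
            t[i] = std::to_string(i);
        }
        return t;
    }();
    return table;
}

// Hypothetical stand-in for operator<<(int): table lookup for small
// non-negative values, generic conversion for everything else.
inline void appendInt(std::string& out, int value) {
    if (value >= 0 && value < kSmallIntLimit) {
        out += smallIntTable()[value];
        return;
    }
    out += std::to_string(value);
}
```

The cutoff trades table size against hit rate; for plan summaries, where the value is almost always 1, even a very small table would cover the hot case.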
See the attachment for a VTune screenshot.