Core Server / SERVER-10485

Make Binary type a "simple type" to speed up queries on Binary _id

    • Type: Improvement
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.5.5
    • Affects Version/s: None
    • Component/s: Querying
    • Labels: None

      I am testing a C++ app that saves 300-byte documents whose _id is either:

      1. a fairly long string (~ 34 bytes) of the form:

      > db.visitors.findOne()

      {
              "_id" : "280-827986500-402998089-4240815191",
      …
      > db.visitors.findOne()._id.length
      34
      

      2. a 22-byte Binary field containing the same data

      When running a performance test with a single-threaded client, I get the following results:

      1. Strings:

      29698: aggregate rate = 2134 2004 2007 2098 2366 2395 2124 2063 2141 2102 2043 2066 2116 2113 2111 2090 2088 2058 2368 2505 2100 2027 2031 2053 2062 2110 2213 2061 2002 2120 2058 2026 2019 2030 2032 1992 2016 2018 1988 2031 2020 2037 2019 2031 2028 2026 2067 2044 
      29698: aggregate readAvgNS = 152619 152110 150799 157222 161077 160110 155530 152287 156510 156442 151876 139827 131069 131112 126331 129008 152680 153813 149854 148568 154517 153699 154555 152451 138707 128859 145695 151509 150705 153666 150601 150790 150867 151283 152592 153590 153355 152964 159347 153164 153401 153124 154860 153072 153074 152854 150535 154109 
      29698: aggregate writeAvgNS = 101351 123051 123073 98098 46281 46139 102489 123042 97479 103767 122946 132290 136879 132588 136525 138369 124252 123937 75534 43246 103055 124306 124766 124239 133122 135499 98666 119376 122301 95724 121964 122182 122733 120476 123306 124395 124729 124016 124640 124062 124313 124681 124675 124268 125203 125423 122833 122051 
      29698: aggregate writeFailures = 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
      

      2. Binaries:

      29560: aggregate rate = 1793 1695 1696 2039 1918 1706 1722 1725 1735 1727 1726 1722 1724 1781 1821 1832 1827 1825 1835 1785 1770 1770 1899 2198 2199 2209 2261 2238 1876 2219 1828 1727 1720 1819 1796 1722 1723 1733 1732 1741 2038 1983 1769 1796 1716 1733 1746 1748 1743 1745 1747 1738 1749 1759 1748 
      29560: aggregate readAvgNS = 208258 202358 201183 210515 212805 201878 202447 201497 203457 201564 201167 201208 201333 185882 177483 170058 169452 168176 165467 198294 198905 198807 203659 196598 195868 199160 193123 196131 184402 176039 205224 203780 197122 208058 207021 201362 198126 197326 196977 197363 208257 213072 204922 208032 202003 197872 197860 198284 197574 197781 197159 199347 197974 197439 198008 
      29560: aggregate writeAvgNS = 133385 162305 162570 59456 91507 162487 162411 163434 162299 163236 161950 161503 161718 162210 165318 166879 166798 167799 168046 159536 159398 159329 130241 40697 40857 40282 40082 41907 137324 64410 134221 153291 159127 115054 131668 162539 160235 158427 159149 159578 61582 71481 137752 132059 162983 160126 158960 159220 159261 160071 159127 159868 160034 159347 160988 
      29560: aggregate writeFailures = 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
      

      The difference is pretty flagrant: about 2000 ops/s for #1 versus about 1700 ops/s for #2. Both readAvgNS and writeAvgNS are significantly higher for #2.

      Looking at the code:

      • BinData is not considered "simple" so,
      • isSimpleIdQuery() returns false for {_id : BinData} queries, so
      • queryIdHack() is not invoked, so
      • the binary version gets to run through the full query engine

      If you add mongo::BinData to BSONElement::isSimpleType(), the binary version goes from about 25% slower to about 4% faster. The current behavior is annoying, since using Binaries results in an index that is about 30% smaller.
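
The dispatch chain described above can be sketched in isolation. This is a simplified model, not the server's actual code: `ElemType`, `Query`, `Path`, `choosePath`, and the `binDataIsSimple` flag are hypothetical names introduced here for illustration, and the list of types treated as "simple" is an assumption. It only shows why a query on a Binary _id misses the fast path until BinData counts as a simple type. It assumes a C++14 or later compiler.

```cpp
// Illustrative sketch only: every name below is a hypothetical stand-in,
// not an actual MongoDB declaration.

// A reduced set of BSON-like element types.
enum class ElemType {
    NumberDouble, NumberInt, NumberLong,
    String, Bool, Date, ObjectId,
    BinData, Object, Array
};

// Returns true for types treated as "simple" (assumed list).
// The flag models the proposed change: when binDataIsSimple is true,
// BinData is treated as a simple type too.
constexpr bool isSimpleType(ElemType t, bool binDataIsSimple) {
    switch (t) {
        case ElemType::NumberDouble:
        case ElemType::NumberInt:
        case ElemType::NumberLong:
        case ElemType::String:
        case ElemType::Bool:
        case ElemType::Date:
        case ElemType::ObjectId:
            return true;
        case ElemType::BinData:
            return binDataIsSimple;
        default:
            return false;
    }
}

// Minimal model of a query: whether it consists solely of {_id: <value>},
// and the type of that value.
struct Query {
    bool onlyIdField;
    ElemType idType;
};

enum class Path { IdHack, FullQueryEngine };

// A query qualifies for the fast path only if it is an _id-only query
// on a simple type.
constexpr bool isSimpleIdQuery(const Query& q, bool binDataIsSimple) {
    return q.onlyIdField && isSimpleType(q.idType, binDataIsSimple);
}

// Picks the execution path: the _id fast path or the full query engine.
constexpr Path choosePath(const Query& q, bool binDataIsSimple) {
    return isSimpleIdQuery(q, binDataIsSimple) ? Path::IdHack
                                               : Path::FullQueryEngine;
}

// String _id takes the fast path regardless of the flag.
static_assert(choosePath(Query{true, ElemType::String}, false) == Path::IdHack, "");
// Binary _id falls through to the full engine before the change...
static_assert(choosePath(Query{true, ElemType::BinData}, false) == Path::FullQueryEngine, "");
// ...and takes the fast path once BinData is treated as simple.
static_assert(choosePath(Query{true, ElemType::BinData}, true) == Path::IdHack, "");
```

In this model the only difference between the two runs is the answer `isSimpleType` gives for BinData, which matches the single-line nature of the change suggested in the report.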

            Assignee: hari.khalsa@10gen.com
            Reporter: Antoine Girbal
            Votes: 2
            Watchers: 7
