Mongoid / MONGOID-5755

Performance: Have option to return BSON ObjectIds as String

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Unknown
    • None
    • Affects Version/s: None
    • Component/s: Performance
    • Labels: None
    • Ruby Drivers

      Mongoid (and the Ruby driver) should give users the option to have all BSON IDs returned as Strings rather than as BSON::ObjectId instances. Ideally, the option would be available both globally and per query. The reason is performance: in any performance-intensive application we always call .to_s on all our BSON IDs, and this can yield large gains when working with a large number of documents. If the driver could deserialize directly to a String (bypassing the object allocation for BSON::ObjectId), that would be even better.
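
The workaround described above can be sketched roughly as follows: IDs are converted to Strings once, at the boundary, and everything downstream works with plain Strings. To keep the sketch runnable without the bson gem, `FakeObjectId` is a hypothetical stand-in for BSON::ObjectId, and `stringify_ids` is an illustrative helper name, not part of Mongoid or the driver.

```ruby
# Hypothetical stand-in for BSON::ObjectId so this runs without the bson gem.
# The only behavior relied on here is #to_s returning the 24-character hex form.
FakeObjectId = Struct.new(:hex) do
  def to_s
    hex
  end
end

# Illustrative helper (an assumption, not a Mongoid API): convert each
# document's _id to a String once, returning new hashes.
def stringify_ids(docs)
  docs.map { |doc| doc.merge('_id' => doc['_id'].to_s) }
end

docs = [
  { '_id' => FakeObjectId.new('a' * 24), 'name' => 'first' },
  { '_id' => FakeObjectId.new('b' * 24), 'name' => 'second' }
]

# Index the documents by String ID; later lookups and comparisons
# then operate on plain Strings only.
by_id = stringify_ids(docs).map { |doc| [doc['_id'], doc] }.to_h

puts by_id.keys.inspect
```

The requested option would make this conversion step unnecessary, since the driver would hand back Strings in the first place.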

      For reference, ActiveRecord returns record IDs as Integers rather than a specialized ID type.

      AFAIK, the only time BSON::ObjectId is actually beneficial is when serializing documents, but since Mongoid has type coercion, this is already handled for all documents with defined fields. The only caveat is that you would need to call BSON::ObjectId(id_str) in the very rare and inadvisable case where you are inserting unstructured data, for example into a field with type: Hash.

      Here are some rudimentary (small) benchmarks, which show that BSON::ObjectId is slower than String (1,000,000 iterations each):

      • Comparison is about 2x slower
      • Instantiation of values is about 5x slower
      • Retrieving a hash key is about 4x slower

       

                                user     system      total        real
      str  - comparison     0.052049   0.000000   0.052049 (  0.052046)
      bson - comparison     0.099409   0.000000   0.099409 (  0.099417)
      str  - instantiation  0.527043   0.000000   0.527043 (  0.527036)
      bson - instantiation  2.778519   0.000000   2.778519 (  2.778765)
      str  - hash fetch     0.078862   0.000000   0.078862 (  0.078870)
      bson - hash fetch     0.296084   0.000000   0.296084 (  0.295955)
      
      

       

      require 'benchmark'
      require 'bson'

      # Fixtures: two 24-character hex strings and the equivalent ObjectIds.
      str1 = 'a' * 24
      str2 = 'b' * 24
      bson1 = BSON::ObjectId(str1)
      bson2 = BSON::ObjectId(str2)
      hash_str  = { str1 => 1, str2 => 2 }
      hash_bson = { bson1 => 1, bson2 => 2 }

      Benchmark.bm do |bm|
        # Equality check on pre-built values.
        bm.report('str  - comparison   ') do
          1_000_000.times { str1 == str2 }
        end
        bm.report('bson - comparison   ') do
          1_000_000.times { bson1 == bson2 }
        end
        # Build two fresh values per iteration, then compare them.
        bm.report('str  - instantiation') do
          1_000_000.times { 'a' * 24 == 'b' * 24 }
        end
        bm.report('bson - instantiation') do
          1_000_000.times { BSON::ObjectId(str1) == BSON::ObjectId(str2) }
        end
        # Hash lookup keyed by each type.
        bm.report('str  - hash fetch   ') do
          1_000_000.times { hash_str[str2] }
        end
        bm.report('bson - hash fetch   ') do
          1_000_000.times { hash_bson[bson2] }
        end
      end
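
The timings above fold allocation cost into wall-clock time. A rough way to look at allocations on their own is to count the objects each approach creates. The sketch below assumes MRI, where `GC.stat(:total_allocated_objects)` is available, and uses a hypothetical `WrappedId` class as a stand-in for an ID wrapper object. It is not the bson gem's implementation, so the counts are only illustrative.

```ruby
# Hypothetical wrapper that stores the ID as 12 raw bytes, standing in for
# an ObjectId-like type. This is an assumption for illustration only.
class WrappedId
  def initialize(hex)
    @raw = [hex].pack('H*') # hex string -> binary string
  end
end

# Count objects allocated while the block runs (MRI-specific counter).
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

hex = 'a' * 24
n = 100_000

# Producing a plain String ID: one new String per iteration.
plain = allocations { n.times { hex.dup } }

# Producing a wrapped ID: the wrapper plus its intermediate objects.
wrapped = allocations { n.times { WrappedId.new(hex) } }

puts "String ids:  #{plain} objects allocated"
puts "Wrapped ids: #{wrapped} objects allocated"
```

Running the same `allocations` helper around `BSON::ObjectId(str1)` and `bson1.to_s` would show how many objects the real type allocates per ID.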

            Assignee:
            dmitry.rybakov@mongodb.com Dmitry Rybakov
            Reporter:
            shields@tablecheck.com Johnny Shields
            Votes:
            0
            Watchers:
            4

              Created:
              Updated: