  1. Core Server
  2. SERVER-83163

Replace RecordData with BSONObj

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None
    • Storage Execution

      Record stores used to hold non-BSON data back in the MMAPv1 days. Admittedly, that use was internal to the MMAPv1 implementation, but it still went through the public APIs to store its btree buckets in an MmapV1RecordStore. That code is of course long gone, but the idea that record stores could store anything other than BSON remains by way of the RecordData type. I believe we now only store BSON in RecordStores, with the possible exception of some old unit tests, which should be easy to convert to storing BSON.

      I think this would be a nice cleanup of the storage engine concepts because it makes it clear that a RecordStore is logically a map<RecordId, BSON> and not a map from RecordId to some arbitrary blob of unknown format. Of course, this may not be ideal if we are planning to put non-BSON data in a RecordStore (at the boundary of the storage engine API). But even then, it may still be a good idea to do this, to force us to use a separate type (possibly with a common base) for data structures that map from RecordIds to other formats.
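
      As a rough illustration of the "separate type, possibly with a common base" idea, here is a minimal sketch. Every name in it (KeyedStore, BsonDoc, RawBlobStore, and the simplified RecordId and RecordStore aliases) is a hypothetical stand-in and not the server's real class; the point is only that the value format becomes part of the type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not the server's real classes.
using RecordId = std::int64_t;
using Bytes = std::vector<std::uint8_t>;

// Wrapper whose only job is to mark "these bytes are one BSON document".
struct BsonDoc {
    Bytes bytes;
};

// Shared behavior: an ordered map from RecordId to some value type.
template <typename Value>
class KeyedStore {
public:
    void insert(RecordId id, Value value) {
        const bool inserted = _records.emplace(id, std::move(value)).second;
        assert(inserted && "RecordId already present");
        (void)inserted;  // silence unused-variable warnings when NDEBUG is set
    }

    // Returns a copy of the stored value, or std::nullopt if the id is absent.
    std::optional<Value> find(RecordId id) const {
        auto it = _records.find(id);
        if (it == _records.end()) return std::nullopt;
        return it->second;
    }

    std::size_t size() const { return _records.size(); }

private:
    std::map<RecordId, Value> _records;
};

// A record store holds BSON only; other formats get their own named type.
using RecordStore = KeyedStore<BsonDoc>;
using RawBlobStore = KeyedStore<Bytes>;
```

      With this shape, passing raw bytes to a RecordStore is a compile error rather than a convention, which is the conceptual clarity argued for above.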

      While IMO it is worth doing this for the conceptual clarity alone, there are at least a few practical benefits I can see that this unlocks:

      1. BSONObj is already set up to hold owned views of slices of ConstSharedBuffers, while RecordData isn't (not that it would be too hard to make it so). While we currently never take advantage of owned data in the storage engine and always copy, we've identified a few places that would benefit from doing so.
      2. RecordData probably should use ConstSharedBuffer rather than (mutable) SharedBuffer, since the receivers of those buffers should never modify them (at least not without first checking isShared()).
      3. Storage engines should be able to take advantage of the fact that they know the values will be in BSON format. For example, they could either omit the size and trailing 0 byte, or (more likely) they could omit their own storage of the size and just use the first 4 bytes of the BSON. Additionally, they could use BSON-specific compression of values without first validating that the data is actually BSON.
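
      To make point 3 concrete: every BSON document begins with a little-endian int32 holding its total size and ends with a 0x00 byte, so a store that knows its values are BSON can recover each value's length from the value itself. The sketch below walks a buffer of back-to-back documents without any separately stored sizes. The function names are hypothetical, and the check covers only the framing (length prefix and terminator), not full BSON validity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Reads the little-endian int32 length prefix of the BSON document at `data`.
// Returns std::nullopt if the framing cannot be valid within `avail` bytes.
std::optional<std::uint32_t> bsonLengthPrefix(const std::uint8_t* data, std::size_t avail) {
    constexpr std::size_t kMinBsonSize = 5;  // 4-byte length + trailing 0x00
    if (avail < kMinBsonSize) return std::nullopt;
    const std::uint32_t len = static_cast<std::uint32_t>(data[0]) |
                              static_cast<std::uint32_t>(data[1]) << 8 |
                              static_cast<std::uint32_t>(data[2]) << 16 |
                              static_cast<std::uint32_t>(data[3]) << 24;
    if (len < kMinBsonSize || len > avail) return std::nullopt;
    if (data[len - 1] != 0) return std::nullopt;  // missing document terminator
    return len;
}

// Hypothetical helper: finds the start offset of each document in a buffer of
// concatenated documents, using only the documents' own length prefixes.
std::optional<std::vector<std::size_t>> documentOffsets(const std::vector<std::uint8_t>& buf) {
    std::vector<std::size_t> offsets;
    std::size_t pos = 0;
    while (pos < buf.size()) {
        const auto len = bsonLengthPrefix(buf.data() + pos, buf.size() - pos);
        if (!len) return std::nullopt;  // malformed framing; give up
        offsets.push_back(pos);
        pos += *len;
    }
    assert(pos == buf.size());  // each length was bounded by the remaining bytes
    return offsets;
}
```

      For instance, a buffer holding the 5-byte empty document followed by the 12-byte encoding of {a: 1} yields offsets 0 and 5.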

            Assignee:
            backlog-server-execution [DO NOT USE] Backlog - Storage Execution Team
            Reporter:
            mathias@mongodb.com Mathias Stearn
            Votes:
            0
            Watchers:
            7

              Created:
              Updated: