  1. Documentation
  2. DOCS-14763

$sample aggregation pipeline incorrectly warns "$sample may output the same document more than once in its result set."

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: manual, Server
    • Last comment by Customer: true
    • Story Points: 2
    • Sprint: ServerDocs2021: Aug31 - Sep07, ServerDocs2021: Sep07 - Sep14, ServerDocs2021: Sep14 - Sep21, ServerDocs2021: Sep21 - Sep28, ServerDocs2021: Sep28 - Oct5, ServerDocs2021: Oct5 - Oct12, ServerDocs2021: Oct12 - Oct19

      Description

      The documentation for the $sample aggregation pipeline stage warns:

      $sample may output the same document more than once in its result set.
      

      This warning appears to be a holdover from the feature's introduction in MongoDB 3.2, when duplicates were possible with MMAPv1. With WiredTiger, $sample has two methods of obtaining random documents.

      The first uses a pseudo-random cursor to select documents. It deduplicates the documents it returns and raises an error if it cannot complete the deduplication.
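
To illustrate the deduplicating behaviour described for the first method, here is a minimal Python sketch. It is not the server's implementation: the function name `sample_without_duplicates`, the `max_consecutive_duplicates` threshold and its default of 100, and the use of `random.Random.choice` as a stand-in for a pseudo-random cursor are all assumptions made for illustration.

```python
import random


def sample_without_duplicates(docs, size, max_consecutive_duplicates=100, rng=None):
    """Draw `size` distinct documents from `docs` by repeated random picks.

    Hypothetical illustration only: `docs` is an in-memory list of dicts with
    an `_id` key, and `max_consecutive_duplicates` is an assumed threshold.
    """
    rng = rng or random.Random()
    seen_ids = set()
    sampled = []
    consecutive_duplicates = 0
    while len(sampled) < size:
        doc = rng.choice(docs)  # stand-in for advancing a pseudo-random cursor
        if doc["_id"] in seen_ids:
            # Skip documents already returned; give up if it keeps happening.
            consecutive_duplicates += 1
            if consecutive_duplicates >= max_consecutive_duplicates:
                raise RuntimeError("could not find a non-duplicate document")
            continue
        consecutive_duplicates = 0
        seen_ids.add(doc["_id"])
        sampled.append(doc)
    return sampled
```

Under these assumptions the function either returns `size` documents with distinct `_id` values or raises, which mirrors the "deduplicate or error" behaviour the description attributes to the pseudo-random cursor path.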

      The second performs a collection scan by _id, which should never return duplicates with WiredTiger but may have done so with MMAPv1.

      My understanding is that the warning applies only when MMAPv1 may be the storage engine, since neither method that $sample uses to obtain random documents returns duplicates under WiredTiger.
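
One way to check this claim empirically is to count repeated `_id` values in a sampled result set. The helper below is a small sketch, and `find_duplicate_ids` is a hypothetical name. It works on any iterable of documents represented as dicts and assumes each `_id` is hashable.

```python
from collections import Counter


def find_duplicate_ids(docs):
    """Return a dict mapping each repeated `_id` to how many times it appears.

    Assumes every document is a dict with a hashable `_id` value.
    """
    counts = Counter(doc["_id"] for doc in docs)
    return {doc_id: n for doc_id, n in counts.items() if n > 1}
```

With PyMongo, the cursor returned by `collection.aggregate([{"$sample": {"size": 100}}])` could be passed directly as `docs`, where `collection` is a placeholder for an existing collection handle. An empty result means no document was returned more than once in that run. It does not prove that duplicates are impossible.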

      As it stands, this warning may unnecessarily deter users from considering $sample for a number of use cases.

            People

            Assignee:
            jeffrey.allen Jeffrey Allen
            Reporter:
            dave.walker David Walker
            Participants:
            Last commenter:
            Githook User
            External Reviewer:
            David Walker
            Votes:
            0
            Watchers:
            3

              Dates

              Created:
              Updated:
              Resolved:
              Days since reply:
              14 weeks, 1 day ago
              Date of 1st Reply: