Streams: Investigate failures like "Command... requires authentication: generic server error"


    • Type: Task
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Atlas Streams
    • Sprint 58
    • 3

      We frequently observe failures like the following in prod smoke tests:

      • "$merge to immortal-aws-production-virginia-usa-SP30-kanopy-data.output-kanopy_immortal_smoke_test_221b96fc_ab848ab2 failed: Command update requires authentication: generic server error"
      • "Change stream $source immortal-aws-production-london-gbr-SP30-kanopy-data.input-kanopy_immortal_smoke_test_221b96fc_59e45b29 failed: Command aggregate requires authentication: generic server error"

      This tends to happen in our "immortal" smoke test processors, which run forever: they sit idle for 5/10 minutes at a time and then wake up to read/write data. Questions to investigate:

      1. Are the certs being correctly rotated on the local disk?
      2. Is mongocxx using the updated cert after rotation?
      3. Do we need to configure a retry knob on mongocxx? Or do we need to add retry logic in our code?
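
      For question 3, if we end up adding retry logic in our own code, it could look roughly like the sketch below. This is only an illustration under assumptions: `isAuthError`, `retryOnAuthError`, and the `onAuthFailure` hook are hypothetical names, the error is matched on the message text quoted above, and no mongocxx calls are used. Whether mongocxx already offers a suitable retry setting is still the open question.

```cpp
#include <chrono>
#include <functional>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical predicate: treats the error text seen in the smoke tests as
// retryable. Real code would more likely inspect a driver error code.
inline bool isAuthError(const std::exception& e) {
    return std::string(e.what()).find("requires authentication") != std::string::npos;
}

// Runs `op`, retrying with exponential backoff when it throws an auth-like
// error. `onAuthFailure` is a hook where the caller could, for example,
// rebuild the client so that a rotated cert is picked up (an assumption).
template <typename Op>
auto retryOnAuthError(Op op,
                      int maxAttempts,
                      std::chrono::milliseconds initialBackoff,
                      const std::function<void()>& onAuthFailure = {}) -> decltype(op()) {
    auto backoff = initialBackoff;
    for (int attempt = 1;; ++attempt) {
        try {
            return op();
        } catch (const std::exception& e) {
            // Give up on non-auth errors or once the attempt budget is spent.
            if (!isAuthError(e) || attempt >= maxAttempts) {
                throw;
            }
            if (onAuthFailure) {
                onAuthFailure();
            }
            std::this_thread::sleep_for(backoff);
            backoff *= 2;
        }
    }
}
```

      The $merge write or the change stream read would be passed in as `op`. Errors that do not match the auth message are rethrown immediately, so unrelated failures still surface as they do today.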

            Assignee:
            Unassigned
            Reporter:
            Matthew Normyle
