[DOCS-15459] Create "Known Issue" docs for ChangeStreamHistoryLost Trigger error
| Created: | 29/Jun/22 | Updated: | 04/Jan/23 | Resolved: | 14/Jul/22 |
|
| Status: | Closed |
| Project: | Documentation |
| Component/s: | Realm |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Adam Harrison | Assignee: | Nathan Contino (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
A common Trigger problem is a Trigger being suspended with the following error: (ChangeStreamHistoryLost) Resume of change stream was not possible, as the resume point may no longer be in the oplog. We should create an entry in the Known Issues & Workaround documentation for Atlas App Services describing what this error means and what its implications are. A public resource for this error would also let us link users to this guidance in the UI whenever the error occurs, offering proactive guidance and improving the user experience. |
| Comments |
| Comment by Mansoor Omar [ 15/Jul/22 ] |
|
I've explained some of this in the community thread below: |
| Comment by Adam Harrison [ 06/Jul/22 ] |
|
Some additional context / thoughts: Ideally this should include some brief discussion about the oplog window, but unfortunately we don't have any good publicly facing documentation that clearly describes the oplog window (
I responded to a case with the following language, trying to get around defining the oplog window: The ChangeStreamHistoryLost error indicates that your Database Trigger is attempting to restart from a point in time that is no longer contained within the replica set oplog, an internal collection that keeps a rolling list of all operations that modify data stored in your database. Database Triggers use the data stored in this collection to understand what types of write operations have been performed. The Trigger can be restarted without a resume token, but this means that the Trigger will not process one or more modifications that occurred on the cluster in the meantime. See Resume a Suspended Trigger for more information.
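For the docs entry, a toy model may help explain the "rolling list" behavior without defining the oplog window formally. The sketch below is not MongoDB code: `RollingOplog`, `HistoryLost`, the integer timestamps, and the capacity of 3 are all hypothetical names and values chosen for illustration. It only shows why a resume point can disappear and what is lost when restarting without a resume token.

```python
from collections import deque


class HistoryLost(Exception):
    """Hypothetical stand-in for the server's ChangeStreamHistoryLost error."""


class RollingOplog:
    """Toy model of the oplog: only the newest `capacity` entries are kept."""

    def __init__(self, capacity):
        self.entries = deque(maxlen=capacity)  # oldest entries roll off automatically
        self.next_ts = 1

    def write(self, op):
        """Append an operation and return its timestamp, usable as a resume point."""
        ts = self.next_ts
        self.entries.append((ts, op))
        self.next_ts += 1
        return ts

    def resume_after(self, resume_ts):
        """Return the operations newer than resume_ts, or raise if that point rolled off."""
        if not self.entries or resume_ts < self.entries[0][0]:
            raise HistoryLost(f"resume point {resume_ts} is no longer in the log")
        return [op for ts, op in self.entries if ts > resume_ts]


oplog = RollingOplog(capacity=3)
token = oplog.write("insert A")  # the Trigger last processed this write
for op in ("update B", "delete C", "insert D", "insert E"):
    oplog.write(op)  # writes continue while the Trigger is stalled

try:
    oplog.resume_after(token)
except HistoryLost as exc:
    print("suspended:", exc)

# Restarting without a resume token means starting from the newest entry,
# so every write made after `token` up to that entry is never processed.
restart_ts = oplog.entries[-1][0]
missed = restart_ts - token
print("writes skipped by restarting without a token:", missed)
```

In this model the resume fails as soon as the entry for `token` has been pushed out by newer writes, which mirrors the wording of the error message ("the resume point may no longer be in the oplog").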
We should also mention that having event ordering enabled for a Trigger can be a leading cause of Triggers falling off the oplog. We can link to https://www.mongodb.com/docs/atlas/app-services/triggers/database-triggers/#disable-event-ordering-for-burst-operations for this. |
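A rough back-of-the-envelope calculation could illustrate the event ordering point. The sketch below assumes a deliberately simplified model that is not taken from the product: constant write and processing rates during a burst, and an oplog window expressed in seconds. The function name and all numbers are hypothetical.

```python
def seconds_until_history_lost(write_rate, process_rate, oplog_window_s):
    """Estimate how long a burst can last before the Trigger's resume point
    ages out of the oplog window. Returns None if the Trigger keeps up.

    Simplified model (an assumption): constant rates, time-based oplog window.
    """
    if process_rate >= write_rate:
        return None  # no backlog builds up
    # After t seconds the Trigger has handled process_rate * t events; the last
    # of those was written at time process_rate * t / write_rate, so the lag is
    # t * (write_rate - process_rate) / write_rate. Solve lag == oplog_window_s.
    return oplog_window_s * write_rate / (write_rate - process_rate)


# Hypothetical burst: 1000 writes/s against a one-hour oplog window.
ordered = seconds_until_history_lost(1000, 100, 3600)     # events handled one at a time
unordered = seconds_until_history_lost(1000, 1500, 3600)  # events handled concurrently
print("ordered:", ordered, "unordered:", unordered)
```

Under these assumptions, a Trigger that processes events serially falls behind for as long as the burst lasts, while one that processes events concurrently and keeps pace with the writes never reaches the edge of the window.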
| Comment by Chris Bush [ 06/Jul/22 ] |
|
Should be documented in maxOfflineTime |