- Type: Task
- Resolution: Works as Designed
- Priority: Minor - P4
- Affects Version/s: None
- Component/s: None
Summary:
The first rule of Realm Sync ("deletes always win") can produce data loss when combined with a fairly ubiquitous model design pattern. This is not a bug but a design decision in Realm that I think could be improved. I'm unsure whether Core is the right place to file this; there does not seem to be a repo for Atlas Device Sync.
Context:
Consider these two objects (in Swift, here):
final class AudioClip: Object {
    @Persisted(primaryKey: true) var _id: UUID
    @Persisted var path: Filepath?
}

final class Filepath: Object {
    @Persisted(primaryKey: true) var _id: UUID
    @Persisted var path: String = ""
    @Persisted(originProperty: "path") var associatedAudioClips: LinkingObjects<AudioClip>
}
Suppose we have millions of AudioClip objects in our database but only a few thousand Filepath objects. Instead of duplicating the same long, constant String (a file path) millions of times, we store that String only once and "re-use" it across many AudioClips.
(I do understand that Mongo's advice is "prefer embedding". But in this case, we'd be duplicating the exact same String millions of times, so the size-savings is significant.)
The Issue:
When our app deletes an AudioClip, it would like to make sure no orphaned Filepath objects remain. So the app might inspect AudioClip.path and, if that Filepath's associatedAudioClips has only one entry (the clip being deleted), delete the Filepath as well, since no other AudioClip uses it.
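For concreteness, here is roughly what that cleanup could look like. This is only a sketch: it assumes the AudioClip and Filepath classes from the Context section, and the function name `deleteClipAndOrphanedPath` is made up for illustration.

```swift
import RealmSwift

// Hypothetical helper: deletes `clip` and, if it was the last AudioClip
// linking to its Filepath, deletes that Filepath in the same write.
// Assumes the AudioClip and Filepath classes shown in the Context section.
func deleteClipAndOrphanedPath(_ clip: AudioClip, in realm: Realm) throws {
    try realm.write {
        // Check the backlinks before the clip is removed: a count of 1
        // means this clip is the only one still using the Filepath.
        if let filepath = clip.path, filepath.associatedAudioClips.count == 1 {
            realm.delete(filepath)
        }
        realm.delete(clip)
    }
}
```

Locally this is perfectly consistent. The check can only see links that exist on this device at the moment of the write.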
But, suppose the app is offline when that delete occurs. During that offline period, another user on a different device adds a new AudioClip item and sets path to reference the Filepath object that has been deleted in the offline session.
When the offline user reconnects, the first rule of the conflict resolution algorithm kicks in and the Filepath is deleted, leaving the new AudioClip item that the second user added with nil for its path property—unexpected corruption.
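To make the sequence easier to follow, here is a toy, in-memory model of the outcome. It is not Realm's merge algorithm and uses no Realm APIs; `Store`, `Change`, and `mergeDeletesWin` are names invented for this sketch, and the only rule it encodes is the one described above: a delete wins, and links to the deleted object end up nil.

```swift
// Toy model only: this is NOT Realm's merge algorithm, just its observable outcome.
struct Clip { var pathID: String? }          // nil models a nulled-out `path` link

struct Store {
    var filepaths: Set<String> = []          // ids of live Filepath objects
    var clips: [String: Clip] = [:]          // AudioClip id -> clip
}

enum Change {
    case deleteFilepath(String)
    case deleteClip(String)
    case addClip(id: String, pathID: String)
}

// Applies both devices' changes so that a delete wins over any concurrent link.
func mergeDeletesWin(base: Store, changes: [Change]) -> Store {
    var result = base
    var deletedPaths = Set<String>()
    for change in changes {
        switch change {
        case .deleteFilepath(let id): deletedPaths.insert(id)
        case .deleteClip(let id): result.clips[id] = nil
        case .addClip(let id, let pathID): result.clips[id] = Clip(pathID: pathID)
        }
    }
    result.filepaths.subtract(deletedPaths)
    // Any link that still points at a deleted Filepath is nulled out.
    for id in result.clips.keys {
        if let pathID = result.clips[id]?.pathID, deletedPaths.contains(pathID) {
            result.clips[id]?.pathID = nil
        }
    }
    return result
}

// Shared starting point: one Filepath used by one AudioClip.
let base = Store(filepaths: ["fp1"], clips: ["clipA": Clip(pathID: "fp1")])
// Offline device: deletes clipA and cleans up the "orphaned" fp1.
let offlineDevice: [Change] = [.deleteFilepath("fp1"), .deleteClip("clipA")]
// Other device, concurrently: adds clipB pointing at fp1.
let otherDevice: [Change] = [.addClip(id: "clipB", pathID: "fp1")]

let merged = mergeDeletesWin(base: base, changes: offlineDevice + otherDevice)
// merged.clips["clipB"] exists, but its pathID is nil and fp1 is gone.
```

Neither device did anything wrong locally, yet the merged state contains a clip whose path was silently dropped.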
The worst part: There is never a "safe" time to delete a Filepath object. The only way to do so with 100% certainty that no database corruption will occur is to disable sync, perform the deletes, then force a client-reset on all users.
A Fix?
"Deletes always win" works well when the conflict is: "One person changed a property of X and another person deleted X."
But in the case of relationships, sync needs a recycle bin. When a delete occurs, the object should not be actually vaporized until the expiration of the client-reset period configured in Atlas. Because, at any time during that window, another user may arrive at the sync server with a request to create a relationship to that deleted object. When the second user created that relationship, he had no way of knowing the destination object was "doomed". On his device (until the first user syncs up), the relationship is assigned and valid and, critically:
- The first user would not have performed the delete if he had knowledge of both changes.
- The second user would have created a different Filepath object so that the path property is not nil if he had known about both changes.
The sync server knows about both changes. And resolving the conflict by allowing the assignment after the delete still converges on a consistent end state for both users. And that end state is a better one than the current end state, where path on the newly-added AudioClip object is unexpectedly nil.
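A rough sketch of the recycle-bin idea, again as a toy model rather than anything Atlas Device Sync exposes. `SyncServer`, `retention` (a stand-in for the client-reset window), and the sample path are all hypothetical. The guard at the top of `deleteFilepath`, which covers the case where the new link reaches the server before the delete, is my own assumption about how both arrival orders could converge.

```swift
import Foundation

// Toy model of the proposed behaviour; none of this is an Atlas or Realm API.
struct SyncServer {
    let retention: TimeInterval                  // stand-in for the client-reset window
    var filepaths: [String: String] = [:]        // live Filepath id -> path string
    var recycleBin: [String: (path: String, deletedAt: Date)] = [:]
    var clipLinks: [String: String] = [:]        // AudioClip id -> Filepath id

    mutating func deleteFilepath(_ id: String, at now: Date) {
        // Assumption: a link the deleting device never saw keeps the object alive.
        if clipLinks.values.contains(id) { return }
        guard let path = filepaths.removeValue(forKey: id) else { return }
        // Park the object instead of vaporizing it.
        recycleBin[id] = (path: path, deletedAt: now)
    }

    mutating func addClip(_ clipID: String, linkingTo id: String, at now: Date) {
        // A link to a recently deleted object pulls it back out of the bin.
        if filepaths[id] == nil, let doomed = recycleBin[id],
           now.timeIntervalSince(doomed.deletedAt) <= retention {
            filepaths[id] = doomed.path
            recycleBin[id] = nil
        }
        if filepaths[id] != nil { clipLinks[clipID] = id }
    }

    // Drops binned objects once the retention window has passed.
    mutating func purge(at now: Date) {
        recycleBin = recycleBin.filter { now.timeIntervalSince($0.value.deletedAt) <= retention }
    }
}

let t0 = Date()
var server = SyncServer(retention: 3600, filepaths: ["fp1": "/audio/shared/path"])
server.deleteFilepath("fp1", at: t0)                                      // first user's delete
server.addClip("clipB", linkingTo: "fp1", at: t0.addingTimeInterval(60))  // second user's link
// fp1 is live again and clipB's link is intact.
```

In this model both arrival orders end in the same state: the Filepath survives and the new clip keeps its link, which is the end state argued for above.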
Alternative
Look, I get it: sync is basically the hardest problem there is. And the Four Laws probably exist as they do for a reason. But this shared-reference pattern is very common and right now there's a giant pit waiting to snare people.
At the very least, if nothing can be done to make sync handle this better, this page should be updated with an example/warning. The current example of "delete vs. modify" is a very trivial one.