[SERVER-58272] Change Streams for complex nested fields Created: 05/Jul/21  Updated: 27/Oct/23  Resolved: 08/Aug/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Stan Yeshchenko Assignee: Bernard Gorman
Resolution: Works as Designed Votes: 0
Labels: change-streams-improvements
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-41559 Can not fetch changed array elements ... Backlog
related to SERVER-36941 Option to provide "before image" with... Closed
is related to SERVER-47140 Use diffing for full document replace... Backlog
Sprint: Query Execution 2021-07-26, QE 2021-08-09, QE 2021-08-23
Participants:

 Description   

Is there a way to determine exactly what was changed in a nested field of a document?

One of the fields in my document is "branding" (shown below). If a single field under "branding" changes, the change stream returns the entire "branding" object, and there is no way to know exactly which sub-field was affected.

"branding": {
        "control_panel": {
            "primaryFont": {
                "name": "c713b99e-40d4-4d31-629a-6def5f7963c6.ttf",
                "displayName": "Arial",
                "fileType": "tff"
            },
            "secondaryFont": {
                "name": "c713b99e-40d4-4d31-629a-6def5f7963c6.ttf",
                "displayName": "Arial",
                "fileType": "tff"
            },
            "companyLogo": {
                "name": "18c0c0ef-5d4c-4a3e-b16c-a8f640c3e2be.png",
                "displayName": "1200px-Logo_NIKE.svg",
                "fileType": "png"
            },
            "backgroundImage": {
                "name": "8d1cc393-0132-4a3f-9b2a-d5a9a68bd4e4.jpg",
                "displayName": "s23-slams_original-web",
                "fileType": "jpg"
            },
            "primaryColor": "ffffff",
            "secondaryColor": "ffffff",
            "primaryFontWeight": "bold",
            "secondaryFontWeight": "regular",
            "publishSchedule": "AUTOMATIC",
            "publishScheduleStatus": "UNPUBLISHED",
            "videoRecording": "DEFAULT",
            "videoRecordingLink": null,
            "hideVideoRecording": false,
            "hideMeetingReplay": false
        },
        "registration_page": {
            "openRegistration": "15",
            "proxyWebsiteLink": null,
            "termsAndConditionsUrl": null,
            "title": "New Event - test copy",
            "description": ""
        },
        "pre_event_page": {
            "message": "",
            "lobbyMessage": "",
            "musicOption": "DEFAULT"
        },
        "thank_you_page": {
            "title": "test event",
            "description": ""
        },
        "post_event_page": {
            "meetingDetails": ""
        }
    },

This makes change streams useless unless I apply custom business logic: store every version of the document and then perform a JSON diff myself.

 

Is there a better way? 



 Comments   
Comment by Bernard Gorman [ 08/Aug/21 ]

Hi stany@q4inc.com,

I believe the behaviour you're seeing is due to the difference between an update and a replace operation. MongoDB has two forms of update: a delta update and a full-document replacement.

Replacements look like this:

db.coll.update({query for matching document}, {full replacement document})
OR
db.coll.replaceOne({query for matching document}, {full replacement document})

Normal (delta) updates look like this:

db.coll.update({query for matching document}, {$set: {individualField: newValue}})

If your application is performing full replacements, then the complete replacement document will be recorded in the oplog. Since change streams simply report what they see in the oplog, this will produce a change stream event with {operationType: "replace"} and a complete copy of the new document.

If, however, you do a normal update, then only the fields that were changed will be reported; you will receive a change stream event with {operationType: "update"} and a field called updateDescription which reports the fields that were modified by the operation. Please see the Change Events page of the MongoDB manual for details of the update and replace events, and the updateDescription field.
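As a rough illustration of the distinction above, a consumer could branch on operationType before looking for updateDescription. The following is only a sketch in plain JavaScript: describeChange and the shape of its return value are hypothetical names chosen for this example, not part of any MongoDB API, and the event objects are assumed to have the fields described above.

```javascript
// Hypothetical helper (not a MongoDB API): summarizes a change stream event.
// Assumes "update" events carry updateDescription and "replace" events do not.
function describeChange(event) {
  if (event.operationType === "update") {
    const desc = event.updateDescription || {};
    return {
      kind: "delta",
      // Dotted paths of the fields that were set by the update.
      changedPaths: Object.keys(desc.updatedFields || {}),
      // Dotted paths of the fields that were removed by the update.
      removedPaths: desc.removedFields || [],
    };
  }
  if (event.operationType === "replace") {
    // A replacement reports only the new document, so no per-field
    // information can be derived from the event alone.
    return { kind: "replacement", changedPaths: null, removedPaths: null };
  }
  return { kind: "other", changedPaths: null, removedPaths: null };
}
```

With this sketch, a "replacement" result is the signal that the writer is issuing full-document replacements rather than delta updates.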

For instance, if I insert the sample document you provided above and then run the following update:

db.testing.updateOne({}, {$set: {"branding.control_panel.companyLogo.displayName": "1200px-Logo_NIKE_updated.svg"}})

... then I receive the following change event (abbreviated for clarity):

{
	"_id" : ...,
	"operationType" : "update",
	"clusterTime" : ...,
	"ns" : ...,
	"documentKey" : ...,
	"updateDescription" : {
		"updatedFields" : {
			"branding.control_panel.companyLogo.displayName" : "1200px-Logo_NIKE_updated.svg"
		},
		"removedFields" : [ ]
	}
}
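Because updatedFields is keyed by dotted path, the affected sub-field can be read straight off the event. Here is a minimal sketch in plain JavaScript of filtering those paths by a prefix such as "branding.control_panel"; changedUnder is a hypothetical helper name, and the sample event is trimmed down to the fields shown above.

```javascript
// Hypothetical helper: lists the paths in an "update" event that touch `prefix`.
// A path touches the prefix if it is the prefix itself, lies underneath it,
// or is an ancestor of it (e.g. a $set of the whole "branding" object).
function changedUnder(event, prefix) {
  const desc = event.updateDescription || {};
  const touches = (path) =>
    path === prefix ||
    path.startsWith(prefix + ".") ||
    prefix.startsWith(path + ".");
  return {
    updated: Object.keys(desc.updatedFields || {}).filter(touches),
    removed: (desc.removedFields || []).filter(touches),
  };
}

// Sample event, abbreviated from the one above.
const sampleEvent = {
  operationType: "update",
  updateDescription: {
    updatedFields: {
      "branding.control_panel.companyLogo.displayName": "1200px-Logo_NIKE_updated.svg",
    },
    removedFields: [],
  },
};

const logoChanges = changedUnder(sampleEvent, "branding.control_panel.companyLogo");
const registrationChanges = changedUnder(sampleEvent, "branding.registration_page");
```

In this sketch, logoChanges.updated contains the single displayName path, while registrationChanges.updated is empty because nothing under "branding.registration_page" was modified.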

One other thing to be aware of: MongoDB 4.2 introduced a new type of update, called an expressive or "pipeline" update. These look like the following:

db.coll.update({query for matching document}, [{$set: {individualField: "$$NOW"}}])

Note that this is very similar to the normal update, except that this type encloses the $set operation in a pair of brackets [] to indicate that this is a pipeline, which allows variables like $$NOW and other expressions to be used in the update. In versions 4.2 and 4.4, pipeline updates will ALWAYS produce a full-document replacement operation in the oplog. If your application is using pipeline-style updates, this would explain the behaviour you're seeing. In version 5.0 of MongoDB, we have added the ability for pipeline updates to generate a diff in the oplog rather than a full replacement, so change streams will begin reporting {operationType: "update"} and updateDescription for these operations.
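Where replace events cannot be avoided (for example, pipeline updates on 4.2 or 4.4), the per-field changes would have to be computed on the client, as the description suggests. The following is a minimal sketch of such a diff in plain JavaScript, assuming the application has kept its own copy of the previous version of the document, since the replace event carries only the new one. diffPaths and isPlainObject are hypothetical helper names, arrays are compared as whole values, and the output only imitates the shape of updateDescription.

```javascript
// True only for plain {...} objects; arrays, null, Dates etc. are treated as leaf values.
function isPlainObject(value) {
  return value !== null && typeof value === "object" &&
    Object.getPrototypeOf(value) === Object.prototype;
}

const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Hypothetical helper: compares two versions of a document and reports
// dotted paths in a shape that imitates updateDescription.
function diffPaths(before, after, prefix = "", out = { updatedFields: {}, removedFields: [] }) {
  for (const key of Object.keys(after)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (!has(before, key)) {
      out.updatedFields[path] = after[key];            // newly added field
    } else if (isPlainObject(before[key]) && isPlainObject(after[key])) {
      diffPaths(before[key], after[key], path, out);   // recurse into subdocuments
    } else if (JSON.stringify(before[key]) !== JSON.stringify(after[key])) {
      out.updatedFields[path] = after[key];            // changed leaf value or array
    }
  }
  for (const key of Object.keys(before)) {
    if (!has(after, key)) {
      out.removedFields.push(prefix ? `${prefix}.${key}` : key);
    }
  }
  return out;
}
```

This is deliberately simple: it relies on JSON.stringify for leaf comparison, so it is only suitable for JSON-like values and does not attempt element-level diffs inside arrays.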

Hope this helps!

Best regards,
Bernard

Comment by Eric Sedor [ 12/Jul/21 ]

Thanks stany@q4inc.com,

I'll pass this on to an appropriate team to consider. This request is very similar to SERVER-41559, but SERVER-41559 seems more focused on the contents of arrays, whereas this ticket is focused on subdocuments. It may be that these requests end up getting merged, but I will defer that decision to more knowledgeable engineers.

Gratefully,
Eric

Generated at Thu Feb 08 05:44:04 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.