[SERVER-56074] Add deleteLookup support on Change Streams Created: 13/Apr/21  Updated: 06/Dec/22  Resolved: 27/Apr/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor - P4
Reporter: Allan Zimmermann Assignee: Backlog - Query Execution
Resolution: Duplicate Votes: 0
Labels: feature
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-36941 Option to provide "before image" with... Closed
Assigned Teams:
Query Execution
Participants:

 Description   

Change Streams is a fantastic feature. As you know, the pipeline does not really work for update events unless you also add the { fullDocument: 'updateLookup' } option when creating the watch.
The same is true for deletes, though. Could you also add a { fullDocument: 'deleteLookup' } option to watches?

 



 Comments   
Comment by Katya Kamenieva [ 27/Apr/21 ]

Hi allan@openrpa.dk, I think your use case falls under the feature request SERVER-36941, so I'll close this ticket as a duplicate.

I'd like to point out that when you use `full_document="updateLookup"` it embeds the current state of the document at the moment you read the change stream event. 

Example (in mongo shell): 

db.c.insert({a:'a', b: 'b' })
w = db.c.watch([],{ fullDocument : "updateLookup" })
db.c.update({a:'a'}, {$set: {b: 'b1'}})
db.c.update({a:'a'}, {$set: {b: 'b2'}})
db.c.update({a:'a'}, {$set: {b: 'b3'}})
w.pretty()

The very first event where b changed from 'b' to 'b1' will have b: 'b3' in the 'fullDocument', as it is its current value.

{
	"_id" : {...},
	"operationType" : "update",
	"clusterTime" : Timestamp(1619536213, 1),
	"fullDocument" : {
		"_id" : ...,
		"a" : "a",
		"b" : "b3"
	},
	"ns" : { "db" : "test", "coll" : "c"},
	"documentKey" : {...},
	"updateDescription" : {
		"updatedFields" : {
			"b" : "b2"
		},
		"removedFields" : [ ]
	}
}

 So if you are using 

[{ "$match": { "fullDocument._type": "test" } }] 

you are assuming that '_type' never changes. If it does, you will miss the change events that happened before it changed, because their 'fullDocument' already reflects the new value.
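
The effect described above can be reproduced with a toy in-memory model. This is only an illustrative sketch, not MongoDB code: the names get_path, read_events and apply_update are made up for the example, and the "lookup" is modelled as reading the document's current state when the events are consumed.

```python
# Toy in-memory model (an assumption for illustration, not MongoDB internals):
# the lookup reads the document's state when the event is consumed,
# not the state it had when the change was recorded.

collection = {1: {"_id": 1, "_type": "test", "b": "b"}}
oplog = []


def apply_update(doc_id, fields):
    """Hypothetical helper: apply a $set-style update and record it."""
    collection[doc_id].update(fields)
    oplog.append({"_id": doc_id, "set": dict(fields)})


def get_path(doc, dotted):
    """Resolve a dotted path such as 'fullDocument._type'; None if missing."""
    for key in dotted.split("."):
        if not isinstance(doc, dict) or key not in doc:
            return None
        doc = doc[key]
    return doc


def read_events(log, coll, match_path, match_value):
    """Attach the current document to each logged update, then filter on it."""
    matched = []
    for entry in log:
        event = {
            "operationType": "update",
            "documentKey": {"_id": entry["_id"]},
            "updateDescription": {"updatedFields": entry["set"]},
            # Looked up at read time, so every event sees the latest state.
            "fullDocument": dict(coll[entry["_id"]]),
        }
        if get_path(event, match_path) == match_value:
            matched.append(event)
    return matched


apply_update(1, {"b": "b1"})
apply_update(1, {"b": "b2"})
apply_update(1, {"_type": "other"})  # the filtered field changes last

# All three updates are read after '_type' changed, so none of them match.
events = read_events(oplog, collection, "fullDocument._type", "test")
print(len(events))
```

In this model the two earlier updates to b are lost to the filter even though '_type' was still "test" when they happened.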

Comment by Edwin Zhou [ 20/Apr/21 ]

Hi allan@openrpa.dk,

Thank you for confirming this use case. We're assigning this ticket to the appropriate team to be evaluated against our currently planned work. Updates will be posted on this ticket as they happen.

Best,
Edwin

Comment by Allan Zimmermann [ 20/Apr/21 ]

Yes, spot on. 

Comment by Edwin Zhou [ 20/Apr/21 ]

Hi allan@openrpa.dk,

Thanks for your clarification.

For your given pipeline to be able to match the output document of the change event, it will need the fullDocument field, since the pipeline matches on the fullDocument field.

I'm currently interpreting:

adding some option when calling watch would allow the pipeline to work for all types

as an option that would add the fullDocument field to all change events, so we can introduce a pipeline that will match on the contents of the deleted document before it's been deleted.

Consider the following use case of change streams as implemented in Python:

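import pymongo

# Change streams require a replica set or sharded cluster.
uri = "mongodb://127.0.0.1:27017/test"
mongo = pymongo.MongoClient(uri)
db = mongo["test"]
coll = db["test_col"]
coll.insert_one({"a": 1})
# Only pass events whose fullDocument has a == 1.
pipeline = [
  {'$match': {'fullDocument.a': 1 }}
]
cursor = coll.watch(pipeline=pipeline, full_document="updateLookup")
# Blocks until an event matching the pipeline arrives.
document = next(cursor)
print(document)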

where running the command

coll.delete_many({"a": 1})

currently doesn't print out the delete event. Your improvement suggests that it should return and print an output document.

Does this accurately represent your use case?

Best,
Edwin
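
The delete in the example above is filtered out because a delete event has no fullDocument for the $match to inspect. One possible way to at least receive deletes is to let them through by operation type. This is a sketch only: build_pipeline is a hypothetical helper, and the resulting pipeline passes every delete in the collection, not just deletes of documents that matched.

```python
def build_pipeline(field, value):
    """Match events whose fullDocument.<field> equals value, plus every delete."""
    return [
        {
            "$match": {
                "$or": [
                    {f"fullDocument.{field}": value},
                    # Delete events carry no fullDocument, so they are let
                    # through by type and must be filtered by the client.
                    {"operationType": "delete"},
                ]
            }
        }
    ]


pipeline = build_pipeline("a", 1)
print(pipeline)

# Assumed usage with the collection from the example above:
# cursor = coll.watch(pipeline=pipeline, full_document="updateLookup")
```

The trade-off is extra traffic: the consumer sees all deletes and has to decide for itself which ones are relevant.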

Comment by Allan Zimmermann [ 20/Apr/21 ]

For me it's twofold.

"Need to have" would be the ability to allow change stream filters to work for deletes as well.
So if I have the following pipeline:

 

[{ "$match":
	{ "fullDocument._type": "test" }
}] 

This works for insert and replace, but not for updates or deletes.
If I add 

 

{ fullDocument: 'updateLookup' }

it also works for updates, but not for deletes.
If adding some option when calling watch would allow the pipeline to work for all types, I'm a happy camper.

"Nice to have" would be to also have the deleted document in next.fullDocument if I add the mentioned option. In my specific case, I can live with only having the _id, but if I could get the full document that was deleted, that would be nice too.

Comment by Edwin Zhou [ 20/Apr/21 ]

Hi allan@openrpa.dk,

To clarify your feature suggestion's functionality, would you like to see fullDocument as a change stream output document field when a document is deleted?

In our documentation for change events, we explicitly state the output document of a delete event is omitted as the document no longer exists. Would you like to see the fullDocument field include the document before it is deleted?

For example:

{
   _id: { < Resume Token > },
   operationType: 'delete',
   clusterTime: <Timestamp>,
   ns: {
      db: 'engineering',
      coll: 'users'
   },
   documentKey: {
      _id: ObjectId("599af247bb69cd89961c986d")
   },
   fullDocument: { <Document> }
}

Best,
Edwin
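
Since a delete event currently carries only documentKey, a consumer that wants to filter deletes by document content has to remember that content itself. A minimal client-side sketch follows, assuming the stream delivers delete events at all (that is, the pipeline does not filter them out). MatchTracker is a hypothetical class, not part of any driver.

```python
class MatchTracker:
    """Remember which _id values matched, so later deletes can be filtered."""

    def __init__(self, field, value):
        self.field = field
        self.value = value
        self.matching_ids = set()

    def accept(self, event):
        """Return True if the event concerns a document that matched."""
        doc_id = event["documentKey"]["_id"]
        if event["operationType"] == "delete":
            # No fullDocument on deletes: fall back to what was seen earlier.
            if doc_id in self.matching_ids:
                self.matching_ids.discard(doc_id)
                return True
            return False
        full = event.get("fullDocument") or {}
        if full.get(self.field) == self.value:
            self.matching_ids.add(doc_id)
            return True
        # The document no longer matches (or never did): forget it.
        self.matching_ids.discard(doc_id)
        return False


# Hand-written events in the documented shape, for illustration only.
events = [
    {"operationType": "insert", "documentKey": {"_id": 1},
     "fullDocument": {"_id": 1, "_type": "test"}},
    {"operationType": "insert", "documentKey": {"_id": 2},
     "fullDocument": {"_id": 2, "_type": "other"}},
    {"operationType": "delete", "documentKey": {"_id": 2}},
    {"operationType": "delete", "documentKey": {"_id": 1}},
]

tracker = MatchTracker("_type", "test")
accepted = [e for e in events if tracker.accept(e)]
print([(e["operationType"], e["documentKey"]["_id"]) for e in accepted])
```

The tracker only knows documents it has seen on the stream, so documents that existed before the watch was opened would need to be seeded separately, for example from an initial query.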

Comment by Allan Zimmermann [ 14/Apr/21 ]

I guess that would become

{ fullDocument: 'update_and_deleteLookup' }

or something similar, to get the full document for both.

Generated at Thu Feb 08 05:38:15 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.