[SERVER-41559] Can not fetch changed array elements from change streams Created: 06/Jun/19  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Change streams
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Han Assignee: Backlog - Query Execution
Resolution: Unresolved Votes: 0
Labels: change-streams-improvements, pm1950-m6
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
backports JAVA-3228 Watch with project pipeline didn't wo... Closed
Related
is related to SERVER-36941 Option to provide "before image" with... Closed
is related to SERVER-58272 Change Streams for complex nested fields Closed
Assigned Teams:
Query Execution
Participants:
Case:

 Description   

The outputs for each step are shown below. I need to act on the array item that was deleted, but the change event does not seem to expose it.

1) First, here is the initial data in the database.

   There are 3 outer array items ("test-0", "test-1", "test-2"), each with 3 inner array items ("nest-test-0", "nest-test-1", "nest-test-2"):

 

{
    "_id": "5cf7553cc6b365a5c72f2163",
    "opendaylight-mdsal-binding-test:top": {
        "top-level-list": [
            {
              "name": "test-0",
              "nested-list": [
                  {"name": "nest-test-0", "type": "nest-type-0"},
                  {"name": "nest-test-1", "type": "nest-type-1"},
                  {"name": "nest-test-2", "type": "nest-type-2"}
              ],
              "simple": "simple-case"
            },
            {
              "name": "test-1",
              "nested-list": [
                  {"name": "nest-test-0", "type": "nest-type-0"},
                  {"name": "nest-test-1", "type": "nest-type-1"},
                  {"name": "nest-test-2", "type": "nest-type-2"}
              ],
              "simple": "simple-case"
            },
            {
              "name": "test-2",
              "nested-list": [
                  {"name": "nest-test-0", "type": "nest-type-0"},
                  {"name": "nest-test-1", "type": "nest-type-1"},
                  {"name": "nest-test-2", "type": "nest-type-2"}
              ],
              "simple": "simple-case"
            }
        ]
    }
}

2) Next, I pull one nested array item by calling 'collection.updateOne(...)' with the following 'Update' and 'UpdateOptions':

Update{fieldName='opendaylight-mdsal-binding-test:top.top-level-list.$[item0].nested-list', operator='$pull', value=Document{{name=nest-test-2}}}
UpdateOptions{upsert=true, bypassDocumentValidation=null, collation=null, arrayFilters=[And Filter{filters=[Filter{fieldName='item0.name', value=test-0}]}]}

As shown above, '$[item0]' is bound by 'Filter{fieldName='item0.name', value=test-0}', so the update deletes the inner array item 'nest-test-2' from the outer array item 'test-0'.
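
The intended effect of this update can be sketched in plain Python. This is only an illustration of the semantics described above, not the server's implementation: the helper names `matches`, `pull_nested`, and `sample_document` are hypothetical, and only simple equality conditions (as used in this ticket) are handled.

```python
import copy

TOP = "opendaylight-mdsal-binding-test:top"


def matches(element, condition):
    """True if element carries every key/value pair in condition (equality only)."""
    return all(element.get(key) == value for key, value in condition.items())


def pull_nested(doc, outer_filter, pull_condition):
    """Mimic '$pull' on 'top-level-list.$[item0].nested-list'.

    Returns (new_doc, pulled); 'pulled' lists (outer_index, element) pairs.
    The input document is not mutated.
    """
    new_doc = copy.deepcopy(doc)
    pulled = []
    for index, item in enumerate(new_doc[TOP]["top-level-list"]):
        if not matches(item, outer_filter):
            continue  # the array filter did not select this outer element
        kept = []
        for inner in item["nested-list"]:
            if matches(inner, pull_condition):
                pulled.append((index, inner))
            else:
                kept.append(inner)
        item["nested-list"] = kept
    return new_doc, pulled


def sample_document():
    """Rebuild the initial data from step 1 (without the _id)."""
    nested = [{"name": f"nest-test-{i}", "type": f"nest-type-{i}"} for i in range(3)]
    return {
        TOP: {
            "top-level-list": [
                {"name": f"test-{i}",
                 "nested-list": copy.deepcopy(nested),
                 "simple": "simple-case"}
                for i in range(3)
            ]
        }
    }


after, pulled = pull_nested(
    sample_document(), {"name": "test-0"}, {"name": "nest-test-2"}
)
```

Here `pulled` holds exactly the information the change event in step 3 does not report: the removed element and the index of the outer element it was removed from.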

3) After step 2), I finally receive the change event:

ChangeStreamDocument {
    resumeToken = {
        "_data":
            "825CF75C47000000022B022C0100296E5A10042A6A4D3FE1F64145A8D27037F0BA45BD46645F696400645CF75C47D4CE6723E929EA0B0004"
    },
    namespace = configuration.urn:opendaylight:params:xml:ns:yang:mdsal:test:binding@2014-07-01,
    fullDocument = Document{
        {
          _id = 5cf75c47d4ce6723e929ea0b,
          opendaylight-mdsal-binding-test:top = Document{
                  {top-level-list =
                       [
                         Document{
                             {name = test-0,
                              nested-list =
                                  [
                                    Document{{name = nest-test-0, type = nest-type-0}},
                                    Document{{name = nest-test-1, type = nest-type-1}}
                                  ],
                              simple = simple-case}
                         },
                         Document{
                             {name = test-1,
                              nested-list =
                                  [
                                    Document{{name = nest-test-0, type = nest-type-0}},
                                    Document{{name = nest-test-1, type = nest-type-1}},
                                    Document{{name = nest-test-2, type = nest-type-2}}
                                  ],
                              simple = simple-case}
                         },
                         Document{
                             {name = test-2,
                              nested-list =
                                  [
                                    Document{{name = nest-test-0, type = nest-type-0}},
                                    Document{{name = nest-test-1, type = nest-type-1}},
                                    Document{{name = nest-test-2, type = nest-type-2}}
                                  ],
                              simple = simple-case}
                         }
                       ]}
              }
        }
    },
    documentKey = {"_id": {"$oid": "5cf75c47d4ce6723e929ea0b"}},
    clusterTime = Timestamp{value = 6698924430749335554, seconds = 1559714887, inc = 2},
    operationType = OperationType{value = 'update'}, updateDescription = UpdateDescription {
        removedFields = [], updatedFields = {
            "opendaylight-mdsal-binding-test:top.top-level-list.0.nested-list": [
                {"name": "nest-test-0", "type": "nest-type-0"},
                {"name": "nest-test-1", "type": "nest-type-1"}
            ]
        }
    }
}

From the output we can see that the operation type is 'update' and 'updatedFields' is:

{"opendaylight-mdsal-binding-test:top.top-level-list.0.nested-list": [{"name": "nest-test-0", "type": "nest-type-0"}, {"name": "nest-test-1", "type": "nest-type-1"}]}
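
The key of 'updatedFields' is a dotted path whose numeric segment ("0") is the index of the outer array element that changed. A listener could use it to look up that outer element in 'fullDocument'. The following is a minimal sketch under stated assumptions: `resolve_path` is a hypothetical helper, the document is abbreviated to the one outer element that changed, and no field name is assumed to contain a dot (true for this data).

```python
def resolve_path(document, dotted_path):
    """Walk a dotted path, treating segments as list indexes inside arrays."""
    node = document
    for segment in dotted_path.split("."):
        if isinstance(node, list):
            node = node[int(segment)]  # numeric segment selects an array element
        else:
            node = node[segment]
    return node


# Abbreviated post-update document: only the outer element that changed.
full_document = {
    "opendaylight-mdsal-binding-test:top": {
        "top-level-list": [
            {
                "name": "test-0",
                "nested-list": [
                    {"name": "nest-test-0", "type": "nest-type-0"},
                    {"name": "nest-test-1", "type": "nest-type-1"},
                ],
                "simple": "simple-case",
            }
        ]
    }
}

changed_path = "opendaylight-mdsal-binding-test:top.top-level-list.0.nested-list"

# Dropping the last segment gives the outer element that owns the array.
parent_path = changed_path.rsplit(".", 1)[0]
outer_name = resolve_path(full_document, parent_path)["name"]
current = resolve_path(full_document, changed_path)
```

This recovers "from where" (the outer element named by `outer_name`), but not "what": the removed element itself is absent from both 'updatedFields' and 'fullDocument'.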

Note that 'updatedFields' above only provides the data after the '$pull'. What I really need is detailed information about the deleted data, i.e. what was deleted and from where. An ideal output would look something like:

{"opendaylight-mdsal-binding-test:top.top-level-list.0.nested-list": [{
"array-filter": {"0": {"name": "test-0"}},
"pulled": {"name": "nest-test-2", "type": "nest-type-2"},
"current": [{"name": "nest-test-0", "type": "nest-type-0"}, {"name": "nest-test-1", "type": "nest-type-1"}]
}]}

This way, with the 'array-filter' and 'pulled' fields, I could get the deleted data and notify listeners just like 'DataTreeChangeService' does:

https://github.com/opendaylight/alt-datastores/blob/master/mongodb/yongo/src/main/java/org/opendaylight/datastore/yongo/impl/YongoStream.java#L155

 

 

 



 Comments   
Comment by Eric Sedor [ 10/Jun/19 ]

Thanks for the additional information, jiehan2019; I'm passing this on to an appropriate team for consideration. Please watch this ticket for updates.

Comment by Han [ 10/Jun/19 ]

Hi @Eric,
The full contents of the original array are required too, but that is not enough: we need to know exactly what has been removed, because we want to clean up the corresponding instance or process. With only the full original contents, recovering the removed data through a merge/compare would be a disaster for us, if it is possible at all.
In OpenDaylight, there are APIs like 'getDataBefore' and 'getDataAfter' that return the removed elements/keys as well as the contents before and after a CRUD operation in a data tree change event, see:
DataTreeCandidateNode
So the requirements should include at least:
1) the path of the modified/updated field
2) the contents of the modified/updated field before and after the change (if 1) is satisfied and the server provides the full original contents, we could use the path to extract this through a projection)

So, for the example above:
1. the path for the deletion could be:

"opendaylight-mdsal-binding-test:top": {"top-level-list": [{"name": "test-0"[{"name": "nest-test-2"}

2. result of 'getDataBefore':

{"name": "nest-test-2", "type": "nest-type-2"}

3. result of 'getDataAfter':

null
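
For illustration, the compare step discussed in this comment might look like the sketch below. It assumes the listener somehow has the array contents from before the update (for example from its own cache, or from a before image if the server provides one); nothing here is an existing server or driver feature. The names `removed_elements` and `describe_removals`, and the keys `path`, `dataBefore`, and `dataAfter`, are hypothetical.

```python
def removed_elements(before, after):
    """Elements of 'before' that are missing from 'after'.

    Duplicates are respected: each element of 'after' can account for
    only one equal element of 'before'.
    """
    remaining = list(after)
    removed = []
    for element in before:
        if element in remaining:
            remaining.remove(element)
        else:
            removed.append(element)
    return removed


def describe_removals(path, before, after):
    """One record per removed element, shaped like getDataBefore/getDataAfter."""
    return [
        {"path": path, "dataBefore": element, "dataAfter": None}
        for element in removed_elements(before, after)
    ]


before = [
    {"name": "nest-test-0", "type": "nest-type-0"},
    {"name": "nest-test-1", "type": "nest-type-1"},
    {"name": "nest-test-2", "type": "nest-type-2"},
]
# The array as reported in 'updatedFields' after the '$pull'.
after = before[:2]

events = describe_removals(
    "opendaylight-mdsal-binding-test:top.top-level-list.0.nested-list",
    before,
    after,
)
```

The comparison is quadratic in the array length and runs on every update event, which is part of why the comment asks the server to report the removed elements directly.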

Comment by Eric Sedor [ 06/Jun/19 ]

Hi jiehan2019;

Some of what you are asking for would be provided with SERVER-36941. But it sounds like you are specifically requesting that removed elements of an array be indicated, and are not asking for the contents of the array before the update. Is that right?

Would the full contents of the original array satisfy your requirements?

Generated at Thu Feb 08 04:58:04 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.