[SERVER-15726] Improve replication of $slice commands Created: 20/Oct/14  Updated: 26/Nov/18  Resolved: 26/Nov/18

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.6.5
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Nikolay Grebnev Assignee: Asya Kamsky
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

We use $slice to cap the size of each user's activity feed on our website.
The limit is currently 10,000 entries, and each entry stores a fair amount of data, so the total size is approximately 1,350 KB. When running a replica set, we ran into heavy traffic to the secondary nodes. Log analysis showed that whenever $slice is used, the entire object is sent over the Internet.

2014-10-06T10:12:33.796+0000 [repl writer worker 2] warning: log line attempted (1349k) over max size (10k), printing beginning and end ... applying op: { ts: Timestamp 1412590353000|6, h: -4609969471965806062, v: 2, op: "u", ns: "dm_social.feed_activity", o2: { _id: 1 }, o: { $set: { log2: [ { time: "1410172122", user_id: 100850514, name: "NNNN", sex: 2, username: "praisss", age: 26, country: "NNNN", city: "NNNN" }, { time: "1410172123", user_id: 100918283, name: "NNNN", sex: 2, username: "lydok0708", age: 35, country: "NNNN", city: "NNNN" }, { time: "1410172123", user_id: 101119576, name: "NNNN", sex: 2, username: "delfin51", age: 59, country: "NNNN", city: "NNNN" }, { time: "1410172129", user_id: 100908179, name: "NNNN", sex: 2, username: "lav13", age: 54, country: "NNNN", city: "NNNN" }, 

I would like to ask for $slice to be improved so that, when it is used, replication traffic is the same as when it is not used.
For now we have worked around the issue by applying $slice with a probability of 1/100.
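
A minimal sketch of the workaround of applying $slice on roughly 1 in 100 updates, assuming the update document is built in application code before being passed to the driver. The helper name `feed_update` and the parameters `max_entries`, `trim_probability` and `rng` are hypothetical and not part of any driver API.

```ruby
# Hypothetical helper: builds the update document for one new feed entry.
# `max_entries`, `trim_probability` and `rng` are illustrative names only.
def feed_update(field, entry, max_entries: 10_000, trim_probability: 0.01, rng: Random.new)
  push_spec = { '$each' => [entry] }
  # Attach $slice only occasionally, so that most updates are plain pushes.
  push_spec['$slice'] = -max_entries if rng.rand < trim_probability
  { '$push' => { field => push_spec } }
end

update_doc = feed_update('log2', { 'time' => '1410172122', 'user_id' => 100850514 })
```

The trade-off is that the array can temporarily grow past the limit between trims, so readers of the feed may need to truncate it themselves.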



 Comments   
Comment by Asya Kamsky [ 26/Nov/18 ]

As long as our replication relies on all operations being recorded in idempotent format, this cannot be changed.

 

Comment by Asya Kamsky [ 26/Nov/18 ]

The issue is that you are using $slice with a negative offset, which means we keep the last N elements of the array after the $push.

When that does not change the number of elements in the array, we detect it as a "noop". However, when it causes elements at the beginning of the array to be removed, each remaining element in the array changes position! So we have to record the new position of each element to make the operation idempotent, and therefore we cannot represent this operation in any other way.
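
The position shift and the idempotency requirement can be illustrated with a small in-memory model. This is only a sketch in plain Ruby, not MongoDB code: `push_with_slice` and `set_array` are hypothetical names that mimic the array semantics described above.

```ruby
# Hypothetical in-memory model of $push with $each and $slice.
# It only mirrors the array semantics; it does not talk to a server.
def push_with_slice(array, new_items, slice)
  combined = array + new_items
  # A negative slice keeps the last |slice| elements.
  slice < 0 ? combined.last(-slice) : combined.first(slice)
end

before = %w[a b c]
after = push_with_slice(before, %w[d], -3)

# Indexes whose content changed: once the head is dropped, every element shifts.
moved = after.each_index.select { |i| before[i] != after[i] }

# Replaying the logical push a second time gives a different array ...
replayed_push = push_with_slice(after, %w[d], -3)

# ... while replaying "set the whole array to this value" does not.
set_array = ->(_current, value) { value.dup }
replayed_set = set_array.call(set_array.call(before, after), after)
```

In this model, `replayed_push` differs from `after` while `replayed_set` equals it, which is why the full array has to be recorded when the slice actually removes elements.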

 

Comment by Ramon Fernandez Marina [ 18/Mar/15 ]

ngrebnev, this is the current, expected behavior for $slice. I'm going to re-purpose this ticket as an improvement request as per your initial suggestion on the ticket.

Regards,
Ramón.

Comment by Ramon Fernandez Marina [ 10/Jan/15 ]

Hi ngrebnev, apologies for the long delay. I do see the behavior you describe, and I'm investigating whether it is expected and, if it is, whether there's room for improvement.

It's easy to see how a $push operation would only need to send the pushed elements to the secondaries, but a $slice operation may change an array significantly, so it could be that the only way to replicate such an operation is to send the whole array to the secondaries.

This being said, having very large arrays in documents is often not a good idea, and if you're keeping logs of some kind you may want to investigate the use of capped collections.

Comment by Nikolay Grebnev [ 20/Oct/14 ]

$mongo['feed_activity'].update(
  {'_id'=>1},
  {'$push'=>{log_name=>{'$each'=>[info_hash],'$slice'=>-10000}}},
  {:upsert=>true})
