[SERVER-45812] oplog contains a lot of no-ops Created: 27/Jan/20 Updated: 27/Oct/23 Resolved: 03/Feb/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Question | Priority: | Major - P3 |
| Reporter: | Maksim Kondratyuk | Assignee: | Andy Schwerin |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Participants: |
| Description |
|
Good evening, I am investigating a reduction in our production oplog duration (covered hours) and found that we have a lot of "no-ops" inside. Sometimes their number closely matches the number of updates. We are using a replica set and NodeJS ( "mongoose": "^5.7.12" -> "mongodb": "3.4.1"). This is the result for the first collection:
second collection:
but in general I would say ~90% of collections are affected, based on the data. Examples of two oplog entries are below; first doc:
second doc:
Some useful info: it would be nice to understand what the real issue is here, because the oplog duration has dropped to a very low number. |
| Comments |
| Comment by Andy Schwerin [ 03/Feb/20 ] |
|
Sorry. I re-read your description, and since you're not using sharding, there is no chunk migration to worry about. Retryable inserts should not generate any "n" oplog entries; only updates and deletes do. That's consistent with the count of "n" entries compared to "i" and "d" entries in the collections for which you reported statistics. "n" entries can also be generated in a different form on certain other occasions, so having "n" somewhat greater than the sum of "u" and "d" is not surprising. I'm going to close this ticket, but you are welcome to re-open it by adding further comments. |
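As a rough illustration of the bookkeeping described above, here is a standalone Node.js sketch (the sample entries are invented for illustration; in practice they would be read from local.oplog.rs) that tallies oplog entry types per namespace and compares the "n" count against "u" + "d":

```javascript
// Standalone sketch: tally oplog entry types per namespace and compare
// the no-op ("n") count against updates ("u") plus deletes ("d").
// The sample entries below are made up; real entries come from
// the local.oplog.rs collection.
const sampleOplog = [
  { op: "i", ns: "app.items" },
  { op: "u", ns: "app.items" },
  { op: "n", ns: "app.items" },
  { op: "u", ns: "app.items" },
  { op: "n", ns: "app.items" },
  { op: "d", ns: "app.items" },
  { op: "n", ns: "app.items" },
];

function tallyOps(entries) {
  const byNs = {};
  for (const { op, ns } of entries) {
    byNs[ns] = byNs[ns] || { i: 0, u: 0, d: 0, n: 0 };
    if (op in byNs[ns]) byNs[ns][op] += 1;
  }
  return byNs;
}

const counts = tallyOps(sampleOplog);
for (const [ns, c] of Object.entries(counts)) {
  // If "n" roughly tracks "u" + "d", retryable findAndModify-style
  // writes are a plausible explanation for the extra entries.
  console.log(ns, c, "n vs u+d:", c.n, "vs", c.u + c.d);
}
```

If the "n" count for a namespace stays close to its "u" + "d" count across a large oplog sample, that supports the retryable-writes explanation.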
| Comment by Maksim Kondratyuk [ 31/Jan/20 ] |
|
I am not sure about chunk migration (I don't really know what that is all about), but the entries in the example are taken exactly from the production DB; I only changed the templateKey and userId values. I was wondering why they have the same _id (ObjectId). It means (with high probability) they were created by the exact same thread at the same point in time. But you are right, they have different ts. The docs specify that inserts are also retryable, so maybe that's the case here: because of a network issue the driver immediately repeats the insert operation, which could lead to that state. And yes, in most cases the number of "n" entries is really close to the number of "u" entries. Sometimes it's just a little bit bigger, probably because of a different case. Thanks for the help, I will try to reproduce this findOneAndUpdate case and check what we can do about it. |
| Comment by Andy Schwerin [ 31/Jan/20 ] |
|
I'm not convinced that the "n" and "i" operations you sent are correlated. If they are, the prevOpTime.ts field of one should match the ts field of the other. It's possible that you might get "n" and "i" operations that correspond due to chunk migration, but you'd expect to see the value fromMigrate: true in those oplog entries. |
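The correlation check described above can be sketched in a few lines of standalone Node.js. The entries and timestamps here are invented (plain numbers instead of BSON Timestamp values) purely to show the matching logic: an "n" entry pairs with a "u" entry when the no-op's prevOpTime.ts equals the update's ts.

```javascript
// Standalone sketch: check whether "n" no-op entries pair with "u"
// entries by matching the no-op's prevOpTime.ts to the update's ts.
// Timestamps are plain numbers here for illustration only; real oplog
// entries use BSON Timestamp values.
const entries = [
  { op: "u", ts: 100, ns: "app.items" },
  { op: "n", ts: 101, ns: "app.items", prevOpTime: { ts: 100 } },
  { op: "n", ts: 102, ns: "app.items", prevOpTime: { ts: 999 } }, // orphan
];

function pairNoOps(ops) {
  // Index the update entries by their ts for O(1) lookup.
  const updatesByTs = new Map(
    ops.filter((e) => e.op === "u").map((e) => [e.ts, e])
  );
  return ops
    .filter((e) => e.op === "n" && e.prevOpTime)
    .map((e) => ({
      noOpTs: e.ts,
      // Non-null means this no-op is the retryable-write companion
      // of that update; null means it came from something else.
      pairedUpdateTs: updatesByTs.has(e.prevOpTime.ts) ? e.prevOpTime.ts : null,
    }));
}

console.log(pairNoOps(entries));
```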
| Comment by Maksim Kondratyuk [ 31/Jan/20 ] |
|
Ok, thanks, that has become clear. But what about the example from the question itself? It was an insert: "i". Could it be the same issue? In general we have many operations which take a lock on a document - with findOneAndUpdate, changing a state from "new" to "processing" - to prevent several threads from processing the same item. But duplicates also exist for "i", though probably many fewer. Thanks for the help. I will play around a little after the weekend and do more analysis on this. We certainly have some collections which are used as append-only storage; I will check if they are also affected. |
| Comment by Andy Schwerin [ 31/Jan/20 ] |
|
Sorry for the confusion, Maksim. I'm afraid that findOneAndUpdate with retryable writes enabled always generates the extra oplog entry. The no-op entry contains either what the document looked like before the update took place or after the update took place, depending on the setting of returnOriginal. If you were able to switch to using updateOne instead, you would no longer see the extra oplog entries. It's possible there's some other cause for the change in your effective oplog duration, but all the evidence so far indicates that it's the retryable findAndUpdate operations. |
| Comment by Maksim Kondratyuk [ 31/Jan/20 ] |
|
Hi Andy, we upgraded from 3.6, but we have never enabled retryable writes before. I think they were disabled by default until version 4.2. So if I understood you right, it means that if we stopped calling findOneAndUpdate(...., { returnOriginal: false } ), that should reduce the problem in general? Is this what you meant by the pre/post image document? Thanks for the help. |
| Comment by Andy Schwerin [ 31/Jan/20 ] |
|
There will always be two entries per findAndUpdate when retryable writes are enabled in our current implementation. To clarify, you upgraded to 4.2 from what prior version of MongoDB? I'm surprised to hear that the impact on the oplog is different in 4.2 vs prior versions, unless the prior version is 3.4 or before, as those releases did not support retryable writes. If your application doesn't need to see the modified document from the findAndUpdate, you could switch to the form of update which does not return either the pre- or post-image document. In that mode, the extra oplog entry is not required. If your application does need the pre- or post-image of the update, you may need to consider a larger oplog. |
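To make the "consider a larger oplog" trade-off concrete, here is a back-of-the-envelope sketch. All numbers (oplog size, write rate, average entry size) are assumptions for illustration, not measurements from this ticket; the point is only that, for a fixed-size oplog, doubling the entries written per operation roughly halves the time window the oplog covers.

```javascript
// Rough estimate: hours of history a fixed-size oplog can hold,
// given a write rate and the number of oplog entries per operation.
// All concrete numbers below are assumptions for illustration.
function oplogCoveredHours(oplogBytes, opsPerSec, entriesPerOp, bytesPerEntry) {
  const bytesPerSec = opsPerSec * entriesPerOp * bytesPerEntry;
  return oplogBytes / bytesPerSec / 3600;
}

const oplogBytes = 10 * 1024 ** 3; // assumed 10 GiB oplog
const opsPerSec = 500;             // assumed write load
const bytesPerEntry = 512;         // assumed average entry size

// 1 entry per plain updateOne vs 2 per retryable findAndModify.
const plainUpdate = oplogCoveredHours(oplogBytes, opsPerSec, 1, bytesPerEntry);
const retryableFAM = oplogCoveredHours(oplogBytes, opsPerSec, 2, bytesPerEntry);
console.log(plainUpdate.toFixed(1), "h vs", retryableFAM.toFixed(1), "h");
```

Under these assumed numbers the covered window is exactly halved, which matches the drop in oplog duration one would expect after every findOneAndUpdate starts writing a companion no-op entry.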
| Comment by Maksim Kondratyuk [ 28/Jan/20 ] |
|
Hi Andy. Thanks for the explanation. Yes, we have many cases where we do findAndUpdate, mostly on one document at a time (findOneAndUpdate). But the example I provided has an insert operation; I expect this is the same behavior. Will there always be 2 entries per findAndUpdate when retryable writes are enabled? Or only in case of a network issue on a real "retry", which would indicate that we have some network issues between app and DB or between RS members? And maybe you can give some hints on how we could improve this, if that is really possible. Disabling retryable writes and increasing the oplog size I would consider as last-resort options. I think on v3.6 we sometimes experienced replication timeouts; we mostly use w=majority (2 out of 3) and wtimeout=5, but I don't remember that we had a lot of them. |
| Comment by Andy Schwerin [ 28/Jan/20 ] |
|
maksim.kondratyuk@gmail.com, those n oplog entries probably pair with u oplog entries that correspond to find-and-modify operations. These are there to support retryable writes when the writes are findAndModify operations. Can you confirm that your application uses findAndModify (sometimes called findAndUpdate)? |