[SERVER-45511] Data loss following machine PowerOff with writeConcernMajorityJournalDefault true Created: 12/Jan/20 Updated: 15/Jan/20 Resolved: 15/Jan/20
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.2.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Mark Berg | Assignee: | Danny Hatcher (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Operating System: | ALL |
| Participants: | |
| Description |
Background:
The Test: Before deploying the cluster to the production environment, I conducted several "stress tests". I created a simple script that performs many inserts against the cluster and returns the number of successful inserts. When I run and then stop the script, everything is fine: the script's insert_count is identical to the count of documents in the collection. BUT, when I run the script and then PowerOff the primary member, I hit a problem: my script's insert_count is larger (by 10-20) than the count of documents in my collection, so I assume I am losing data. I received successful insert acknowledgements even though my replica set has writeConcernMajorityJournalDefault set to true. Bringing the primary back up does not recover the lost data; I think the data was still only in memory!

Conclusion: I believe there is some malfunction in the journaling setting.

P.S.: I tried to insert with {w: "majority", j: true, wtimeout: 5000} parameters.

Regards,
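For reference, a minimal sketch of the kind of stress script described above. This is a reconstruction under assumptions, not the reporter's actual script (which was never attached): the connection string, database/collection names, loop size, and the seq field are all hypothetical, and it uses the PyMongo 3.4-era API.

```python
# Hypothetical reconstruction of the stress test described above; the
# reporter's actual script was not attached. Assumes PyMongo 3.x.
from pymongo import MongoClient
from pymongo.errors import PyMongoError
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://host1,host2,host3/?replicaSet=rs0")
coll = client.get_database("stress").get_collection(
    "inserts",
    # Require majority acknowledgement with journaling, matching the
    # {w: "majority", j: true, wtimeout: 5000} parameters from the P.S.
    write_concern=WriteConcern(w="majority", j=True, wtimeout=5000),
)

insert_count = 0
for i in range(100000):
    try:
        coll.insert_one({"seq": i})  # monotonically increasing field
        insert_count += 1            # count only acknowledged inserts
    except PyMongoError:
        pass  # failed/unacknowledged inserts are not counted

print("acknowledged inserts:", insert_count)
print("documents on server:", coll.count())  # count() is the PyMongo 3.4-era API
```

With the write concern bound to the Collection object this way (rather than assigned to an attribute afterwards), an acknowledged insert should survive a primary power-off once it has been journaled on a majority of nodes.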
| Comments |
| Comment by Danny Hatcher (Inactive) [ 15/Jan/20 ] |
I'm glad you were able to discover the problem. I'll close this ticket.
| Comment by Mark Berg [ 15/Jan/20 ] |
Issue solved! A bit of Googling led me to this page: https://api.mongodb.com/python/current/migrate-to-pymongo3.html#the-write-concern-attribute-is-immutable. Bottom line: I was using PyMongo 3.4.0 while setting write_concern the way older versions did.
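For readers hitting the same symptom: the linked migration guide explains that in PyMongo 3 the write_concern attribute is immutable, so the 2.x style of setting it on an existing object no longer applies the intended write concern. Roughly (a sketch based on that guide, not Mark's actual code; names are hypothetical):

```python
from pymongo import MongoClient
from pymongo.write_concern import WriteConcern

coll = MongoClient().get_database("stress").get_collection("inserts")

# PyMongo 2.x style: set the attribute on an existing Collection. In
# PyMongo 3 the attribute is immutable, so this style no longer applies
# the intended write concern to subsequent inserts.
# coll.write_concern = {"w": "majority", "j": True}

# PyMongo 3 style: derive a new Collection bound to the desired write concern.
majority_coll = coll.with_options(
    write_concern=WriteConcern(w="majority", j=True, wtimeout=5000)
)
majority_coll.insert_one({"x": 1})  # acknowledged at w: "majority", j: true
```

This would explain the observed behavior: the inserts were being acknowledged, but at the default write concern rather than the intended {w: "majority", j: true}, so a primary power-off could lose the most recent writes.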
| Comment by Danny Hatcher (Inactive) [ 13/Jan/20 ] |
While it is possible that there is a bug, the scenario you describe is a very common use case. Do the inserts you are performing have a monotonically increasing field, that is, a field that increases by 1 for each individual insert? If so, you should be able to tell whether there are any gaps in the result set actually present on the nodes. Are you inserting across the shards, or are all the inserts going to one shard? Do different nodes in a given shard have a different document count? If you can provide the full script you are running along with the results, I can take a look.
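As an illustration of the gap check suggested above, assuming each insert carried a monotonically increasing integer field starting at 0 (the seq field and the database/collection names here are hypothetical, since the original script was never posted):

```python
# Hypothetical gap check; assumes documents carry an integer field "seq"
# that was incremented by 1 per insert, starting at 0.
from pymongo import ASCENDING, MongoClient

coll = MongoClient().get_database("stress").get_collection("inserts")

expected = 0
gaps = []
for doc in coll.find({}, {"seq": 1, "_id": 0}).sort("seq", ASCENDING):
    while expected < doc["seq"]:
        gaps.append(expected)  # seq values missing from this node
        expected += 1
    expected += 1

print("missing seq values:", gaps)
```

Running the same check directly against each node in a shard would also answer the question of whether different nodes hold different document counts.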