[SERVER-37814] db.fsyncLock() does not prevent writes to a database on a separate filesystem Created: 24/Oct/18  Updated: 06/Feb/19  Resolved: 08/Nov/18

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 3.6.8
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Piotr Rybicki Assignee: Danny Hatcher (Inactive)
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to DOCS-12429 Incorrect description of behavior of ... Closed
Operating System: ALL
Participants:

 Description   

I'm trying to set up MongoDB filesystem snapshot backups.

I have two XFS filesystems:

  • /mongodb (main dir)
  • /database (one database on a separate filesystem, for performance; symlinked at /mongodb/database)

 

I tried the scenario described at https://docs.mongodb.com/manual/tutorial/backup-with-filesystem-snapshots/ (a rough sketch of the commands is included after the steps):

1) db.fsyncLock()

2) tar /mongodb

3) tar /database

4) db.fsyncUnlock()
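
Roughly, the commands look like this (the port, archive paths and tar flags below are placeholders, not my exact script):

mongo --port 27017 --eval 'db.fsyncLock()'        # 1) flush data files and block further writes
tar -czf /backup/mongodb.tar.gz /mongodb          # 2) main dbPath
tar -czf /backup/database.tar.gz /database        # 3) database on the separate filesystem
mongo --port 27017 --eval 'db.fsyncUnlock()'      # 4) release the lock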

 

All of this is done on a 4th, hidden node of the replica set, configured as follows (a sketch of adding such a member follows the config):

 

"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : true,
"priority" : 0,
"tags" : {},
"slaveDelay" : NumberLong(0),
"votes" : 1

 

I discovered that even after running db.fsyncLock() (and after confirming it is active with db.currentOp()), tar complains that files in /database changed while the backup was being made. The modification times of the files it complains about are well after db.fsyncLock() was executed, so tar is right that those files changed. I also see some writes to the /database filesystem while tar is running (by watching dstat output).
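
For reference, this is roughly how I check the lock and the file times (the timestamp is only a placeholder for the moment I ran db.fsyncLock()):

mongo --quiet --eval 'db.currentOp().fsyncLock'      # prints "true" while the fsync lock is held
find /database -type f -newermt '2018-10-24 12:00'   # lists files modified after that time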

 

Am I missing something here?

 

Best regards

Piotr Rybicki



 Comments   
Comment by Danny Hatcher (Inactive) [ 08/Nov/18 ]

After further attempts, I am unable to repro the problem. When using 3.6.8, the `db.fsyncLock()` command freezes all writes to the hidden node:

Hatcher:test danielhatcher$ ls -l data/replset/rs3/db/foo/*
-rw-------  1 danielhatcher  staff    345 Nov  8 11:44 data/replset/rs3/db/foo/collection-25--5574008208168300515.wt
-rw-------  1 danielhatcher  staff  16384 Nov  8 11:40 data/replset/rs3/db/foo/index-26--5574008208168300515.wt
Hatcher:test danielhatcher$ ls -l data/replset/rs1/db/foo/*
-rw-------  1 danielhatcher  staff  36864 Nov  8 11:53 data/replset/rs1/db/foo/collection-19-4977384934404339527.wt
-rw-------  1 danielhatcher  staff  36864 Nov  8 11:53 data/replset/rs1/db/foo/index-20-4977384934404339527.wt

As we are unable to reproduce, I will close the ticket as "Cannot Reproduce" for now. If anyone experiences this issue again, please leave a comment here with details of your environment and we will investigate further.

Comment by Danny Hatcher (Inactive) [ 01/Nov/18 ]

Hello Piotr,

Thank you for your response. I'll keep trying to reproduce the problem. Please do not hesitate to let us know if you do have a chance to test.

Have a great day,

Danny

Comment by Piotr Rybicki [ 30/Oct/18 ]

Thank you, Daniel.

I did indeed work around the issue by stopping the mongod process. In the end, I don't see any important advantage of freezing over stopping - either way, mongod has to apply the oplog that accumulates while the node is offline or frozen. Adding a 4th, hidden node to the replica set gets the file backup done without any additional impact on the running replica set.

I'm currently on 3.6.8 and plan to migrate to 4.0 by the end of this year. I can test it then, but I suppose just out of curiosity really.

And yes - I'm using storage.directoryPerDB: true and also directoryForIndexes: true.
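
The relevant part of my mongod.conf looks roughly like this (a sketch, not the complete file):

storage:
  dbPath: /mongodb
  directoryPerDB: true
  wiredTiger:
    engineConfig:
      directoryForIndexes: true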

Best regards

Piotr Rybicki

Comment by Danny Hatcher (Inactive) [ 30/Oct/18 ]

Hello Piotr,

I greatly apologize; there was a miscommunication on our end about the status of this ticket. You are correct that this appears to be a bug or an undocumented aspect of the system.

Can you please elaborate on the following? I have been unable to reproduce the problem through some quick testing but I may not be sufficiently replicating your environment.

  • You mention that you solved the issue. Did you do so by simply stopping the process to tar the files or did you follow some other method?
  • Which version did you experience this issue in? If it was not 4.0.3, are you able to reproduce on that version?
  • By "one database on a separate filesystem", are you using our "directory per database" feature or some other implementation to reach the desired result?

I apologize again for the confusion.

Danny

Edit: Ah, I see that you reported 3.6.8 as the initial problem version. Would it be possible for you to test against an instance of 4.0.3?

Comment by Piotr Rybicki [ 29/Oct/18 ]

As I said before, I don't need support and I'm reporting a bug.

I've already managed to solve this issue by myself.

If you really do not consider this a bug (it is at least misinformation in the documentation, which says that db.fsyncLock() will lock all writes - which is not always true), then please close this report with the appropriate status.

 

Kind Regards,

Piotr Rybicki

Comment by MongoDB Support [ 29/Oct/18 ]

Hello,

This is an automated reminder that this ticket is awaiting your response. Please respond to this ticket so that we can continue working on this issue with you, or let us know if the ticket can be resolved.

Thanks,
MongoDB Support

Comment by Karl Denby [ 26/Oct/18 ]

Hi Piotr,

I have consulted with the server team, and they believe this is not a bug; it should be treated as a support issue.

It appears that you do not currently have an active subscription with access to our core database support team, which provides support on issues like this one. Support is offered via the MongoDB Support Portal (support.mongodb.com), which includes access to create cases, our knowledge base, and more. If this is something that you are interested in, I'd be happy to put you in touch with your MongoDB Account Executive to discuss our subscription options:

This option would provide you with a guaranteed level of support response from MongoDB engineers, and a commitment to work through all support issues you have from beginning to end so that you are successful with MongoDB.

Please let me know if you would like me to put you in touch with a MongoDB Account Executive to discuss our subscription offerings.

Kind Regards,
Karl

Comment by Piotr Rybicki [ 26/Oct/18 ]

Thank you, Karl,

 

but I don't think I need support. I just discovered a situation where the MongoDB server does not behave as documented (db.fsyncLock() does not guarantee that the database is locked - apart from diagnostic.data), and I believe it is a bug, so I'm reporting it here.

 

Best regards

Piotr Rybicki

Comment by Karl Denby [ 25/Oct/18 ]

Hi Piotr,

We noticed that you created this issue in the SERVER JIRA project, and have moved it to the SUPPORT JIRA project so we can further determine if you have a support subscription.

Support is offered via the MongoDB Support Portal (support.mongodb.com), which includes access to create cases, our knowledge base, and more. I believe you have used it to create cases related to your Cloud Manager Project in the past.

Alternatively, our mongodb-user community forum is the best place to ask your query. The MongoDB team tries to help out in this forum, and in addition other community members might be able to share their experience.

Please create a case via the support portal so that we may assist you.

Kind Regards,
Karl

Comment by Piotr Rybicki [ 25/Oct/18 ]

There is also a backup agent from MongoDB Cloud Manager doing backups of this replica set, in case that is a useful hint here.

Comment by Piotr Rybicki [ 25/Oct/18 ]

https://jira.mongodb.org/browse/SERVER-32132 is another case of this.

In my scenario, I have a separate filesystem /database (with collection and index sub-directories in it), and those files are changed. There is no diagnostic.data, journal, or local directory there.

I believe the journal was also changed while db.fsyncLock() was active - I'll check that later.

Comment by Danny Hatcher (Inactive) [ 24/Oct/18 ]

Hello Piotr,

Per SERVER-32132, the diagnostic.data folder will contain active writes to disk even if the db.fsyncLock() command has been run. If you exclude that folder from your backup, are you still seeing complaints about file changes?
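
For example, something along these lines should skip that folder during the archive step (the archive path is just an example):

tar -czf /backup/mongodb.tar.gz --exclude='*/diagnostic.data' /mongodb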
