[SERVER-55866] Is it possible to move WiredTiger files to different file system? Created: 07/Apr/21  Updated: 06/Dec/22  Resolved: 08/Apr/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Question
Priority: Minor - P4
Reporter: Wernfried Domscheit
Assignee: Backlog - Triage Team
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Server Triage
Participants:

 Description   

I have a MongoDB sharded cluster with hybrid storage, i.e. some fast SSDs and some slower, cheaper spinning rust.

For archiving, I would like to move some data to the slower disc. We have to keep this data for legal reasons, but it is queried only occasionally.

In principle I would do it like this:

mongo --eval "sh.stopBalancer()" mongos-host:27017
 
# Repeat below on each shard host:
mongo --eval "db.fsyncLock()" localhost:27018
 
# Copy the collection file to the slow disc, then replace the original with a symlink to the copy
cp /mongodb/data/collection/3109--6926861682361166404.wt /slow-disc/mongodb/collection/3109--6926861682361166404.wt
ln --force --symbolic /slow-disc/mongodb/collection/3109--6926861682361166404.wt /mongodb/data/collection/3109--6926861682361166404.wt
 
mongo --eval "db.fsyncUnlock()" localhost:27018
 
# After all shards are done:
mongo --eval "sh.startBalancer()" mongos-host:27017

The indexes shall remain on the fast disc.

Would this be a reliable way to archive my data? What happens if the collection is read during the move?
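
For illustration, the copy-and-symlink step could be sketched as a small shell function. This is only a sketch under assumptions: the function name `relocate_with_symlink` and its parameters are hypothetical, and it says nothing about whether WiredTiger tolerates the swap, which is exactly the open question here. Note the argument order of `ln`: the link target (the copy on the slow disc) comes first, the link name (the original path) second.

```shell
#!/bin/sh
# relocate_with_symlink SRC DEST_DIR  (hypothetical helper, not a MongoDB tool)
# Copies SRC into DEST_DIR, verifies the copy byte for byte, then replaces
# SRC with a symbolic link that points at the copy.
relocate_with_symlink() {
    src=$1
    dest_dir=$2

    # The link must resolve from anywhere, so insist on an absolute target.
    case $dest_dir in
        /*) ;;
        *) echo "destination must be an absolute path: $dest_dir" >&2; return 1 ;;
    esac

    # Refuse to act on anything that is not a plain file (e.g. already a link).
    if [ ! -f "$src" ] || [ -L "$src" ]; then
        echo "not a regular file: $src" >&2
        return 1
    fi

    dest="$dest_dir/$(basename "$src")"
    mkdir -p "$dest_dir" || return 1
    cp -p "$src" "$dest" || return 1

    # Only swap in the link once the copy is known to be identical.
    if ! cmp -s "$src" "$dest"; then
        echo "copy differs from source, leaving $src untouched" >&2
        rm -f "$dest"
        return 1
    fi

    # ln TARGET LINK_NAME: --force/-f removes the original file first.
    ln -sf "$dest" "$src"
}
```

It would be called once per file, for example `relocate_with_symlink /mongodb/data/collection/3109--6926861682361166404.wt /slow-disc/mongodb/collection`. One caveat that follows from ordinary Unix semantics: a process that already has the original file open keeps using the old inode after the swap, so the link only takes effect when the file is reopened.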

 

Another approach would be a directory layout like this:

/mongodb/data/collection
/mongodb/data/index
/mongodb/archive/collection -> /slow-disc/mongodb/collection 
/mongodb/archive/index
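
A minimal sketch of how this layout could be prepared, assuming mongod runs with `storage.directoryPerDB` and `storage.wiredTiger.engineConfig.directoryForIndexes` enabled (the `collection` and `index` subdirectories suggest this). The function name and its two parameters are hypothetical, and whether mongod accepts a pre-created, symlinked directory is an assumption that would need testing.

```shell
#!/bin/sh
# make_archive_layout DBPATH SLOW_DIR  (hypothetical helper)
# Creates the per-database directories under DBPATH and redirects only
# DBPATH/archive/collection to SLOW_DIR through a symbolic link.
make_archive_layout() {
    dbpath=$1
    slow_dir=$2
    link="$dbpath/archive/collection"

    # Never overwrite an existing directory or link.
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "already exists: $link" >&2
        return 1
    fi

    mkdir -p "$dbpath/data/collection" "$dbpath/data/index" \
             "$dbpath/archive/index" "$slow_dir" || return 1

    # Indexes of the archive database stay on the fast disc;
    # only the collection files end up on the slow one.
    ln -s "$slow_dir" "$link"
}
```

For the paths shown above this would be `make_archive_layout /mongodb /slow-disc/mongodb/collection`.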

And then move the collection like this:

# shardKey is a placeholder for the actual shard key document
mongo --eval 'sh.enableSharding("archive"); sh.shardCollection("archive.coll", shardKey)' mongos-host:27017
mongodump --uri "mongodb://mongos-host:27017" --db=data --collection=coll --archive=- | mongorestore --uri "mongodb://mongos-host:27017" --nsFrom="data.coll" --nsTo="archive.coll" --archive=-
mongo --eval 'db.getSiblingDB("data").getCollection("coll").drop()' mongos-host:27017
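
Since the last command is destructive, the drop could be guarded by a document count comparison. The following is a sketch only: `count_docs` and `drop_if_copied` are hypothetical helper names, and it assumes the legacy `mongo` shell with `countDocuments()` available and no writes to the source collection between the dump and the check.

```shell
#!/bin/sh
# count_docs HOST DB COLL  (hypothetical helper)
# Prints the number of documents in DB.COLL as seen through HOST.
count_docs() {
    mongo --quiet --eval \
        "db.getSiblingDB('$2').getCollection('$3').countDocuments({})" "$1"
}

# drop_if_copied HOST SRC_DB DST_DB COLL  (hypothetical helper)
# Drops SRC_DB.COLL only when DST_DB.COLL holds the same number of documents.
drop_if_copied() {
    src=$(count_docs "$1" "$2" "$4") || return 1
    dst=$(count_docs "$1" "$3" "$4") || return 1

    if [ "$src" != "$dst" ]; then
        echo "count mismatch: $2.$4=$src, $3.$4=$dst; not dropping" >&2
        return 1
    fi

    mongo --quiet --eval \
        "db.getSiblingDB('$2').getCollection('$4').drop()" "$1"
}
```

With the names used above, the call would be `drop_if_copied mongos-host:27017 data archive coll`. Equal counts do not prove that the documents are identical; they only catch an incomplete restore.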

Main disadvantage: the balancer has to distribute all of the data across the shards again, which creates additional load on my sharded cluster.

Which approach would you recommend?

 Comments   
Comment by Dmitry Agranat [ 08/Apr/21 ]

Hi wernfried.domscheit@sunrise.net,

The SERVER project is for bugs and feature suggestions for the MongoDB server. For general questions, we'd like to encourage you to start by asking our community for help by posting on the MongoDB Developer Community Forums.

If the discussion there leads you to suspect a bug in the MongoDB server, then we'd want to discuss it here in the SERVER project.

Regards,
Dima

Generated at Thu Feb 08 05:37:41 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.