Type: Question
Resolution: Done
Priority: Minor - P4
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels: None
Assigned Teams: Server Triage
Confidence Status: None
Work Order: 3
CAR Domain/s: None
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

I have a MongoDB sharded cluster with hybrid storage, i.e. some fast SSDs and some slower, cheaper spinning rust.

For archiving, I would like to move some data to the slower disc. For legal reasons we have to keep this data, but it is queried only occasionally.

In principle I would do it like this:

mongo --eval "sh.stopBalancer()" mongos-host:27017

# Repeat below on each shard host:
mongo --eval "db.fsyncLock()" localhost:27018

cp /mongodb/data/collection/3109--6926861682361166404.wt /slow-disc/mongodb/collection/3109--6926861682361166404.wt
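# ln takes the target first, then the link name: the old path becomes a symlink to the copy on the slow disc
ln --force --symbolic /slow-disc/mongodb/collection/3109--6926861682361166404.wt /mongodb/data/collection/3109--6926861682361166404.wt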

mongo --eval "db.fsyncUnlock()" localhost:27018

# After all shards are done:
mongo --eval "sh.startBalancer()" mongos-host:27017
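
The copy-and-symlink step on its own can be tried out safely on scratch files. The following is only a minimal sketch, assuming GNU coreutils; the temporary directories and the dummy file contents are hypothetical stand-ins for the real fast and slow data paths. It verifies the copy before replacing the original, and it relies on `ln` expecting the link target first and the link name second. It says nothing about whether a running mongod picks up the new link, which is the open question here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in directories (not the real MongoDB paths).
work=$(mktemp -d)
fast="$work/fast/collection"
slow="$work/slow/collection"
mkdir -p "$fast" "$slow"

file="3109--6926861682361166404.wt"
printf 'dummy table contents\n' > "$fast/$file"

# 1. Copy to the slow location, keeping mode and timestamps.
cp --preserve=mode,timestamps "$fast/$file" "$slow/$file"

# 2. Verify the copy byte for byte; with `set -e` a mismatch aborts the script.
cmp "$fast/$file" "$slow/$file"

# 3. Replace the original path with a symlink to the slow copy.
#    Argument order is TARGET first, LINK_NAME second.
ln --force --symbolic "$slow/$file" "$fast/$file"

# 4. Show where the old path now points.
readlink "$fast/$file"
```

Verifying before step 3 matters because `--force` removes the original file, so after that point the slow copy is the only one left.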

The indexes should remain on the fast disc.

Would this be a reliable way to archive my data? What happens if the collection is read while it is being moved?

Another approach would be a file system layout like this:

/mongodb/data/collection
/mongodb/data/index
/mongodb/archive/collection -> /slow-disc/mongodb/collection 
/mongodb/archive/index
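
A per-database layout with separate `collection` and `index` subdirectories is what mongod produces when `storage.directoryPerDB` and `storage.wiredTiger.engineConfig.directoryForIndexes` are enabled. Assuming that is the setup here, with `/mongodb` as the dbPath, the directory structure could be prepared roughly as follows. This is a sketch only: it builds the tree under a temporary root rather than under `/`, and it assumes the symlink is created before the `archive` database is first written to.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch root standing in for "/".
root=$(mktemp -d)
dbpath="$root/mongodb"
slow="$root/slow-disc/mongodb"

# Directories for the "data" database stay on the fast disc.
mkdir -p "$dbpath/data/collection" "$dbpath/data/index"

# For the "archive" database only the index directory stays on the fast disc.
mkdir -p "$dbpath/archive/index"

# The collection directory of "archive" lives on the slow disc
# and is reached through a symlink.
mkdir -p "$slow/collection"
ln --symbolic "$slow/collection" "$dbpath/archive/collection"

# Show the resulting layout of the archive database directory.
ls -l "$dbpath/archive"
```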

And then move the collection like this:

mongo --eval 'sh.shardCollection("archive.coll", shardKey)' mongos-host:27017
mongodump --uri "mongodb://mongos-host:27017" --db=data --collection=coll --archive=- | mongorestore --uri "mongodb://mongos-host:27017" --nsFrom="data.coll" --nsTo="archive.coll" --archive=-
mongo --eval 'db.getSiblingDB("data").getCollection("coll").drop()' mongos-host:27017

Main disadvantage: the balancer has to distribute all of the data across the shards again, which creates additional load on my sharded cluster.

Which approach would you recommend?

Assignee: [HELP ONLY] Backlog - Triage Team
Reporter: Wernfried Domscheit
Participants: [HELP ONLY] Backlog - Triage Team, Dmitry Agranat, Wernfried Domscheit
Votes: 0
Watchers: 3

Created: Apr 07 2021 09:33:10 AM UTC
Updated: Dec 06 2022 01:26:46 AM UTC
Resolved: Apr 08 2021 12:20:30 PM UTC
