I am mainly hosting Jellyfin, Nextcloud, and Audiobookshelf. The files for these services are currently stored on a 2TB HDD, and I don’t want to lose them in case of a drive failure. I bought two 12TB HDDs because 2TB got tight and I thought I could add redundancy to my system to prevent data loss from a drive failure. I thought I would go with RAID 2 (or another form of RAID?), but everyone on the internet says that RAID is not a backup. I am not sure if I need a backup. I just want to avoid losing my files when the disk fails.
How should I proceed? Should I use RAID 2, or rsync the files every, let’s say, week? I don’t want to have another machine, so I would hook up the rsync target drive to the same machine as the rsync source drive. Rsyncing the files seems very cumbersome (even with a cron job).
Everyone repeat with me: RAID IS NOT BACKUP.
RAID is for maintaining availability and reducing downtime in the event of a drive failure.
Take a look at restic for backup.
RAID can protect you from a single drive failure if you need an “always on” setup. Even then, if the drives are identical, they can fail within days of each other. If you don’t have monitoring, you can lose everything before you have a chance to react. I feel that’s not your use case.
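To make the monitoring point concrete, here is a minimal sketch of a degraded-array check. It assumes Linux software RAID (mdadm), where /proc/mdstat shows a status like [2/2] [UU] and an underscore marks a missing member. The function name and the sample text are made up for illustration; a ZFS mirror would need a different check.

```python
import re


def degraded_arrays(mdstat_text: str) -> list[str]:
    """Return the names of md arrays that report a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\w+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line looks like "[2/2] [UU]"; "_" marks a missing disk.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if current and status and "_" in status.group(3):
            degraded.append(current)
    return degraded


# Hypothetical sample of a two-disk RAID1 that has lost one member.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      11718752256 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""

print(degraded_arrays(SAMPLE))
```

On a real machine you would read the text from /proc/mdstat and send yourself a mail or notification when the list is not empty, for example from a cron job.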
You need a backup. You can use something like rsync or, even better, BorgBackup. Keep the backup offline and back up often. You’ll be safer that way.
Thanks for the advice. Do you have suggestions on how to set up and handle the backup? E.g. manually connecting the drive via USB and cloning the files via rsync/Borg, say every week or every time a threshold of changes has been reached? Or having a small extra machine with the backup hard drive and sending the files over the network?
I am also still a bit confused. I have 2x 12TB. Let’s say I have 6TB of files on my hosting drive. AFAICT I can have two backups/snapshots before the third backup needs to overwrite the first one. Or am I missing something? Buying more drives for backup is not really doable, as drives generally cost a buck and I cannot/don’t really want to afford buying more drives.
You can back up to an external USB drive (that’s what I do), or set up a small backup server (with RAID if you want).
If you use Borg it will do the right thing out of the box with minimal configuration - compression, deduplication, encryption, and incremental backups.
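A rough sketch of what that minimal configuration can look like, assuming Borg 1.x command syntax (Borg 2.x changed how the repository is specified). The repository path, the source directories, and the retention numbers are placeholders, not recommendations. The script only builds and prints the commands so you can review them before running anything.

```python
import shlex

# Hypothetical paths: adjust to wherever the backup drive is mounted
# and wherever the service data actually lives.
REPO = "/mnt/backup/borg-repo"
SOURCES = ["/srv/jellyfin", "/srv/nextcloud", "/srv/audiobookshelf"]


def borg_commands(repo, sources):
    """Build the three Borg 1.x commands for a simple backup routine."""
    # One-time: create an encrypted repository on the backup drive.
    init = ["borg", "init", "--encryption=repokey", repo]
    # Every run: create a new compressed, deduplicated archive.
    # {hostname} and {now:...} are placeholders that Borg expands itself.
    archive = f"{repo}::{{hostname}}-{{now:%Y-%m-%d}}"
    create = ["borg", "create", "--stats", "--compression", "zstd",
              archive, *sources]
    # Every run: thin out old archives (example retention, an assumption).
    prune = ["borg", "prune", "--keep-daily", "7", "--keep-weekly", "4",
             "--keep-monthly", "6", repo]
    return {"init": init, "create": create, "prune": prune}


for name, cmd in borg_commands(REPO, SOURCES).items():
    print(f"{name}: {shlex.join(cmd)}")
```

If the printed commands look right, the create and prune lines are what you would run each time the USB drive is plugged in, or from a cron job.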
The first backup will be full and take longer, but subsequent backups will only target changes and will be quite fast.
Restoring is very straightforward, even if you only need a single file you deleted accidentally.
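On the question of how many snapshots fit on a 12TB drive: because later backups only store changes, it is far more than two. A back-of-the-envelope sketch, assuming 50GB of new or changed data per weekly backup (a made-up figure) and ignoring compression, which would only improve things:

```python
def snapshots_that_fit(capacity_gb, full_gb, change_per_backup_gb):
    """Estimate how many archives fit when only the first one is full-size."""
    if change_per_backup_gb <= 0:
        raise ValueError("change_per_backup_gb must be positive")
    if full_gb > capacity_gb:
        return 0
    remaining = capacity_gb - full_gb
    # One full archive plus as many change-only archives as still fit.
    return 1 + remaining // change_per_backup_gb


capacity_gb = 12_000  # one 12TB drive, using decimal units
full_gb = 6_000       # 6TB of files on the hosting drive

# Naive full copies, as with plain file cloning.
full_copies = capacity_gb // full_gb
# Incremental archives, with an assumed 50GB of changes per backup.
incremental = snapshots_that_fit(capacity_gb, full_gb, 50)

print(full_copies)  # 2
print(incremental)  # 121
```

The real number depends on how much your data actually changes between runs, and a prune policy keeps it bounded anyway.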
Thanks, I will look into that!
Two disks in the same machine are not a backup, whether the data is copied between them using RAID, rsync, or anything else.
Sounds like for this machine, just use the two disks in RAID1, or a ZFS mirror, or something. And figure out something else for backups. Probably a cloud solution.
Also, RAID2 requires a minimum of 3 disks, and is rarely used.
I’m not an expert, but I’d combine RAID with a backup. The RAID could be a 10 or a 01, but it’s better to read about all the types and choose the one that best suits your needs.
I think it would be best to just delete the files, so you can get used to losing them. Maybe set a cronjob to delete them regularly.
OK, so of course the best solution is both RAID and offsite backups. After that, the question is how much you want to prioritize convenience versus not losing your data at all costs. For irreplaceable data that you care deeply about, make an offsite backup; it’s not worth the risk. Never rely on RAID to keep your data safe, as it may not always work. For data that is replaceable, where you just don’t want to lose access temporarily in the event of a drive failure, RAID is fine; it most likely won’t fail.
If you are making a backup, please make it offsite if you can. It provides one massive extra layer of security that makes permanent data loss much less likely. It doesn’t have to be in a NAS; it can just be a hard drive by itself at a friend’s or relative’s house. Whenever you visit, update the drive with any changes. One downside to this method is pretty obvious: if your drive at home fails, the data you have backed up will only be as recent as your last visit, so if you are constantly creating new data that you can’t risk losing, this won’t work. And it goes without saying that you should really have all of your data encrypted, but especially your offsite backups. I wouldn’t trust anyone with my data.
A good compromise, and probably what I would do, is to use RAID for all your files and then use your 2TB hard drive, kept offsite, to back up all of the really important data that you don’t want to lose. With this setup it’s possible to lose some data if your RAID setup doesn’t work for whatever reason or your house burns down, but at least all of your important data stays safe.