For a while now I have had an Ubuntu server where I self-host a few things for my household via Docker: Syncthing (automatic backup of the most important files on our devices), Baïkal, Magic Mirror and a few others.

I was looking at what I have now (leftovers from an old computer of mine: AMD 2600 with 16 GB RAM, a 1660 Super and a 512 GB Western Digital Blue SSD). Storage-wise, at the time I decided to get several fairly cheap SSDs to have enough initial space (I made a logical volume out of 3 Crucial MX500 1 TB drives, 3 TB in total). Back then I thought I wanted to avoid regular HDDs at all costs (I knew people who had issues with them), but in hindsight I have never worked with NAS drives, so my fear of HDDs under such light usage is somewhat unfounded.

So now I am trying to understand how I can change this setup so that I can expand later if needed, while also having a bit more space right away (for personal stuff I have around 1.5 TB of data) and adding a bit more resilience in case something happens. Another goal is to build a 3-2-1 backup kind of solution (starting with the setup at home, with an external disk I already have, and later a remote backup location). Also, I will probably decommission the SSDs for now, since I want to avoid having a logical volume spanning them (something happens to one drive, and poof, all the data goes away). So my questions are:

  • For hdd’s to be used as long term storage, what is usually the rule of thumb? Are there any recommendations on what drives are usually better for this?
  • Considering this is going to store personal documents and photos, is RAID a must in your opinion? And if so, which configuration?
  • And in case RAID would be required, is ubuntu server good enough for this? or using something such as unraid is a must?
  • I was thinking of probably trying to sell the 1660 Super while it has some market value. However, I was never able to run the server completely headless. Is there a way to make this happen with an MSI Tomahawk B450? Or is it only possible with an APU (such as the 5600G)?

Thanks in advance

PS: If you guys find any glaring issues with my setup and know a tip or two, please share them so I can also understand better this selfhosted landscape :)

  • IsoKiero@sopuli.xyz
    6 days ago

    My personal opinions, not facts:

    For hdd’s to be used as long term storage, what is usually the rule of thumb? Are there any recommendations on what drives are usually better for this?

    Anything with a long history, like HGST or WD (Red series preferably). Backblaze, among others, publishes data on drive longevity, so check which models do well there. On eBay (and elsewhere) there are refurbished drives available which look pretty promising, but I have no personal experience with those.

    Considering this is going to store personal documents and photos, is RAID a must in your opinion? And if so, which configuration?

    Depends heavily on your backup scheme, amount of data and available bandwidth (among other things). RAID protects you against a single point of failure in storage. Without RAID, you need to replace the drive and pull the data back from backups, and while that’s happening you don’t have access to the things stored on the failed disk. With RAID you can keep using the environment without interruption while waiting a day or two for a replacement. If you have a fast connection which can download your backups in less than 24 hours, it might be worth saving the money and skipping RAID, but if it takes a week or two to pull the data back, then the additional cost of RAID might be worth it. Also, if you change a lot of data during the day, it’s possible that a drive fails before the backup has finished, and in that case some data is potentially lost.
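To put rough numbers on the "less than 24 hours" question, here is a minimal back-of-the-envelope sketch. The function name `restore_hours` and the `efficiency` parameter are made up for illustration; it assumes decimal terabytes and a link that sustains its nominal speed, which real backup providers and home connections often will not.

```python
def restore_hours(data_tb: float, link_mbit_s: float, efficiency: float = 1.0) -> float:
    """Hours needed to download `data_tb` terabytes over a `link_mbit_s` link.

    `efficiency` (0..1] is an assumed fraction of the nominal link speed
    that is actually achieved; 1.0 means the full line rate.
    """
    if data_tb < 0 or link_mbit_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("data_tb >= 0, link_mbit_s > 0 and 0 < efficiency <= 1 required")
    bits = data_tb * 1e12 * 8                      # decimal TB -> bits
    seconds = bits / (link_mbit_s * 1e6 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # 1.5 TB is the amount of personal data mentioned in the original post.
    for mbit in (50, 100, 500, 1000):
        print(f"{mbit:>5} Mbit/s: {restore_hours(1.5, mbit):6.1f} h")
```

For the roughly 1.5 TB from the original post, a 100 Mbit/s link comes out at about 33 hours even at full line rate, so a restore would not fit in a single day; a gigabit link would take a bit over 3 hours.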

    On which level of RAID you should use, it’s a balancing act. Personally I like to run things with RAID5 or 6 even though I have a pretty decent uplink. Also, you need to consider what the acceptable downtime for your services is. If you can’t access your photos for 48 hours it’s not the end of the world, but if your home automation is offline it can at least increase your electric bill somewhat and maybe cause some inconvenience, depending on how your setup is built.

    And in case RAID would be required, is ubuntu server good enough for this? or using something such as unraid is a must?

    Ubuntu Server is good enough. You can do either software RAID or LVM for a traditional RAID setup, or opt for a more modern approach like ZFS.

    I was thinking of probably trying to sell the 1660 Super while it has some market value. However, I was never able to run the server completely headless. Is there a way to make this happen with an MSI Tomahawk B450? Or is it only possible with an APU (such as the 5600G)?

    No idea. My server has onboard graphics, but I haven’t used it for years. Still, it’s a nice option to have in case something goes really wrong. You can still sell your 1660 and replace it with the cheapest GPU you can find on eBay or wherever; as long as you’re comfortable with the console, you can fix things with anything that can output plain text. If your motherboard has separate remote management (generally not available on consumer-grade boards) that might be enough to skip the GPU entirely, but personally I would not run that kind of setup, even if remote management/console was available.

    If you guys find any glaring issues with my setup

    I don’t know about actual issues, but I have spinning hard drives a lot older than my kids which still run just fine. Spinning rust is pretty robust (at least at sub-4TB capacities), so unless you really need the speed, traditional hard drives still have their place. Sure, a ton more spinning drives have failed on me than SSDs, but I have working hard drives that are older than SSDs as a technology (at least in the sense of what we have now), so claiming that SSDs are more robust is (at least in my experience) just misunderstood statistics.

    • ZeDoTelhado@lemmy.worldOP
      6 days ago

      Thanks for your insights. Yes, you are for sure correct. There was a time when friends of mine lost everything because of spinning drives. But then again, none of those were NAS grade (and that was also a time when having 128 GB was an absolute luxury).

      As for RAID, I was asking since it is something I hear a lot of people doing. In my situation, my plan is to always have an external SSD with me, plus a future remote location as a last-ditch effort to save the data if really needed. So maybe it is OK for me to skip it. (And if I don’t have access to my photos for a week, no one dies.)

      • atzanteol@sh.itjust.works
        5 days ago

        SSDs fail too. All storage is temporary…

        Setting up a simple software raid is so easy it’s almost a no-brainer for the benefit imho. There’s nothing like seeing that a drive has problems, failing it from the raid, ordering a replacement, and just swapping it out and moving on. What would otherwise be hours of data copying, fixing things that broke, and discovery of what wasn’t backed up is now 10 minutes of swapping a disk.

        • ZeDoTelhado@lemmy.worldOP
          5 days ago

          This is something I still don’t fully understand, because RAID has so many bizarre terms and configurations that for the uninitiated it is just really hard to understand, unless you really take the time to dive into it.

          So my question is: when you talk about software RAID, which configuration do you mean? And how many drives are needed for such a configuration? Thanks in advance

          • atzanteol@sh.itjust.works
            5 days ago

            Yeah - that’s fair. I may have oversimplified a tad… The concepts behind RAID, the theory, implementations, etc. are pretty complicated. And there are many tools that do “raid-like-things” with many options about raid types… So the landscape has a lot of options.

            But once you’ve made a choice the actual “setting it up” is usually pretty simple, and there’s no real on-going support or management you need to do beyond just basic health monitoring which you’d want to do even without a RAID (e.g. smartd). Any Linux system can create and use a RAID - you don’t need anything special like Unraid. My old early-to-mid-2010’s Debian box manages a RAID with NFS just fine.

            If you decide you want a RAID you first decide which “level” you want before talking about any specific implementations. This drives all of your future decisions including which software you use. This basically focuses on 2 questions - how much budget do you have and what is your fault tolerance?

            e.g. I have a RAID5 because I’m cheap and wanted biggest bang-for-the-buck with some failure resiliency. RAID5 lets me lose one drive and recover, and I get the storage space of N-1 drives (1 drive is redundant). Minimum size for a RAID5 is 3 drives. Wikipedia lists the standard RAID levels which are “basically” standardized even though implementations vary.
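For anyone wondering how losing one drive can be recoverable at all: RAID5 stores XOR parity alongside the data, and any one missing block in a stripe can be rebuilt by XOR-ing the blocks that are left. Here is a toy sketch of that idea on a single stripe. The helper `xor_blocks` and the block contents are made up for illustration; real implementations work on fixed-size chunks and rotate the parity across all drives rather than keeping it on one.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equally sized byte blocks together, position by position."""
    if not blocks:
        raise ValueError("need at least one block")
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks of one stripe, as if they lived on three drives
# (illustrative contents only).
data = [b"photos__", b"docs____", b"backups_"]

# The parity block that a fourth drive would hold for this stripe.
parity = xor_blocks(data)

# Simulate losing the second drive, then rebuild its block from the
# surviving data blocks plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data[1]
```

This is also why the usable space is N-1 drives: one drive's worth of capacity goes to parity, no matter how many drives are in the array.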

            I could have gone with RAID6 (minimum 4 disks) which can suffer a 2 drive outage. I have off-site backups so I’ve decided that the low-probability of a 2 drive failure means this option isn’t necessary for me. If I’m that unlucky I’ll restore from BackBlaze. In 10+ years of managing my own fileserver I’ve never had more than 1 drive fail at a time. I’ve definitely had drives fail though (replaced one 2 weeks ago - was basically a non-issue to fix).

            Some folks are paranoid and go with RAID1 and friends (RAID1, RAID10, etc.) which involves basically full duplication of drives. Very safe, very expensive for the same amount of usable storage. But RAID1 can work with a minimum of 2 drives. It just mirrors them so you get half the storage.
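To compare the levels mentioned above side by side, here is a small sketch that works out usable space and the number of drive failures each level is guaranteed to survive. `raid_summary` is a hypothetical helper, it assumes equally sized drives, and it treats RAID10 as classic mirrored pairs (so an even drive count); specific implementations may allow other layouts.

```python
def raid_summary(level: str, drives: int, size_tb: float) -> dict:
    """Usable capacity and guaranteed failure tolerance for a RAID level.

    Assumes all drives have the same size. For RAID10, only one failure is
    guaranteed survivable (more may be survived if they hit different pairs).
    """
    rules = {
        # level: (minimum drives, usable drive count, failures always survived)
        "raid1": (2, lambda n: 1, lambda n: n - 1),
        "raid5": (3, lambda n: n - 1, lambda n: 1),
        "raid6": (4, lambda n: n - 2, lambda n: 2),
        "raid10": (4, lambda n: n // 2, lambda n: 1),
    }
    if level not in rules:
        raise ValueError(f"unknown level: {level}")
    minimum, usable, tolerated = rules[level]
    if drives < minimum:
        raise ValueError(f"{level} needs at least {minimum} drives")
    if level == "raid10" and drives % 2:
        raise ValueError("this sketch assumes mirrored pairs, so an even drive count")
    return {
        "level": level,
        "usable_tb": usable(drives) * size_tb,
        "survives_failures": tolerated(drives),
    }


if __name__ == "__main__":
    # Example: four drives of 4 TB each (illustrative sizes).
    for level in ("raid1", "raid5", "raid6", "raid10"):
        print(raid_summary(level, drives=4, size_tb=4.0))
```

With four equal drives, RAID5 gives three drives' worth of space, RAID6 and RAID10 give two, and a four-way RAID1 mirror gives one.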

            Next the question becomes - what RAID software to use? There are lots of options here, and this is where things can get confusing. Many people have become oddly tribal about it as well. There’s the traditional Linux “md” RAID, which I use, that operates underneath the filesystems. It basically takes my 4 disks and creates a new block device (/dev/md0) on which I create my filesystems. It’s “just a disk” as far as the OS is concerned, so you can put anything you want on it - I do LVM + ext4, but you could put btrfs, zfs, etc. on it.

            These days the trend is to let the filesystems handle your disk pooling rather than a separate layer. BTRFS can create a RAID (though it cautions against RAID5), and so can ZFS. These filesystems basically build the functionality I get from md and LVM into the filesystem itself.

            But there are also tools like Unraid that will provide a nice GUI and handle the details for you. I don’t know much about it though.

            • ZeDoTelhado@lemmy.worldOP
              5 days ago

              Thanks for the reply. The breakdown is very good, and I can see a lot of reasoning in your situation that I would share as well (I do not have vast amounts of money to throw at this, and tolerating one drive failing while the other two keep the boat afloat sounds about right).

              As for how to do the software RAID, I’ve seen md somewhere before but honestly forgot about it, since people tend to talk about Unraid a lot. From my perspective, I would probably go as simple as possible, although I will be studying how md actually works.

              Great reply :) learned a lot

              • atzanteol@sh.itjust.works
                5 days ago

                Sure thing - one thing I’ll often do for stuff like this is spin up a VM. You can throw 4x1GiB virtual drives in it and play around with creating and managing a raid using whatever you like. You can try out md, ZFS, and BTRFS without any risk - even unraid.
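If you want some throwaway "drives" to experiment with, one option is a handful of sparse image files. This is only a sketch: `make_practice_disks` and the file names are made up, and whether the files are actually stored sparsely depends on the filesystem. Raw images like these can be attached to a VM (QEMU/KVM accepts raw disk images, for example) or mapped to loop devices on Linux.

```python
import tempfile
from pathlib import Path

GIB = 1024 ** 3


def make_practice_disks(directory: Path, count: int = 4, size_bytes: int = GIB) -> list[Path]:
    """Create `count` image files of `size_bytes` each and return their paths."""
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(count):
        path = directory / f"practice-disk{i}.img"  # hypothetical file name
        with open(path, "wb") as f:
            # Extends the file to the requested length without writing data;
            # on most Linux filesystems this takes almost no real disk space.
            f.truncate(size_bytes)
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Demo in a temporary directory that is cleaned up afterwards.
    with tempfile.TemporaryDirectory() as tmp:
        for disk in make_practice_disks(Path(tmp)):
            print(disk.name, disk.stat().st_size // GIB, "GiB")
```

Since nothing on these images matters, you can fail a "drive", rebuild, or wreck the whole array and start over as often as you like.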

                Another variable to consider as well - different RAID systems have different flexibility for reshaping the RAID, for example if you want to add a disk later, or swap out old drives for new ones to increase space. It’s yet another rabbit hole to go down, but something to keep in mind. When we start talking about tens of terabytes of data, you start to run out of places to temporarily put it all if you need to recreate your RAID to change its layout. :-)