Tom's Guide Forum
 Thread : Linux Fileserver - Hardware Questions - All Help appreciated
 
More Information

I've got several projects that I am about to begin. Many of them are going to need a lot of disk space. So, I wanted to have a box (that would be shared by all projects) that was only responsible for storage. I'm looking to keep costs low, but not cut corners. Plus, I really like building things myself!

That said, here are my questions.

1) I want to keep power requirements down, so I want the most efficient processor for the task. If I am ONLY serving files and use a dedicated RAID card and GigE card, how much processing power would I really need?
2) Regarding question 1: with dedicated hardware handling more or less the only system tasks, does the motherboard make a difference? Any suggestions? What about expandability?
3) I'm pretty much set on SATA for price/performance. Any recommendations for a RAID controller? I'll probably start out with a 4-disk array, but could move up to 8+ in the future... That said, one that allows multiple controllers per system would be a plus.
4) I'm familiar with Red Hat and Ubuntu, but am thinking of going with Debian for this box. Any suggestions for a different distro?
5) This will be my first RAID setup, so are there any good resources on how to let this scale from ~1TB initially to ~8-10TB eventually?
6) This will be a rackmount system, so any recommendations for cases? I would prefer hot-swappable drives, but that's not an absolute must, especially if the drives are easily accessible.

The main reason that I want this dedicated barebones fileserver is that I want it to be rock solid and independent of the different projects.

Any help on any of the questions is greatly appreciated. Also any other advice in general is welcomed as well!

More Information

Ok, sounds like fun.

My answers ( not always the right ones :) )

1. The amount of processor power you need will depend in part on what type of GigE card and RAID controller you use. On-board controllers take CPU cycles, so if you're using both on-board RAID and on-board Ethernet then you'll want something starting at a P4 2.8+ or AMD 64 3000+. If you use PCI cards for your Ethernet and RAID, you won't need much CPU power at all; maybe a PIII or Barton core would do, and a low-voltage AM2 would work here for sure. But I give those as examples hoping you'll find some of the parts lying around. Now that's just a recommendation, so no one get all snippy.

2. Good question. File servers don't need much if you configure them correctly. You can go two routes: used and cheap, or new and upgradeable. If you don't have anything in your possession to begin your build now (like RAM and CPU) then you would always benefit from going with new. In this case I think you could get a cheap AM2 board & CPU with integrated video, which is more similar to real-life servers.

3. Tom's did this article a while back. These cards are good and cheaper now. Don't mind the PCI-X formats; those cards are compatible with PCI 2.0, they just won't run at 133MHz. In PCI or PCIe formats you'll only see about 5 channels max on one card (among those that are affordable). Best practice would say that 5 is the max you want in one array.

4. I would suggest a FreeNAS setup or a Fedora build, if only because it's a launch pad for Red Hat builds.

5. You're typically not going to grow from 1TB to 10TB in one box, so you'll want to start small and work your way up, maybe with mirrored boxes.

6. Curve ball! If that's the case, then you'll want to get a chassis for the array. Your setup is going to be large, so I might suggest just buying an NSM 160 or some type of iSCSI array. You'll save money. But you could always build and upgrade your own. If this is your first time, start small.
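
On point 3, a rough way to see what running a PCI-X card in a plain PCI slot costs is to compare theoretical peak bus bandwidth (bus width times clock). This is only an illustrative sketch: the helper name is made up for this example, the numbers are theoretical peaks for the standard bus widths and clocks, and real-world throughput will be lower. Keep in mind that plain PCI is a shared bus, so a RAID card and a GigE card on it compete for the same bandwidth.

```python
# Theoretical peak bandwidth of a parallel PCI-style bus:
# (bus width in bytes) x (clock in MHz) = MB/s.
# peak_mb_per_s is a hypothetical helper for this illustration only.

def peak_mb_per_s(width_bits: int, clock_mhz: float) -> float:
    """Return the theoretical peak transfer rate in MB/s."""
    return width_bits / 8 * clock_mhz

buses = {
    "PCI 32-bit / 33 MHz": peak_mb_per_s(32, 33.33),
    "PCI-X 64-bit / 133 MHz": peak_mb_per_s(64, 133.33),
}

# Gigabit Ethernet line rate: 1000 Mbit/s = 125 MB/s, before protocol overhead.
GIGE_MB_PER_S = 1000 / 8

for name, mb in buses.items():
    print(f"{name}: ~{mb:.0f} MB/s peak ({mb / GIGE_MB_PER_S:.1f}x GigE line rate)")
```

On these peak numbers a 32-bit/33MHz PCI bus is barely above what one gigabit link can carry, which is worth remembering before hanging both the disks and the network off it.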


Good luck

More Information

I had the same issue some time ago... Here's how I solved it:

1) had a used computer (Tbird 1100) and added a lot of memory, 2GB (it had a "fast" board, a KT266A)
2) used LVM + Linux software RAID (only used hardware controllers to add ports, as I was limited to the 4 ATA ports) - most of the cheaper solutions are software RAID anyway (even when using a RAID controller) and the Linux implementation is very good.
3) SATA seems the way to go
4) the best way to choose the distro is to pick the one you know best... and that your geek friends use ;) also, Google is very helpful
5) the biggest advantage of using Linux LVM + RAID is that it isn't dependent on a single controller's capacity. It can grow as far as you can add drives
6) hot-swap support depends on the controller BIOS/RAID controller. As to the case, you'll need to add a lot of drives (LVM + RAID of 10x750GB ~= 6TB), so remember drive space, power cabling, PCI slots to add controllers... it all counts.

you can always have a look at the Sun X4500 ;)

in this type of machine, CPU is not very important; what's really important is memory (to cache the disks) and I/O (fast drives with fast controllers and enough bandwidth).

More Information

Agree with most points, especially on the Linux RAID implementation. He won't need any RAID controller, and if he's starting from scratch, there are motherboards with something like 8 SATA ports, so he could go 8x750GB for instance (I'm too lazy to calculate the total) using only on-board ports.
But I kinda disagree about the CPU. Depending on the RAID level he will need a decent CPU. I guess an Athlon 64 3000 would be good, but he doesn't need dual cores like I've seen people saying in other topics.
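
For anyone who wants the 8x750GB total worked out, here is a rough sketch of usable capacity under a few common RAID levels. The function name is invented for this example, the drives are assumed identical, sizes are in decimal GB as printed on drive labels, and filesystem/metadata overhead is ignored.

```python
# Usable capacity for an array of identical drives under common RAID levels.
# usable_gb is a hypothetical helper; filesystem and metadata overhead are ignored.

def usable_gb(level: int, drives: int, drive_gb: int) -> int:
    if level == 0:            # striping, no redundancy
        return drives * drive_gb
    if level == 1:            # every drive holds the same data
        return drive_gb
    if level == 5:            # one drive's worth of parity
        if drives < 3:
            raise ValueError("RAID5 needs at least 3 drives")
        return (drives - 1) * drive_gb
    if level == 6:            # two drives' worth of parity
        if drives < 4:
            raise ValueError("RAID6 needs at least 4 drives")
        return (drives - 2) * drive_gb
    raise ValueError(f"unsupported RAID level: {level}")

for level in (0, 5, 6):
    gb = usable_gb(level, drives=8, drive_gb=750)
    print(f"RAID{level}: {gb} GB (~{gb / 1000:.2f} TB)")
```

So eight 750GB drives come to 6TB raw, and somewhat less once one or two drives' worth of space goes to parity.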

More Information

OK, for something in the 10TB range I would use SAS (Serial Attached SCSI) with SATA drives. It gives you the benefits of SCSI along with the lower cost of Serial ATA drives, and they can also hot swap. Since you would be connecting through SCSI, you can put the storage core anywhere you can make the cable reach. Any of the newer SCSI (SAS) controllers should be able to do this; the main problem would be finding the interfaces between the controller and the SATA drives. This is moving out of the NAS area and into a SAN-type storage design: greater expandability and greater data transfer speeds, but also greater expense.

If you're wanting to keep things cool, the best way to do it would be to go with an AMD Opteron dual core or a new Conroe. Either way the wattage will be lower, therefore heat will be lower, and the dual core would be put to good use, esp. with file serving and both being 64-bit.

Lastly, your greatest investment will be in RAM: the more RAM, the better your machine will serve others around you. Depending on the motherboard you invest in, you can put up to 8GB of RAM in the machine. This would be the biggest factor in choosing a motherboard. You'll do well with either of the processors mentioned above, but RAM and page swapping are the greatest concern.

As far as the OS, I would stick with what you know best. I use Fedora Core 5 for my cluster. Red Hat does best as far as I'm concerned, but again that will be your choice.

More Information

Thanks for all of the input so far. I've been reading up on all of this and there is just so much information. For future reference for anyone going through this same process: the only way you are going to keep from going crazy is to pin down the aspects that you know for sure, because otherwise there are so many variables that you won't ever make any progress.

A couple of the responses mentioned using Linux software RAID. I'm currently using it on one of my boxes and don't have any complaints, but it is just a simple RAID1 setup for basic fault tolerance.

I was pretty sure that I wanted to do a hardware solution, but if I'm going to reconsider going back to software then I've got a couple of questions:

1) Has anyone really used this for a multi TB setup? If so, how is performance? One of the functions this box will be supporting is hosting MythTV (linux version of TiVo) files (MPEG-2 I think). There will be at most 3 front-ends streaming from it, 2 of which could be HD...So, will that be able to handle that type of load?

2) How well does it scale with RAID5? If I started with say 3x750GB drives, can I just add in a 4th, 5th, etc and have the disk incorporated into the array?

3) How well does it handle failures? Can it support hot-spares and hot-swapping?

4) If I had 8 on-board SATA ports, would there be any way to add additional disks down the line?

5) An advantage of hardware RAID is that the host CPU isn't affected, but with software RAID it is. I've heard that any new processor is able to handle the loads, but that is usually from smaller number of drives and smaller sized arrays. If I did get this thing up to several TBs, is that going to be an issue?

Some good/positive answers to those questions could get me to start looking back further into a software solution, but does anyone have a good rule of thumb for when it starts becoming a better idea to go the hardware route?

More Information

Quote :


1) Has anyone really used this for a multi TB setup? If so, how is performance? One of the functions this box will be supporting is hosting MythTV (linux version of TiVo) files (MPEG-2 I think). There will be at most 3 front-ends streaming from it, 2 of which could be HD...So, will that be able to handle that type of load?


I've used it just like you, as a RAID 1 mirrored setup, and I noticed it's noticeably faster than using an on-board SATA controller's RAID. Considering that a dedicated card provides better performance than on-board controllers, and the Linux implementation is faster than on-board, it sits at least somewhere between the on-board controller and a dedicated card, with the advantage of cost.
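
For a rough sanity check on the MythTV load itself, assume HD recordings are ATSC broadcast MPEG-2 at about 19.4 Mbit/s and SD recordings run about 6 Mbit/s. Both bitrates are assumptions and not figures from this thread, so plug in your own if your sources differ. Under those assumptions, 2 HD plus 1 SD stream works out as:

```python
# Back-of-the-envelope streaming load for 3 MythTV front-ends (2 HD, 1 SD).
# Bitrates are assumptions: ~19.4 Mbit/s is the ATSC broadcast ceiling for
# MPEG-2 HD; ~6 Mbit/s is a guess at a typical SD MPEG-2 recording.
HD_MBIT = 19.4
SD_MBIT = 6.0
GIGE_MBIT = 1000.0

streams = [HD_MBIT, HD_MBIT, SD_MBIT]
total_mbit = sum(streams)
total_mbyte = total_mbit / 8          # MB/s the array must sustain for reads

print(f"Total: {total_mbit:.1f} Mbit/s (~{total_mbyte:.1f} MB/s)")
print(f"Share of a GigE link: {total_mbit / GIGE_MBIT:.1%}")
```

That is a small fraction of a gigabit link, so simultaneous recording writes and seek patterns are more likely to matter than raw throughput.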

Quote :

2) How well does it scale with RAID5? If I started with say 3x750GB drives, can I just add in a 4th, 5th, etc and have the disk incorporated into the array?


yes (http://www.tldp.org/HOWTO/Software-RAID-HOWTO-3.html#ss3.2)

Quote :

3) How well does it handle failures? Can it support hot-spares and hot-swapping?


Same as above. As for hot swapping, it depends on the controller chip. SATA II controllers generally support it, but check the specific chip and its Linux driver.

Quote :

4) If I had 8 on-board SATA ports, would there be any way to add additional disks down the line?


through a dedicated card, i guess. for example: 8 on board ports plus 4 on a dedicated card.

Quote :

5) An advantage of hardware RAID is that the host CPU isn't affected, but with software RAID it is. I've heard that any new processor is able to handle the loads, but that is usually from smaller number of drives and smaller sized arrays. If I did get this thing up to several TBs, is that going to be an issue?


I guess it depends on the load, not on the size. How many people are accessing it at the same time, and how often? Anyway, I think a setup that can handle high I/O matters more than raw parity-calculation speed, for instance.

oh, at this site there's some good info:

http://www.tldp.org/HOWTO/Software-RAID-HOWTO.html

More Information

That does sound promising, but in that site's section on reconfiguring arrays and adding disks, the tool it recommends is described as not "production ready."

I'd hate to lose several TBs. So, has anyone used that tool? Because, if I was understanding that correctly, that would be what I would use to add additional drives *after* the original array was created.

Further clarification of a scenario that I would want to be able to do incrementally (RAID5 w/750GB drives):
1 ) Create an initial 3 drive array (1.5TB)
2 ) Add 4th drive and increase capacity (2.25TB)
3 ) Add 5th drive as a hot spare
4 ) Add 6th drive and increase capacity (3TB)
5 ) Add 7th and 8th drives and increase capacity (4.5TB)

If I can do that, then I'm pretty impressed. But if I can:
6 ) Add a 4-port card
7 ) Add 2 drives to the card and have them incorporated into the existing array (6TB)
8 ) Add 2 more drives to the card (7.5TB)

Then I'd be really impressed and am really going to start considering this as an option.
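
The capacities in the steps above follow from RAID5 using one drive's worth of space for parity, with a hot spare adding nothing until it is activated. A minimal sketch that replays the scenario is below. The function and variable names are made up for illustration, and it only does the arithmetic; it says nothing about whether a given tool can actually perform each reshape safely.

```python
# Replay the growth plan: RAID5 usable space = (active drives - 1) x drive size.
# A hot spare sits idle, so it does not count as an active drive.
DRIVE_GB = 750

def raid5_usable_gb(active_drives: int, drive_gb: int = DRIVE_GB) -> int:
    if active_drives < 3:
        raise ValueError("RAID5 needs at least 3 active drives")
    return (active_drives - 1) * drive_gb

# (description, active drives added, spares added)
steps = [
    ("create 3-drive array", 3, 0),
    ("add 4th drive", 1, 0),
    ("add 5th drive as hot spare", 0, 1),
    ("add 6th drive", 1, 0),
    ("add 7th and 8th drives", 2, 0),
    ("add 2 drives on 4-port card", 2, 0),
    ("add 2 more drives on card", 2, 0),
]

active = spares = 0
capacities = []
for description, add_active, add_spare in steps:
    active += add_active
    spares += add_spare
    capacities.append(raid5_usable_gb(active))
    print(f"{description}: {active} active + {spares} spare "
          f"-> {capacities[-1] / 1000:.2f} TB")
```

The results line up with the figures in the list (1.5, 2.25, 3, 4.5, 6 and 7.5 TB), and the final state is 11 active drives plus 1 spare, which exactly fills 8 on-board ports and a 4-port card.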

I did a pretty good amount of reading on software RAID when I set my other box up, and it is stable and has good performance. But since I wasn't looking at adding capacity I didn't read that much about its expansion capabilities.

More Information

Hmmmmm, I haven't read that.
Well, the text below it, about backups, is really important too, and totally complements the text above. In my opinion, it's pretty much like what Partition Magic does in Windows. It resizes an existing partition on the fly, and although some people say it's safe, I've also seen people lose all their data. Partition Magic is supposedly "production ready" and still has its risks.
Changing disks/partitions WITH data on them is a risky task on its own, and even though RAID can increase data security, it does not eliminate the need for backups when you do this.
I personally have never used any tool of this kind, on either Windows or Linux, even though I know there are programs that can resize partitions on the fly for both. When I want to change my partitions I simply back up all my data and re-partition from scratch, no resize.

By the way, when you talked about adding disks I had a different approach in mind. I thought about adding the disks as another volume, not to the existing one; that's why I hadn't thought about the risk of using that tool.

More Information

1) yes
2) yes
3) yes
4) yes
5) yes
6) yes
7) yes
8) yes

More Information

Quote :

1 ) yes
2 ) yes
3 ) yes
4 ) yes
5 ) yes
6 ) yes
7 ) yes
8 ) yes



Haha, "8 )" (without the space) is the code for smiley. I went back and added a space to all of mine to keep it from doing that...

Not doubting you, but have you (or someone you know) done it? Theory and practice are two different things. What utilities did you/they use?

It's just a good chunk of money (not to mention time and effort), and I would hate to go down a path only to realize that it was a dead end.

Hardware & Firmware designer
More Information

Just a little input on CPU: my home file server and firewall is an... AMD K6-2 @ 350MHz!
192MB of RAM, 2x 100Mb eth, 1x 1Gb eth, 1x 80GB HDD (OS drive), 3x 300GB HDD (RAID 5), and it sits idle 96% of the time while drawing only 65W!

You absolutely don't need fast or modern hardware for a file server. If you can get your hands on an old and cheap PC, grab it immediately!
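
To put the 65W figure in perspective for a box that runs around the clock, here is a rough annual energy estimate. The electricity price is a placeholder assumption with no particular currency; substitute your own rate.

```python
# Annual energy use of an always-on box at a constant draw.
WATTS = 65                 # draw quoted for the K6-2 server
HOURS_PER_YEAR = 24 * 365
PRICE_PER_KWH = 0.10       # assumed placeholder rate, not from the thread

kwh_per_year = WATTS * HOURS_PER_YEAR / 1000
cost_per_year = kwh_per_year * PRICE_PER_KWH

print(f"{kwh_per_year:.1f} kWh/year, about {cost_per_year:.2f} per year at the assumed rate")
```

The same arithmetic with a higher wattage shows quickly how much a hungrier CPU would add over a year of uptime.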

More Information

Cool, I like the possibilities of going the software route.

Now, onto the next piece of the puzzle, motherboards.

1) What motherboards do people recommend?
2) I've found plenty that can support 8 SATA drives, but it is usually 4xSATA I and 4xSATA II. Are there any that support 8xSATAII?
3) What are the differences between "server" and "gaming" oriented motherboards? Am I going to notice a performance difference between one and the other?
4) For future expansion, what type of PCI (and its derivatives PCIe, PCI-X, etc) do you recommend?

Thanks, I finally feel as though I am making some progress on this journey!

More Information
n°1180113
08-02-2006 at 08:34:25 PM