RAID Setup for a server

MatiasZ

Hi everyone,

I'm building up a server, and I've had some excellent help from Albuquerque and the maddman in this thread regarding gigabit networking.

I'm posting this other thread because I want your opinion regarding RAID configuration. I have 6x1TB drives for a RAID10 array for data, and 2x320GB in RAID1 for the OS. My current doubts are:

- Which stripe/block size to use for the RAID10? Files that require high performance are big (several megabytes up to a couple of gigabytes), but there are also lots of files in the 10-100KB range. If needed, I could create some sort of graph with the current file-size distribution.

- Where would you connect the OS drives: the Adaptec PCI-E x8 controller where the RAID10 is (two ports are free, so all eight would be full), or the motherboard controller (Intel ESB2)?

Thanks in advance for any suggestions!
 
RAID 10, man that's a lot of redundancy, a lot of controller overhead and a lot of lost space. You'll end up with 3TB in storage after purchasing 6TB worth of drives; that would irk me if it were my hardware. As a matter of personal preference, I'd go for a "normal" RAID5 (5TB available), or if you're really pessimistic, go for a RAID5 + online spare (4TB available).
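The space math for the options above, as a quick sketch (Python; the drive counts and sizes are the ones discussed in this thread):

```python
# Rough usable-capacity figures for the array layouts under discussion,
# assuming 6 x 1 TB drives as in the original post.
def raid10_capacity(n_drives, drive_tb):
    # RAID10 mirrors every drive, so half the raw space is usable.
    return n_drives * drive_tb / 2

def raid5_capacity(n_drives, drive_tb, online_spares=0):
    # RAID5 loses one drive's worth to parity, plus any online spares.
    return (n_drives - 1 - online_spares) * drive_tb

print(raid10_capacity(6, 1))     # 3.0 TB usable
print(raid5_capacity(6, 1))      # 5 TB usable
print(raid5_capacity(6, 1, 1))   # 4 TB usable (RAID5 + online spare)
```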

Anyway, back to the original discussion:

What speed are those drives? Are they SCSI or SATA? Are you going to break up all that disk space into multiple partitions, or just leave it as one HUGE drive letter? What file system(s) will you use on the partition(s)? What will this storage be used for -- general file share, print spooler folders, massive databases, or other worth-mentioning data types? And finally, how ridiculously busy do you expect the partition(s) to be -- are we talking 50 user load, or 5000 user load? :)

Oh, and just for the record, I like and fully agree with the RAID1 setup for OS drives :)
 
from the other thread : "It is a Quad Xeon, with 4GB of ram and 6TB of HD in a RAID10 setup (3TB effective storage), which will be used to serve about 4-6 concurrent clients for HDV video editing."

I'm not sure a RAID5 can do the job. RAID5 is known to be great at storage capacity and reads, but not at writes, and it looks like there will be heavy writing.
RAID10 is simpler and faster, and I also guess they have some kind of pretty big storage solution on another server.
But then again, I don't have much experience with that stuff :oops:
 
Ah, good catch. Missed that entirely, thanks Blazkowicz :)

Other than the obvious link to underlying drive hardware, RAID 5 performance is almost entirely at the mercy of the controller. As such, it's important to know which Adaptec controller you bought for the task...

Sounds like you'll end up with one really big file share either way, and you'll likely want to target "big" file performance since the files you'll likely be dealing with will be pretty massive. I'd suggest 64K stripes, with at least 4K block sizes in the file system.
 
Wow, that was fast :)

Albuquerque, I think most of your questions on the first post were already answered, but just to be sure:

What speed are those drives? Are they SCSI or SATA?
1TB SATA 7200RPM Seagate Enterprise 32MB cache

Are you going to break up all that disk space into multiple partitions, or just leave it as one HUGE drive letter? What file system(s) will you use on the partition(s)?
One huge (as big as it can get!! :D) partition, NTFS.

What will this storage be used for?
- Audio and video (in HDV) file server for editing (read/write+videocapture destination if it works as it should - I've had some bad experience with network "issues" that would drop frames)
- General file share (documents)

In the server there will be some other services, but not relying on this RAID (10 or 5) solution

And finally, how ridiculously busy do you expect the partition(s) to be -- are we talking 50 user load, or 5000 user load?
For the time being, 5-10 users. We are really small :D

I bought what seemed to be the best controller (for this type of system of course) from Adaptec with 8 port configuration: Adaptec 5805. According to this document it should be quite good.

I can try RAID5 if you guys think it could do the job, after all an extra TB is always welcomed ;)

Albuquerque, based on this (not so new) info, would you still recommend 64K stripes + 4K NTFS blocks?

Any suggestions on where to plug the other two hard drives? My first impression would be to go with the onboard controller, but I could be wrong.

Thanks again for all the help, this is incredibly useful for me ;)
 
Well, to Blaz's point, RAID5 will end up being slower than RAID10, but it will net you a total of 5TB of space (RAID5 gives you n - 1 drives' worth of space, with one drive's worth going to parity). So it might be worth running some benches...

However, with those six drives you're going to be smokin' fast in just about any configuration. What will ultimately kill performance on this server will be fragmentation, as you don't have nearly as much seek speed as a 10K or 15K RPM set of drives would net you. As such, I'd set up a scheduled task for the wee hours on the weekends that defrags the partition for you. And if you really want to lean on it, you might consider 8k NTFS chunks... There are obvious tradeoffs as you make the filesystem blocks larger, but if the bread-n-butter of your business is with very large, very bandwidth-intensive file access, then you might as well aim large.

As for the other set of OS drives, yeah, go ahead and plumb them into the onboard Intel RAID controller. Performance might be argued either way, but your data partition will keep the Adaptec controller busy in its own right. Why make it busier with goofy OS tasks?
 
OK, I'm finally trying out the server. I set up the OS drives on the Intel controller, and everything is up and running, with Server 2003 installed with all drivers and OS updates ready.

The system came with RAID10 configured with a 512KB stripe size, and I had to convert the drive to GPT in Windows to be able to use the full capacity. Initial results using HDTach and HDTune: 335MB/s average read, 320MB/s average write. Smokin' fast, as you said =D, or at least that's what it seems to me.

I know there's much debate about this, but considering that files that need fast access are always >5MB (actually they are usually >20-50MB), why would you recommend a 64K stripe size instead of a larger one?

And also, what are the tradeoffs you mention of using 8K NTFS formatting?

I'll try to run some RAID5 benchmarks, though what I'd really like to measure is real-life performance, which is how it will actually be used, but that's hard to do.

EDIT: I've tested RAID5 performance and it seems to be much more than enough; I'll post some benchmarks soon. Since I already have an "outside" hot spare drive (I bought 7, although I can only fit 6 in the case), I've built a six-drive RAID5 (five drives' worth of data plus one of parity) for a total of 4.6TB effective storage.

I've created a file distribution search on the current "server", just to see what filesizes I have, how many files in each size and how much space that takes:

[Attached image: file-distribution.jpg — chart of file count and total space per size range]


Based on this, would you still recommend 64K for the stripe size on RAID5 (Adaptec's default is 256K), and would you use an 8K NTFS cluster size? Drive space waste shouldn't be an issue since 99% of the storage is in >8K files, and I am not going to use NTFS compression as it is useless on audio/video files.
 
The stripe size doesn't really affect file allocation; that's dependent on the block size of your NTFS file system.

I say just leave it as-is. Looks like you're doing pretty well :)
 
The space wasted by cluster size depends not on how large the files are, but on how much leftover space there is in the last cluster. With 8K clusters, a 10K, 18K, 26K ... 802K file would generate the same amount of wasted space as a 2K file: 6K "wasted" in each case. Think MODULO.

The most space you could waste per file is your CLUSTER SIZE minus 1 byte, or (CL - 1). The larger the cluster, the greater the potential for wasted space.
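To make the modulo point concrete, a rough sketch in Python (sizes in KB):

```python
# Slack per file depends only on file_size % cluster_size,
# not on the file's total size.
def slack_kb(file_size_kb, cluster_kb):
    rem = file_size_kb % cluster_kb
    return 0 if rem == 0 else cluster_kb - rem

# With 8K clusters, a 10K, 18K, 26K or 802K file wastes the
# same 6K as a 2K file, because each leaves a 2K remainder:
for size in (2, 10, 18, 26, 802):
    print(size, "->", slack_kb(size, 8), "K wasted")
```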

HOWEVER, you need to balance that out with how much overhead is needed to track which clusters are in use and which are available. Using a 64K cluster size would require 1/8th the overhead resources to keep track of compared to an 8K cluster. Using a 128K cluster size would require 1/16th the overhead resources.

With all that being said, neither seem to be that much of an issue.
 
You're right, I totally missed that. In a worst-case scenario, I would lose NumberOfFiles*(ClusterSize-1). With my current distribution of about 13,000 files, the 64K-cluster worst case would lose about 800MB of space, with those same files taking up 550GB. That's about 0.15% of the space lost. I can surely take that, BUT I have no idea what I'm actually gaining. What kind of overhead am I reducing, by how much, and where does it impact performance?
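Checking that arithmetic with a quick script (the 13,000 files and 550GB are the figures quoted above):

```python
# Worst-case slack: every file wastes (cluster_size - 1) bytes.
files = 13_000
cluster = 64 * 1024                    # 64K clusters, in bytes
worst_case = files * (cluster - 1)     # total wasted bytes

print(worst_case / 2**20)              # roughly 812 MB
total = 550 * 2**30                    # the ~550GB those files occupy
print(100 * worst_case / total)        # roughly 0.14% of the data
```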
 
The overhead is computation and possibly seek time; smaller clusters means more of them, which means a bigger allocation table. More clusters can also accelerate file fragmentation to a certain extent, which then obviously can increase seek time.

There's really no perfect answer; the rule of thumb is generally to use the biggest size that makes sense. So while 64K clusters might burn up 800MB in slack space, they also cut your allocation tracking down to 1/16th of what 4K clusters would need, and your disk has about 94% fewer clusters, which potentially reduces file fragments.
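The numbers behind the "1/16th" and "94%" figures, as a back-of-the-envelope (the ~4.6TB volume size is taken from the RAID5 figure earlier in the thread):

```python
# Cluster counts for the same volume at 4K vs 64K clusters.
volume_bytes = 4.6 * 2**40             # ~4.6 TB data volume

clusters_4k = volume_bytes / (4 * 1024)
clusters_64k = volume_bytes / (64 * 1024)

print(f"4K clusters:  {clusters_4k:.3e}")
print(f"64K clusters: {clusters_64k:.3e}")
# 64K clusters leave 1/16th as many clusters to track:
print(f"reduction: {1 - clusters_64k / clusters_4k:.2%}")  # 93.75%
```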
 
Sounds quite good. I think I'll go for a big size like 32K or 64K, and keep an eye on disk usage and fragmentation.

Thank you all for your answers, it's been incredibly helpful for me.

Best Regards,
Matias
 