|
Building large file servers on a budget, where you mainly put a bunch of big, cheap IDE disks together to get storage in the terabytes, is something more and more people think about and finally do. If you take a closer look, you will see that you face some challenges once you pass the 2 TB mark, at least when you are on a tight budget and want some throughput, which becomes more and more important as the storage size increases. In the end you don't want a very large storage system that takes days or weeks to transfer data to or from.

2 TB within a single server - no problem at all

Currently, my largest file server is about 2 TB. The system is based on a Linux software RAID (level 5) of fourteen 160 GB PATA drives. The core of the system is a Pentium 3 with an Intel 815 chipset. As you might expect, an 815 mobo can't handle 14 RAID drives plus a system disk plus a DVD drive out of the box. To get all the drives connected I used 3 additional PCI cards, each of which supports 2 IDE channels for 4 more drives (each IDE channel can handle one drive as master and one as slave). OK, let's sum up: 2 IDE channels onboard + 3x 2 IDE channels on additional PCI cards equals 8 IDE channels, which means 16 drives (14 for the RAID array, 1 DVD, 1 system disk).

This solution is very cheap. You can get Intel 815 boards and Pentium 3 processors for next to nothing on eBay. The PCI IDE cards are also available on eBay for little money (I have had good experiences with the HighPoint chipsets). You can find some more details about this setup here.

BUT the solution has a drawback as well. The most important drawback is the overload it generates on the PCI bus. The 815 PCI bus (like all PCI buses except those on pricey, dedicated server/workstation mobos) has a theoretical maximum bandwidth of only 133 MB/s (33 MHz, 32 bit). Modern hard disks deliver a transfer rate of at least 20 MB/s, and UDMA33-133 can easily pass that on to the IDE controller. If you have a look at the block diagram of the 815 chipset on the left (other chipsets are similar in that respect), you will see that the on-board IDE controller is directly connected to the southbridge. The other 12 hard disks are connected through their IDE controllers to a single PCI bus. Ouch! 12 times 20 MB/s gives a peak load of about 240 MB/s. As the PCI bus is limited to 133 MB/s, it clearly acts as a bottleneck. To make matters worse, kernel 2.4 reports lost DMA interrupts from the IDE controller when you put heavy traffic on the RAID.
These lost interrupts don't harm anybody, but I consider these kernel warnings evidence that we are pushing a bit hard. So we have definitely reached a throughput limit imposed by the PCI bus.

More than 2 TB within a single server - not so easy

First of all, you have to go with kernel 2.6, as 2.4 does not support RAID partitions and filesystems beyond 2 TB. Today this is not a problem, as 2.6 has become a standard stock kernel. It also looks like kernel 2.6 no longer produces kernel warnings about lost interrupts on an overloaded PCI bus. Even if the warnings are gone, we still face the fact that a 133 MB/s PCI bus is not a good choice for handling even larger data volumes. It is already a pain to transfer 2 TB of data on the system above, so it will get even worse when you stack up more than 4 TB on a PCI bus. Don't do this!

So I asked myself what the new PCI-express chipsets/mobos, which are quite cheap now and offer much improved throughput, have to offer. When you look at the block diagram of the popular nForce 4 chipset from NVIDIA, you realize that it has 2 IDE PATA channels (4 disk drives) and 4 SATA channels (4 additional hard disks) on die. If we assume that it is time to go with SATA for a new file server project, this means that the 2 IDE PATA channels are fine for the system disk and a DVD drive, but that the RAID array should be built purely on SATA. More SATA channels can now be added with additional controller cards (or the onboard solutions of certain mobos) on the PCI-express bus (a maximum of three 1x lanes on current mobos) or the "legacy" PCI bus (with a maximum of 3 PCI slots on current mobos). So you have two buses to equip with additional disk controllers instead of the single PCI bus of the Intel 815 chipset.
This is a big advantage because you can decide whether to add SATA channels by means of a SiI 3132 based card, which is a 2-port SATA controller connected to the PCI-express bus by 1 lane, or by means of a SiI 3114, which is a 4-port SATA controller connected to the legacy PCI bus.

Let's take the EPOX EP-9NPA+ Ultra as an example, which provides 3 PCI-express slots (1x), 3 PCI slots, and another PCI-express slot (16x) for the graphics card. With additional "SATA to PCI-express" cards based on the SiI 3132 (like the Dawicontrol DC-300e) you can get two more SATA channels out of each PCI-express (1x) slot. As nothing else puts load on the PCI bus, you can add another two "SATA to PCI" cards based on the SiI 3114 (like the Dawicontrol DC-154) to the system and get another 8 SATA channels without overloading the PCI bus too much. Summing this up gives 4 (onboard) + 6 (3 times SiI 3132) + 8 (2 times SiI 3114) = 18 SATA channels for your RAID solution. Wow!

With 18 disks in a RAID you should really think about RAID level 6 (2 disks of redundancy). At the moment SATA disks with a capacity of 300 GB hit the sweet spot (minimum price per GB, today about 0.30€/GB). Putting 18 of these drives into a RAID level 6 gives a total usable capacity of 16 * 300 GB = 4.8 TB!

I know, now you will say that 18 drives (plus DVD, plus system disk) will never fit in a normal, cheap server tower. Well, have a look at the Chieftec Bravo BA-01B-B-B, a big tower case for about 80€. This case gives 8 drives a home in the lower compartment and provides six 5.25 inch slots in the upper part. One of these is for your DVD-ROM; the other 5 can be converted into housing for at least 7 hard disks with a handmade mesh cage (look here). So you have room for 8 + 6 RAID disks, a DVD drive and the system disk without any major hassle. Where do I place the remaining 4 disks? Oh, that is easy. You have plenty of room above the power supply to put the remaining 4 drives there. Voila!
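
The sums in this article (ports, usable RAID capacity, peak bus load) are easy to re-check with a few lines of Python. This is only a rough sketch: the helper names `raid_usable_gb` and `bus_load_mb_s` are made up for illustration, and the 20 MB/s per disk is the rough minimum assumed earlier in this article, not a measured value.

```python
# Back-of-the-envelope check of the numbers used in this article.
# The helper names are hypothetical and not part of any library.

PCI_PEAK_MB_S = 133      # theoretical 32 bit / 33 MHz PCI limit quoted above
DISK_MB_S = 20           # assumed minimum sustained rate per disk


def raid_usable_gb(disks, disk_gb, parity_disks):
    """Usable capacity of a RAID 5 (1 parity disk) or RAID 6 (2 parity disks)."""
    if disks <= parity_disks:
        raise ValueError("need more disks than parity disks")
    return (disks - parity_disks) * disk_gb


def bus_load_mb_s(disks, per_disk_mb_s=DISK_MB_S):
    """Peak load if every disk on the bus streams at the same time."""
    return disks * per_disk_mb_s


# SATA ports of the example build (controller counts taken from the text)
sata_ports = {
    "nForce 4 onboard": 4,
    "3x SiI 3132 on PCI-express 1x": 3 * 2,
    "2x SiI 3114 on legacy PCI": 2 * 4,
}
total_ports = sum(sata_ports.values())

print("SATA ports:", total_ports)
print("old server, RAID 5:", raid_usable_gb(14, 160, 1), "GB")
print("new server, RAID 6:", raid_usable_gb(total_ports, 300, 2), "GB")
print("old PCI bus peak load:", bus_load_mb_s(12), "of", PCI_PEAK_MB_S, "MB/s")
print("new PCI bus peak load:", bus_load_mb_s(8), "of", PCI_PEAK_MB_S, "MB/s")
```

Under the same 20 MB/s assumption, the 8 disks that stay on the legacy PCI bus can still peak slightly above 133 MB/s, but that is far less than the 240 MB/s of the old all-PCI setup, and the other 10 RAID disks do not touch that bus at all.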

As you can see, by using the new budget PCI-express platforms and a mix of PCI-express and legacy PCI cards with SATA controllers, you can really get a very fast 4.8 TB file server - right now! |