
www.Usenet.com
"David Sworder" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > This is really great information. I apologize for the basic questions, > but I've only been examining this stuff for the better part of one day. :) > Let me ask you a few follow ups... > > > Just to put a number on things, let's say you've got a RAID 5 array of > > five drives with 5ms typical access time (pretty optimistic). So each > > drive can do about 200 I/Os per second. So you could sustain > > something like 1000 random reads per second (with zero writes).... Just to be a bit more complete: As Robert noted, 5 ms. for an average single random access is a bit optimistic: the fastest current 15Krpm drives take about 5.5 ms., 10Krpm drives take more like 7 - 8 ms., and 7200 rpm ATA drives take 12 - 13 ms. However, that's for requests submitted serially, such that one request is satisfied before the next is submitted. If the workload performs many tasks in parallel such that multiple requests can be submitted without waiting for any to complete (as FC and SCSI disks allow but most ATA disk to not - yet), the average latency goes up (because all but the first one satisfied is waiting in a queue) but the throughput does as well (because the disk can pick an optimal order in which to satisfy them that minimizes the latency betweeen them): if your request stream has sufficient parallelism, the throughput of an individual disk can easily double - though each request will sustain on average much more latency than it would in a serial stream, so if individual response times are critical spreading the requests across a larger array will improve it even though the per-disk throughput will decrease. > > I don't quite understand this concept. You've got five drives, each of > which can handle 200 I/Os per second. You're multiplying 5*200 to get 1000 > IOPs for the array. I understand your calculation but I'm not sure why it > works as you state. 
> In a trivial example, let's say the RAID controller is
> instructed to read 5 bytes of data. This is considered one IO by the RAID
> controller, but doesn't the RAID controller then have to issue *5* read
> commands, one to each disk? My understanding of RAID (as it applies to
> reading data) is that the 5 disks would always be accessed simultaneously in
> order to speed up the read process. So for each IO read-request that the
> RAID controller receives, it has to issue 5 IO requests, one to each drive.
> So it seems that the RAID controller would *still* be limited to 200 IOPs,
> regardless of how many drives are on the array. Why is it that you say
> the reality of the situation is that the RAID controller can actually handle
> 1000 IOPs? I don't understand.

As already noted, most RAID implementations do not work this way. Instead,
data is spread across the disks in the array in coarser chunks - usually no
smaller than 4 KB per disk, often 64 KB per disk, and there are good reasons
in most workloads to make them even larger. A small read therefore touches
only one disk, leaving the others free to service other requests at the same
time, which is why the per-disk rates add up. Some early implementations of
RAID-3 distributed the data at finer grain (much as you describe above), but
I've never heard of RAID-0, -1, -4, or -5 doing so.

>
> > With extensive caching at the RAID controller you can get higher
> > numbers, but on modest sized systems like the one you're discussing,
> > it's almost always a better idea to put extra cache into the server
> > instead of the disk subsystem.
>
> When you say that the extra cache should be put in the server but not on
> the RAID controller or disk subsystem, what do you mean exactly? Where in
> the server would I want to increase the cache?

Just adding server RAM will normally suffice: the operating system should
put it to good use caching data for most workloads. A few workloads (those
that perform lots of small writes and require that each complete before the
next is submitted) might benefit more from cache in the array controller.

Having *some* cache in the controller that allows it to defer disk writes
until a convenient opportunity (and hence significantly decrease their
overhead) is desirable, though. It must be non-volatile, such that its
contents aren't lost if power fails: some people trust a simple external UPS
to suffice here, but having a back-up battery right on the array cache card
tends to be safer. And to provide safety equivalent to the RAID array behind
it, the cache really needs to be duplicated (otherwise, it becomes a single
point of failure).

- bill
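
For anyone who wants to check the arithmetic in this thread, here is a rough
Python sketch. It is not from the original posts: the function names are made
up for illustration, and the layout assumed is plain round-robin striping
with parity rotation ignored, so treat it as a simplified model rather than a
description of any particular controller. It shows where the 200 and 1000
figures come from, and why a small read lands on one disk while only a read
spanning a full stripe touches all five.

```python
# Simplified model (assumption): round-robin striping, no parity rotation.

def disk_iops(access_ms):
    """Serial random I/Os per second for one disk with this access time."""
    return 1000.0 / access_ms


def array_read_iops(n_disks, access_ms):
    """Upper bound on random reads/s when each read hits a single disk."""
    return n_disks * disk_iops(access_ms)


def disks_touched(offset, length, chunk_size, n_disks):
    """Sorted disk indexes touched by a read of `length` bytes at `offset`."""
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    return sorted({chunk % n_disks
                   for chunk in range(first_chunk, last_chunk + 1)})


CHUNK = 64 * 1024  # 64 KB per disk, one of the chunk sizes mentioned above

print(disk_iops(5.0))                        # 200.0
print(array_read_iops(5, 5.0))               # 1000.0
print(disks_touched(0, 5, CHUNK, 5))         # [0]
print(disks_touched(0, 5 * CHUNK, CHUNK, 5)) # [0, 1, 2, 3, 4]
```

Plugging in the slower access times quoted earlier (say 12.5 ms for a 7200
rpm ATA drive) gives 80 I/Os per second per disk under the same assumptions.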