10 Aug., 2007
The hard disk has one or more metal platters coated top and bottom with a magnetic material similar to the coating on a VCR magnetic tape. In the VCR the tape moves by a fixed recording and sensing device (the "head"). In a disk, the recording head is on a movable metal support called the "arm".
Information is recorded onto bands of the disk surface that form concentric circles. The circle closest to the outside is much bigger than the circle closest to the center. Since each metal platter has a top and bottom surface, there are at least two magnetic circles for each size and location. However, a disk may have as many as five platters, producing ten of these identical circles at the same distance out from the center.
There is a separate magnetic read/write head for each disk surface. With five platters there are ten heads. They are all fixed on metal "arms" that can move the head from the innermost to the outermost circular position on the disk surfaces. The arms all move together, so if there are 10 disk surfaces the read/write heads are in the same location on all 10 surfaces simultaneously.
One circle on a disk surface, equivalent to one position of the read/write head on the end of the arm, is called a track.
All of the tracks on all of the surfaces that correspond to the same arm position used to be called a cylinder. However, since the outer circles are bigger they can hold more data, and modern disks hide the actual location of the arm so that no operating system can determine exactly where it is. So today the term "cylinder" is typically used to refer to a certain number of megabytes of disk space that can be assigned to a partition (a disk letter in Windows).
To find some particular piece of data, the disk must move the arms to the position where it is located. This is called a seek.
Then it must wait for the data on the disk surface to rotate around until the start of the data comes under the read head. This is called rotational latency.
There are three types of disks in common use: desktop, laptop, and enterprise.
Desktop disks typically hold 250 to 500 gigabytes, although 750G disks are available and 1 terabyte disks are coming. Laptop disks are available in up to 250 gigabyte sizes, though 80 to 160G is more common. Enterprise disks come in 150 and 300G sizes.
Desktop and laptop disks come with parallel ATA and SATA connectors. Newer systems use SATA, so parallel ATA is needed only to replace or upgrade an older system. Enterprise disks typically use SAS (Serial Attached SCSI).
Western Digital produces one disk family called the Raptor. This is an Enterprise class disk with a SATA connector sold to desktop users. It is by far the best performer for your primary (operating system) disk, but it does have a higher cost. However, the dirty little secret is that 99% of ordinary Windows, Office, Web Browsing, and EMail reading doesn't use much CPU or video and will not be improved by the stuff you usually spend money on. If, instead, you bought the absolutely lowest cost mainboard, CPU, memory, and video and replaced the ordinary hard drive with a Raptor, your typical user would find the system boots twice as fast and runs faster than some pimped out gaming rig costing five times as much but equipped with ordinary disks.
It requires more power to rotate a disk faster and to move the arms (seek) faster. At the highest speeds, limitations on how fast the heads can read and write limit the amount of data you can put on the surface. This leads manufacturers to a variety of neat tricks.
The conclusion is that while all disks work about the same way, in any given generation of hardware there are opportunities to tweak, optimize, and specialize. Be careful when you purchase a disk intending to get a general purpose system that you aren't buying one of the disks optimized for video surveillance.
Although I have described all 3.5" 7200 RPM SATA disks as "Desktop" class, some are a little more rugged than others. They are sold for Enterprise applications that require bulk storage but not high performance. While databases should go on SAS disks, there are times when a 750G or 1T disk with lower performance makes better sense. Enterprise SATA disks typically cost a bit more, are more durable, and have a longer warranty. Some have reduced error recovery, which turns out to be a good thing. Desktop disks are programmed to retry and retry any failing request. If there is a bad spot on the disk, the system can hang up for as much as a minute while the disk tries over and over to read the bad data. This is not always a good idea. With reduced error recovery ("normal" error recovery in historical terms) the disk will only retry a rational number of times and will respond in a second or two with an error if the data is bad. Disks with reduced error recovery may be the one type of specialized disk model that every intelligent user could plausibly select for general use.
You have now been given all the numbers, but you won't appreciate what they mean until we run a few calculations.
Suppose you read some random small pieces of data scattered all over the disk. Each time you jump from one piece of data to the next, you have to wait for a seek to complete and then for the rotational latency. If you add the 9 millisecond average seek to the average latency (half of a single 7200 RPM rotation) you discover that a typical desktop hard drive can do about 75 random reads per second.
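The arithmetic behind that figure is easy to check. The sketch below is only an illustration: the function name and parameters are mine, and it assumes every read costs exactly one average seek plus half a rotation, ignoring transfer time and any caching.

```python
def random_reads_per_second(seek_ms: float, rpm: float) -> float:
    """Estimate random reads per second for a single disk.

    Assumes each read costs one average seek plus, on average,
    half a rotation. Transfer time and caching are ignored.
    """
    rotation_ms = 60_000.0 / rpm        # time for one full rotation
    latency_ms = rotation_ms / 2.0      # average rotational latency
    return 1000.0 / (seek_ms + latency_ms)


# Desktop figures from the text: 9 ms average seek, 7200 RPM.
desktop = random_reads_per_second(seek_ms=9.0, rpm=7200)
print(f"{desktop:.1f} random reads per second")
```

One rotation at 7200 RPM takes about 8.3 milliseconds, so the average latency is about 4.2 milliseconds and each read costs roughly 13.2 milliseconds in total. The result comes out just under 76, which is the "about 75" quoted above.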
Most people measure disk performance in terms of sequential transfer of large data files. A desktop disk can transfer 40 to 60 megabytes per second if all the data is written sequentially on consecutive tracks and cylinders. However, if you are reading 4K chunks of data scattered randomly around on the disk surface, you can only read 300K of that data in the 75 operations you perform each second.
There is a massive difference between 300 thousand and 60 million bytes per second. This is why you cannot really state what disk performance is going to be unless you know what you are asking it to do.
Obviously you improve overall performance by periodically defragmenting the system disk to make data sequential. That is not, however, the whole story. Suppose you are reading two sequential files that happen to be on different parts of the same disk. If the system reads 4K from one file and then 4K from the other file, with a seek and rotational delay between each file switch, then it will be experiencing data transfer in the 300K per second range. If, on the other hand, the system reads 10 megabytes from one file and then 10 megabytes from the other file (and caches up the data in memory even before the program asks for it) then performance will be close to the 60 megabytes per second.
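To see how much the chunk size matters, here is a rough model of the two-file case. It is a sketch under stated assumptions, not a measurement: the function name is hypothetical, every switch between files is assumed to cost one average seek plus half a rotation, and the data within a chunk is assumed to stream at the full 60 megabytes per second.

```python
def interleaved_throughput(chunk_bytes: float,
                           seek_ms: float = 9.0,
                           rpm: float = 7200,
                           sequential_bps: float = 60e6) -> float:
    """Bytes per second when alternating between two files on one disk.

    Each switch between files is assumed to cost one average seek plus
    half a rotation, after which chunk_bytes are read at the sequential rate.
    """
    switch_s = (seek_ms + (60_000.0 / rpm) / 2.0) / 1000.0
    transfer_s = chunk_bytes / sequential_bps
    return chunk_bytes / (switch_s + transfer_s)


# Compare tiny chunks with large ones (10M taken as 10 million bytes).
for label, size in (("4K", 4 * 1024), ("10M", 10 * 1000 * 1000)):
    rate = interleaved_throughput(size)
    print(f"{label} chunks: {rate / 1e6:.2f} MB/s")
```

With 4K chunks nearly all of the time goes to arm movement and the rate stays around 0.3 megabytes per second. With 10 megabyte chunks the 13 millisecond switch is small next to the time spent transferring, and the rate climbs into the mid 50s.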
Operating systems will detect when a program is reading data sequentially and will "read ahead" to reduce arm movement and improve performance. However, this type of automatic optimization is conservative and only makes things two or three times better.
The place where this is most obvious is when you are editing or reencoding a large video file. If you try to save the output file onto the same disk where you are reading the input file, then the arm will be jumping back and forth. The processing will take 10 times as long as it would if the output file is written to a different hard disk.
Although modern computers have lots of free memory and could solve the problem with read-ahead and buffering, given current operating system behavior the real solution to this problem is to have more disks and to optimize locations so that whenever possible, different files being used by the same program are on different disks.
Enterprise disks run twice as fast as desktop disks. This is better, but it doesn't solve what is now a 600K per second versus 80M per second spread (random reads are still more than 100 times slower). However, while corporate disks may not make much difference, the microcode in a RAID adapter or Storage Area Network (SAN) may do more aggressive read-ahead than the operating system. Just don't count on it.
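The two classes can be put side by side with a few lines of arithmetic. This is a sketch with assumed inputs: the class and field names are made up for illustration, the desktop numbers are the 9 millisecond seek, 7200 RPM, and 60M per second used earlier, and the enterprise numbers assume a 4.5 millisecond seek, 15K RPM, and the 80M per second sequential rate mentioned here.

```python
from dataclasses import dataclass


@dataclass
class DiskClass:
    name: str
    seek_ms: float          # average seek time
    rpm: int                # rotation speed
    sequential_bps: float   # best-case sequential bytes per second

    def random_bps(self, chunk_bytes: int = 4096) -> float:
        """Bytes per second when every chunk needs a seek plus half a rotation."""
        latency_ms = (60_000.0 / self.rpm) / 2.0
        reads_per_sec = 1000.0 / (self.seek_ms + latency_ms)
        return reads_per_sec * chunk_bytes

    def spread(self) -> float:
        """How many times faster sequential transfer is than random 4K reads."""
        return self.sequential_bps / self.random_bps()


desktop = DiskClass("desktop", seek_ms=9.0, rpm=7200, sequential_bps=60e6)
enterprise = DiskClass("enterprise", seek_ms=4.5, rpm=15000, sequential_bps=80e6)

for disk in (desktop, enterprise):
    print(f"{disk.name}: {disk.random_bps() / 1000:.0f}K/s random, "
          f"{disk.spread():.0f}x slower than sequential")
```

Under these assumptions the enterprise disk manages a little over 600K per second of random 4K reads, about twice the desktop figure, and that is still more than 100 times slower than its own sequential rate.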
There is no substitute for manually optimizing things that you do all the time. In a corporate database, for example, you always put the "log" file on a physically different disk than the data (or the arm will be constantly jumping between the two). A desktop computer user should think about how often he copies or processes large files (for example, removing commercials from recorded TV programs). Putting the input and output files on different disks will cause the operation to run more than 10 times faster.
Using cache (in the computer memory, on the disk controller, or on the disk itself) will optimize the random requests you cannot anticipate. Careful positioning of data for things you do over and over will have a much greater effect. You might think it would cost more money to have two disks than to have one, but it depends on the amount of data you store. Two 250 Gigabyte desktop computer disks actually cost less than one 500 Gigabyte drive, and three of them cost a lot less than one 750 Gigabyte disk. You pay a premium for large devices, but a larger number of small independent devices will always perform better. Of course, you must plan for such a configuration. You need room in the case for more drives, and you need SATA connectors on the mainboard, and you need power from the Power Supply.
One of the most widely quoted performance characteristics is totally meaningless. A desktop disk can read data at a maximum rate of 40 to 60 megabytes per second. An ATA or Serial ATA connection may be advertised at 100, 150, or 300 megabytes per second. That speed represents the burst speed for transferring data from the disk cache to the computer, but the disk performance still remains a tiny fraction of this nominal transfer rate.
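A quick way to convince yourself is to treat the disk and the cable as two pipes in series, where the slower one sets the pace. The sketch below is a deliberate simplification with made-up function names: it ignores short bursts served from the disk cache and assumes the data is laid out sequentially.

```python
def sustained_rate(disk_mb_s: float, interface_mb_s: float) -> float:
    """Sustained transfer rate is capped by the slower of platter and link."""
    return min(disk_mb_s, interface_mb_s)


def transfer_seconds(file_mb: float, disk_mb_s: float, interface_mb_s: float) -> float:
    """Time to read a sequential file of file_mb megabytes."""
    return file_mb / sustained_rate(disk_mb_s, interface_mb_s)


# A 1000 megabyte file on a disk that reads 60 MB/s from the platters.
for link in (100, 150, 300):
    seconds = transfer_seconds(1000, disk_mb_s=60, interface_mb_s=link)
    print(f"{link} MB/s interface: {seconds:.1f} seconds")
```

The time is the same for all three interface speeds, because in every case the platters are the bottleneck.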
The IBM PC had a disk controller card. By 1990 it became possible to put the control logic on the disk itself, and since then disk electronics has become smarter to perform better or store more data. Some type of connection was needed between the mainboard and the disk. To simplify and save money, this connection was based on the speed, commands, and behavior of the mainboard adapter cards. This bus was standardized by IBM in 1985 under the "PC AT" brand name, so the disk connector became the AT Attachment or ATA.
Originally ATA used a flat "parallel" ribbon cable with 40 pins. That was fine for up to 16 MHz, but at high clock speeds there was too much interference between adjacent wires. The solution was to add a dummy ground wire between each signal wire, producing the modern 80 wire cable that connects to 40 pins (where every other wire is grounded to block interference).
Parallel ATA lasted for over a decade, but like all parallel technologies in the computer it eventually ran into a barrier. Modern chips can transmit data down a single pair of wires much faster than they can coordinate signals coming in through 40 parallel wires. This produced the Serial ATA or SATA disk, cable, and mainboard controller. SATA has one pair of wires to transmit and one pair to receive, but like the old ribbon cable it adds three dummy ground wires in between the signal wires to avoid "crosstalk" interference.
Although the SATA cable and connector are quite different, the signals sent between the mainboard and the disk are exactly the same as the signals transmitted over the old 40 pin flat parallel cable. It is possible to convert a parallel connector into a serial plug and vice versa, although today both types of disks are readily available and such conversion is an unnecessary expense.
In 1985 the PC AT transmitted data to or from the disk two bytes at a time using IN or OUT CPU instructions. The CPU had to go into a loop transferring data two bytes per instruction until the entire block had been transferred. This is called Programmed IO (PIO) and the ability to operate this way is still built into every computer (for compatibility and for use at boot time) even though much faster and more efficient bulk data transfer mechanisms are available.
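The shape of that loop can be shown with a small simulation. Nothing here touches real hardware: the FakeDataPort class is a hypothetical stand-in for the disk's 16-bit data register, and the 512 byte block size and little-endian byte order are assumptions made for the example.

```python
class FakeDataPort:
    """Stand-in for the disk's 16-bit data register (hypothetical)."""

    def __init__(self, sector: bytes):
        self._sector = sector
        self._pos = 0
        self.reads = 0      # how many simulated IN instructions were issued

    def read_word(self) -> int:
        """Return the next two bytes as one 16-bit word, like one IN instruction."""
        word = int.from_bytes(self._sector[self._pos:self._pos + 2], "little")
        self._pos += 2
        self.reads += 1
        return word


def pio_read_sector(port: FakeDataPort, sector_bytes: int = 512) -> bytes:
    """Copy one block by looping over 16-bit reads, as the CPU does in PIO mode."""
    buffer = bytearray()
    for _ in range(sector_bytes // 2):
        buffer += port.read_word().to_bytes(2, "little")
    return bytes(buffer)


sector = bytes(range(256)) * 2      # 512 bytes of sample data
port = FakeDataPort(sector)
data = pio_read_sector(port)
print(port.reads, "port reads for one", len(data), "byte block")
```

A single 512 byte block takes 256 trips through the loop, and the CPU can do nothing else while it is making them.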
Unfortunately, when something is wrong with your disk cable or its connectors, bad enough to interfere but not to make data transfer completely impossible, Windows will discover the problem and, instead of generating a big message on the screen to reconnect or replace your disk cable, it will silently fall back into PIO mode. At that point your disk performance goes to hell and your CPU utilization runs at 100% to do even trivial I/O operations. Look in Windows Device Manager at the ATA disk controllers. There are two devices on each controller, and Windows will show you what speed they are running at. If any is in PIO mode, shut the system down and replace or replug the disk cables. Note that PIO can be selected as a fallback even for SATA disks.
An old Parallel flat ATA cable has two devices. One must be declared to be M (address 0) and the other must be S (address 1).
Each disk or DVD drive has jumpers. One setting is for M, the other for S. If both disks on a cable are set to M (or both to S) then they won't work. Modern disks have an alternate jumper setting called Cable Select. If all disks are set to Cable Select, then the disk at the end of the cable will become M and the disk connected to the middle connector will be S.
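The jumper rules can be written down as a small function. This is only a sketch of the rules as described here: the function name and the "CS" label for Cable Select are mine, and the handling of a cable that mixes Cable Select with an explicit jumper is an assumption, since that case is not covered above.

```python
def resolve_roles(end_jumper: str, middle_jumper: str) -> dict:
    """Work out which drive is M and which is S on one parallel ATA cable.

    Jumper values are "M", "S", or "CS" (Cable Select). With "CS" the
    role comes from the connector position: end of cable is M, middle is S.
    """
    by_position = {"end": "M", "middle": "S"}
    roles = {}
    for position, jumper in (("end", end_jumper), ("middle", middle_jumper)):
        if jumper not in ("M", "S", "CS"):
            raise ValueError(f"unknown jumper setting: {jumper!r}")
        # Cable Select takes the role of its connector; otherwise the jumper wins.
        roles[position] = by_position[position] if jumper == "CS" else jumper
    if roles["end"] == roles["middle"]:
        raise ValueError(f"both drives claim {roles['end']}; they will not work")
    return roles


print(resolve_roles("CS", "CS"))    # both drives on Cable Select
print(resolve_roles("S", "M"))      # explicit jumpers, position does not matter
```

Calling it with both jumpers set to M (or both to S) raises an error, which is the software equivalent of the two drives refusing to work.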
SATA cables connect one disk to the mainboard. The M or S is handled automatically and is never a problem.
SATA is better than old parallel ATA and will be more useful in the future. Since a desktop disk can only transfer data at around 60 megabytes per second, the difference between the 150 MB/s and 300 MB/s SATA connections is meaningless.
The only interesting desktop choice is the Western Digital Raptor, which seeks twice as fast and rotates at 10000 RPM instead of 7200 RPM.
It is generally cheaper or the same price to buy two 250G disks as one 500G disk. More disks allow you to spread files around and optimize the performance of frequently repeated operations (or separate the I/O of different things you do at the same time). So look at the number of disks you can put in your case and the number of SATA connectors on your mainboard and plan to fully populate your system with disks (rather than putting all your money into a single 750G disk).
There has been a lot of loose talk about hybrid disks with some flash memory. They have not appeared as expected and will not be discussed until they are a real alternative.
SCSI is the name of a particular family of connectors and a command set. In current usage, however, SCSI mostly means the connection used by Enterprise class disks with 4.5 millisecond seek times and 10K or 15K RPM rotation.
The SCSI command set does provide more powerful options for optimization than are available for dumb ATA drives. However, effective use of this command set requires a lot of disks and some fairly powerful system software.
Modern Serial Attached SCSI (SAS) disks use basically the same type of connection as a SATA disk. One pair of wires sends data from the computer to the disk while another pair sends data from the disk to the computer. Different grades of wire can then be used to extend the distance between the computer and the disk so that a SAS disk can be a few meters away from the computer. In corporate server rooms, the disks can be in a different part of the rack than the CPU connected to them.
The most valuable part of the SAS architecture is seldom used, particularly when all you do is buy a standard server from Dell. Each SAS disk has two data connectors and has an "address" similar to the address on an Ethernet adapter. Each SAS controller has four or eight connectors and each of those connectors has an address. Now inside a Dell server each disk has one of its two attachments connected to one port on the controller. The effect is almost identical to SATA.
However, just as many computers, printers, and routers can be connected to each other using Ethernet cables and switches, so it is possible to get SAS switches that connect a bunch of disks to a bunch of computers. Physically any computer could in theory talk to any disk, but you configure each computer with the addresses of the disks it should use. Everything works the same until a computer (or maybe just a RAID adapter card) fails for some reason.
Then if you are using the full SAS architecture, a computer may have a second RAID card that can also get to the same disk using a different path. Or if the entire computer goes down, a backup computer can connect to the disks and pick up where the first computer left off. All of this is possible given the SAS architecture, but it is not typically the solution that hardware vendors sell. Instead, they treat SAS disks the same as an older generation of SCSI disks, and if you want to store disks separate from the computer case itself then the vendors would rather sell you some type of SAN device for big bucks rather than a SAS switch.
Copyright 1998, 2007 PCLT -- Introduction to PC Hardware -- H. Gilbert