15 April, 2008

The Rules of Circuits

To understand how circuits work, consider plumbing. Electricity runs through wire much like water runs through a pipe. The amount of water that flows through the pipe is equivalent to the current of electricity through the wire. The water pressure is equivalent to voltage.

There is one slight problem with the plumbing analogy. You fill a bathtub with water using the water pressure in the line, but you empty a bathtub by opening the drain and letting gravity push the water down. Circuits, however, use both positive and negative voltage, so the speed with which electric charge drains from a circuit is the same as the speed at which it fills (determined by voltage).

For a lamp, stereo amplifier, or drill, the measure of quality is more "power". The opposite is true for computers. A computer measures data as 0 and 1, on and off, empty or full. When we talk about a "powerful" computer, we mean a machine that can perform its calculations more quickly. However, each calculation only requires accurate measurement of the state of the circuits. Real electrical power, measured in "watts" like a light bulb, is waste heat.

In direct contact with the top of the CPU chip is a block of metal called the "heat sink". It is solid at the bottom, but has cooling fins at the top. Waste heat generated by the CPU is conducted into the heat sink. A fan blows air over the fins. The heat is transferred to the air, which is then blown out the back of the computer. A 486 CPU generated about 4 watts of waste heat. A Pentium III generated around 25 watts. A Pentium 4 generated 80 watts. A modern quad core chip generates 95 watts for Intel and 125 watts for AMD.

Heat has become the most important problem in CPU design. Chips could be much faster if the engineers could find some way to reduce the waste heat. Unable to do this, engineers must content themselves with more efficient systems to conduct the heat away from the CPU chip and dump it into the room.

Size Matters (Small is Better)

To indicate a 1, you have to fill something. It doesn't matter what you fill as long as you can quickly fill or empty it and you can accurately measure whether it is full or empty. Obviously, it takes less time to fill a shot glass with water than it does a bathtub. Not only is it faster, but it takes less water (current), and you can do it with a much lower water pressure (voltage). You can also empty it faster.

So the trick to computer circuits is to make them as small as possible, so that filling them with electrons, or removing all the electrons, can be done in the least amount of time with the least work. The size of circuits is determined by the width of the smallest line that the chip manufacturing technology can draw.

Circuit technology is measured in nanometers (billionths of a meter). Recent generations of CPU chips used 250, then 130, then 90, then 65, and now 45 nanometer circuits. The next step will be 32 nanometers.

These numbers measure the length of a side. Since circuits are built in two dimensions, the actual measure of circuit size is area, the square of the length measurement. If you square each of the previous numbers, you will see that the circuits in each generation are roughly half the size of circuits in the previous generation.
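The arithmetic is easy to check. The short Python sketch below squares the ratio between each pair of adjacent sizes from the list above. The name area_ratios is just an illustrative choice, not something from any chip vendor.

```python
# Feature sizes in nanometers, as listed in the text.
NODES_NM = [250, 130, 90, 65, 45, 32]


def area_ratios(nodes):
    """Return the area of each generation relative to the previous one.

    Area scales with the square of the line width, so each ratio is
    (new / old) ** 2.
    """
    return [(new / old) ** 2 for old, new in zip(nodes, nodes[1:])]


if __name__ == "__main__":
    steps = zip(NODES_NM, NODES_NM[1:])
    for (old, new), ratio in zip(steps, area_ratios(NODES_NM)):
        print(f"{old} nm -> {new} nm: area ratio {ratio:.2f}")
```

Every step from 130 nanometers onward comes out close to 0.5. The step from 250 to 130 is closer to a quarter, because the list above skips over a generation in between.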

A Dripping Faucet

As circuits have become smaller, chips have begun to leak current. As wires and transistors shrank, the space between the wires shrank too. This space is supposed to be an insulator, but as the distance between two wires gets smaller and smaller, some current begins to leak across the thinner insulation barrier.

There is also a problem inside the transistors themselves. In a computer, a transistor circuit is supposed to be "on" or "off". We think of this difference like a light switch, but following the analogy with plumbing it is something like a faucet that is open or closed. The problem here is that as the size of the transistor gets smaller, some current leaks through the insulating barrier when the transistor is "off". It is something like water dripping through a faucet with a bad washer. The circuit still works. It is easy to tell the difference between a faucet that is open with water running from one that is almost closed with water dripping.

Not only is this a problem, but the physics causes it to get exponentially worse as circuit size gets smaller. The solution is to use different material to make the chip with better insulation properties. Intel introduced such material in its current 45 nanometer family of processors, but AMD is not expected to make the change until it moves to 32 nanometers. This could be a serious problem for AMD.

Work is Heat and Heat is Work

Flanders and Swann wrote this in a song about the laws of Thermodynamics. Their summary runs "Work is Heat and Heat is Work." In plumbing, it takes work to move the water up to your second floor bathroom. You don't see the work because it is being done by massive pumps at the Water Company. However, if you had to pump the water yourself, or carry it upstairs in buckets, you would immediately recognize that work is involved. As you build up a sweat, you realize that work is heat.

The smaller a circuit is, the less work is required. It requires a lot less effort to carry up the stairs enough water to fill a shot glass than it does to fill a bathtub.

Every modern CPU chip generates more heat than it can tolerate. By itself, the chip will overheat in a few seconds and stop running. A few years back, someone posted pictures on the Web of a test in which they cooked an egg on a standard CPU chip. In a real computer, however, heat is just a waste product that must be discarded.

Cooling a CPU follows essentially the same rules as cooling a car engine. You can pump liquid through a block of metal that covers the chip, then vent the heat through a radiator. Most computers today, however, opt for a simple block of metal with cooling fins (a "heat sink") that make the CPU air cooled. Anyone alive in the 60's may remember that the old Volkswagens had air cooled engines that worked the same way.

Smaller circuits generate less heat, but the heat they do generate is concentrated in a smaller area. This may require better cooling. The most recent processors require a layer of copper between the CPU and the heat sink, because copper conducts heat better than aluminum. The very best heat sinks are all copper, but that presents another problem. Copper is much heavier than aluminum, and an all copper heat sink can add more weight than the motherboard can support, particularly when the system is moved.

Overclocking

If a zero bit is represented by empty and a one bit is represented by full, then under ideal circumstances every measurement would find every circuit either entirely empty or entirely full. However, there are slight imperfections in the material or manufacturing process for each circuit. Some may fill or empty slightly slower than others. So the computer is designed with some tolerance. If the circuit is less than 1/3 full, we may treat it as "empty". If it is more than 2/3 full, we may treat it as "full". In between, the status is indeterminate.
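The tolerance rule can be written down in a few lines of Python. This is only a sketch of the idea: the function name read_bit is made up for illustration, and the 1/3 and 2/3 marks are simply the example values used above, not the thresholds of any particular chip.

```python
def read_bit(fill_fraction, low=1/3, high=2/3):
    """Interpret how full a circuit is (0.0 to 1.0) as a bit.

    Returns 0 below the low mark, 1 above the high mark, and None
    when the level falls in the indeterminate band in between.
    """
    if fill_fraction < low:
        return 0
    if fill_fraction > high:
        return 1
    return None


if __name__ == "__main__":
    for level in (0.05, 0.30, 0.50, 0.70, 0.98):
        print(level, read_bit(level))
```

A level of 0.50 returns None: the circuit was measured too early, or it is defective, and no reliable value can be read.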

Intel or AMD test every circuit in every chip they make. They apply a standard voltage, then fill or empty the circuits and measure them. Some chips will be nearly perfect and will operate at the highest speed. Other chips may have circuits that run a bit more slowly and take longer to fill or empty. They will be sold to run at a slower speed.

A conservative buyer will accept the vendor rating. Other users, looking for "extreme" performance, may try to squeeze more performance out of the chip by increasing the clock rate. Since there is a little slop factor left from the vendor testing, most processors will run at a 5 or 10% faster clock speed. More than that requires effort.

A faster clock doesn't really make anything in the CPU run faster. If you start to fill a circuit up with electrons, it will fill at whatever speed it operates without regard to the clock. What the clock does is indicate when the filling process ends and when to measure which circuits are full and which are empty. If every operation completes with time to spare, then you can safely speed up the clock and shorten the time between the start of the process and the point of measurement. Eventually you shorten the period so much that one circuit is not only still filling, but has not yet reached the 2/3 full point or whatever mark generates a reliable value. Then the system crashes.
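To make the relationship between fill time and clock rate concrete, here is a minimal Python sketch. It assumes each circuit has a single fixed "fill time" (the time it needs to reach a reliable level), which is a simplification. The function names and the fill times in the example are hypothetical, chosen only for illustration.

```python
def max_safe_clock_mhz(fill_times_ns):
    """Fastest clock (MHz) at which even the slowest circuit settles in time.

    The clock period must be at least as long as the slowest fill time.
    A period of 1 ns corresponds to 1000 MHz.
    """
    return 1000.0 / max(fill_times_ns)


def circuits_that_fail(fill_times_ns, clock_mhz):
    """Return the indexes of circuits still filling when they are measured."""
    period_ns = 1000.0 / clock_mhz
    return [i for i, t in enumerate(fill_times_ns) if t > period_ns]


if __name__ == "__main__":
    # Hypothetical fill times for four circuits, in nanoseconds.
    fill_times = [0.30, 0.32, 0.35, 0.40]
    print("limit (MHz):", max_safe_clock_mhz(fill_times))
    print("failures at 2400 MHz:", circuits_that_fail(fill_times, 2400))
    print("failures at 2600 MHz:", circuits_that_fail(fill_times, 2600))
```

In this example the vendor might rate the chip at 2400 MHz. Nothing fails at that speed, but at 2600 MHz the slowest circuit (index 3) is measured before it is ready.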

One solution is to increase the voltage by one notch (usually a quarter volt). In the analogy, voltage is like water pressure. Higher pressure means more flow, and everything fills and empties faster. Running a chip at a slightly higher voltage than recommended can compensate for running the clock at a slightly higher speed. However, it will also generate more heat.

Extreme performance fans crank the voltage and clock speed up to high values, but compensate with large, loud, expensive cooling solutions. Consider for example the Zalman water cooling system with a radiator that is bigger than most computers.

Balanced, Serial, Point to Point, One Way

At slow speed anything works, so the first computer designs used whatever options were simpler. Then you push the speed higher and higher till you hit a barrier. There are several ways to move data around in a computer at higher speed. Each solves a problem. Several new architectures combine several features to provide some improvement.

Voltage is a Difference

If you look out a window, you may see a bird sitting on top of a power line. Birds can do this because voltage isn't an absolute property but is always measured relative to something else. Electricity doesn't move through the bird because the bird isn't connected to the other power wire or to a ground. Occasionally a squirrel will touch two wires to complete the circuit.

All early computer interfaces started by assigning one wire to each bit of data or control signal. The voltage on all the signal wires would be measured relative to a single common ground wire. This works at low speed, over short distances.

The problem is using a single common ground to measure several different wires. There is some delay after the signal wires change before everything settles down and a reliable measurement can be made. Things are better if you have fewer signal wires for every ground wire. In a modern CPU, every fourth pin can be a ground connection. Still this type of structure seems to max out at around a 200 MHz clock signal.

Balanced

The solution to this problem has been known since the '60s, but was first applied to communication over long distances. Each signal is represented by two dedicated wires. To generate a signal, apply a small positive voltage to one wire and an equal negative voltage to the other wire. The receiver measures the difference between the two wires in the pair and determines which is positive and which is negative. Since the two wires have opposite signals, they exactly balance each other and produce 0 net voltage relative to any external reference point.

Balanced pairs also solve the problem of external interference. A long wire is also an antenna. Look in your AM radio and you may find that the antenna is just some ordinary electric wire run around the case. The longer the wire, the more outside signal gets picked up. Insulation blocks the flow of electricity, but radio waves pass right through it. The radio measures a signal induced on a single wire loop. However, when computers use a pair of balanced wires, any external source of interference produces exactly the same effect on each wire of the pair, and at the receiving end the two cancel out.
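The cancellation is easy to see in a small simulation. The Python sketch below is an idealized model that assumes interference adds exactly the same voltage to both wires of a pair. The function names, the half volt swing, and the noise values are all made up for illustration.

```python
def send_balanced(bits, swing=0.5):
    """Encode each bit as a (plus, minus) pair of equal and opposite voltages."""
    return [(swing, -swing) if b else (-swing, swing) for b in bits]


def add_interference(pairs, noise):
    """Add the same noise voltage to both wires of each pair."""
    return [(p + n, m + n) for (p, m), n in zip(pairs, noise)]


def receive_balanced(pairs):
    """Recover bits from the sign of the difference between the two wires."""
    return [1 if (p - m) > 0 else 0 for p, m in pairs]


def receive_single_ended(pairs):
    """For comparison: look only at the first wire, measured against ground."""
    return [1 if p > 0 else 0 for p, _ in pairs]


if __name__ == "__main__":
    bits = [1, 0, 1, 1, 0]
    noise = [0.9, -1.2, 0.4, -0.7, 2.0]  # larger than the signal itself
    noisy = add_interference(send_balanced(bits), noise)
    print("sent:        ", bits)
    print("balanced:    ", receive_balanced(noisy))
    print("single-ended:", receive_single_ended(noisy))
```

The balanced receiver recovers every bit even though the noise is bigger than the signal. The single wire receiver gets the last two bits wrong.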

Over short distances, like on a mainboard, it is sufficient to run the pair of wires next to each other. Any interference they generate or receive tends to cancel out when the pair is measured against each other. Over longer distances, such as a USB or Ethernet cable, the pair of wires is twisted round each other. This prevents either wire from being "closer" all the time to either a source or recipient of interference.

The bad news is that you need more wires or pins than in the older one-pin-per-signal design. If every fourth pin used to be a ground, the pin count increases by 50% to switch to a balanced signal (3 signals require 6 pins instead of 4). However, the clock speed on the pair of wires can be increased by such a large factor that all of the new balanced pair connections end up using far fewer wires in total.
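The pin arithmetic can be sketched in Python. The function names are illustrative, and the clock rates in the example (200 MHz for the shared ground design, 2500 MHz for a balanced pair) are assumptions picked to show the effect, not measurements of any real bus.

```python
def pins_single_ended(signals, signals_per_ground=3):
    """Pins needed when every group of signals shares one ground pin."""
    grounds = -(-signals // signals_per_ground)  # ceiling division
    return signals + grounds


def pins_balanced(signals):
    """Pins needed when each signal gets its own pair of wires."""
    return 2 * signals


def pairs_needed(total_mbits_per_sec, pair_mbits_per_sec):
    """Balanced pairs required to carry a given total data rate."""
    return -(-total_mbits_per_sec // pair_mbits_per_sec)  # ceiling division


if __name__ == "__main__":
    print(pins_single_ended(3), "pins vs", pins_balanced(3), "pins for 3 signals")

    # Assumed example: a 32 bit bus at 200 MHz carries 6400 Mbit/s.
    old_pins = pins_single_ended(32)
    new_pins = pins_balanced(pairs_needed(32 * 200, 2500))
    print(old_pins, "pins at 200 MHz vs", new_pins, "pins at 2500 MHz")
```

Signal for signal, the balanced design needs 6 pins where the old one needed 4. Once the faster clock is taken into account, the same data rate fits in 6 pins instead of 43.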

Serial (not Parallel)

If you send the same electric signal through any parallel set of wires, the electricity will move more slowly through some wires due to slight differences in the metal. The signals arrive at the other end with very small timing differences. This is called "skew". Like runners in the various lanes of a 100 meter dash, they all start at the same time and place, but they arrive at the finish line staggered by small differences in speed.

Start                                                       Finish
o------------------------------------------------------------->
o-------------------------------------------------------------->
o--------------------------------------------------------------->
o--------------------------------------------------------->
o---------------------------------------------------------------->
o-------------------------------------------------------------->
                                                     |<- Clock Pulse ->|
                                                      covers worst skew

Skew is less of a problem over short distances or low speeds. It gets worse when, as in the PCI bus, the wire is connected at points along the path to connectors on sockets into which adapter cards may or may not be plugged. Every time the signal hits a point where it is soldered to something, or where the signal splits in two directions, there will be some delay. These contacts must be manufactured for pennies, so it isn't feasible for them to be of uniform quality.

In addition to the signal wires, a parallel bus carries a clock pulse. All the data bits start out at the same time. The clock, however, cycles half way between adjacent data bits. The idea is that the clock signals the earliest point when the next bit can arrive on any wire, and the last moment when the slowest previous data bit can have arrived.
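The way skew limits the clock can be sketched in a few lines of Python. This is a simplified model that assumes each bit period must be at least as long as the worst skew plus some time to take the measurement. The function names, the arrival times, and the 1 nanosecond sampling window are hypothetical.

```python
def skew_ns(arrival_times_ns):
    """Difference between the slowest and fastest wire, in nanoseconds."""
    return max(arrival_times_ns) - min(arrival_times_ns)


def max_bus_clock_mhz(arrival_times_ns, sample_window_ns):
    """Upper bound on the bus clock under this simplified model.

    Each bit period must cover the worst skew plus the sampling window.
    A period of 1 ns corresponds to 1000 MHz.
    """
    period_ns = skew_ns(arrival_times_ns) + sample_window_ns
    if period_ns <= 0:
        raise ValueError("period must be positive")
    return 1000.0 / period_ns


if __name__ == "__main__":
    # Hypothetical arrival times for six wires, like the six lanes above.
    arrivals = [5.0, 5.2, 5.4, 4.6, 5.6, 5.2]
    print("skew (ns):", round(skew_ns(arrivals), 3))
    print("max clock (MHz):", round(max_bus_clock_mhz(arrivals, 1.0), 1))
```

Notice that the limit depends only on the wires, not on the chips at either end. Cutting the skew in half would raise the limit without any change to the CPU.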

The problems with a parallel bus are problems in physics, wire, and solder. You can't fix them with faster CPU chips. Eventually, each parallel bus in the computer is replaced by something better.

The alternative has been understood for as long as there have been Personal Computers. Instead of sending one bit of data down each wire in a parallel bus, send all of the data one bit at a time down a single pair of wires. If there are slight imperfections in the wire, they affect each bit equally. The bits arrive at the other end at the same speed they were sent. This is a Serial bus.
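Sending data "one bit at a time" looks like this in Python. It is a bare sketch of the idea only: the function names are made up, the choice to send the low order bit of each byte first is an assumption, and real serial links add framing and encoding on top of this.

```python
def serialize(data):
    """Yield the bits of a byte string one at a time, low order bit first."""
    for byte in data:
        for i in range(8):
            yield (byte >> i) & 1


def deserialize(bits):
    """Reassemble a stream of bits (low order bit first) into bytes."""
    out = bytearray()
    value = 0
    count = 0
    for bit in bits:
        value |= bit << count
        count += 1
        if count == 8:
            out.append(value)
            value = 0
            count = 0
    return bytes(out)


if __name__ == "__main__":
    stream = list(serialize(b"PC"))
    print(stream)
    print(deserialize(stream))
```

Because every bit travels down the same pair of wires, there is nothing for it to be skewed against. The price is that the chips at each end must run eight times faster to move a byte in the same time.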

The problem with a Serial bus is that it requires a very fast computer chip to generate and receive the signal. This, however, was a problem that solved itself as silicon computer chips got faster and cheaper. As chip technology improved, one by one each parallel bus in the PC has been replaced by a serial alternative.

Point to Point (not really a Bus)

In computer terms, a "bus" is a communication path shared by many devices. The memory bus is a sequence of 1 to 4 slots into which you can plug "DIMM" modules of memory. The PCI bus is a sequence of up to 5 slots into which you plug adapter cards. Even the IDE and SCSI disk connectors are bus cables that support more than one device.

However, each time a wire encounters a slot or connector there is an interface point that at a minimum increases skew and can also generate a "reflection" where the signal bounces off the interface and starts to travel back towards its source.

The solution is to redesign each connection to be "point to point" between two devices. In the newer disk technologies (SATA replacing old IDE, and SAS replacing the old SCSI) a dedicated cable with two pairs of wires connects each disk to a dedicated connector on the controller card. On many mainboards today, the fastest supported memory speed is only permitted when a single DIMM is plugged into the memory slot. If you want to plug two DIMMs of memory into the same bus, you have to drop performance to a lower clock speed.

The most advanced version of this design principle is provided by HyperTransport, a new high speed chip interconnect system used by AMD, IBM, and Apple. The CPU is connected to other chips on the mainboard using a system of point to point wires. To make the system work, each chip can receive a signal from wires on one side and relay the signal bit by bit on to the next chip.

One Way

The conventional design for memory, CPU, PCI, and other chip connection technology is to use the same pins to both transmit and receive data. This requires the chip to have both "transmit" and "receive" electronics connected to the same wire.

Other systems designate one pair of wires to transmit data from the chip, and a different pair of wires to receive data on the chip. Superficially such a design appears to either require twice as many wires to get the same speed or else to cut the amount of data that can be transferred in half. That would be true if the data were going in only one direction. In practice, however, data has to flow in both directions and this design allows each chip to transmit and receive data simultaneously. It also allows a slightly higher clock rate.
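A little arithmetic shows why the one way design costs less than it first appears. The Python sketch below compares the two designs using the same number of wire pairs at the same clock rate. It ignores the time a shared wire loses when it switches direction. The function names and the traffic figures are hypothetical.

```python
def time_shared(out_bits, in_bits, pairs, bits_per_sec_per_pair):
    """All pairs carry either direction, but only one direction at a time."""
    return (out_bits + in_bits) / (pairs * bits_per_sec_per_pair)


def time_one_way(out_bits, in_bits, pairs, bits_per_sec_per_pair):
    """Half the pairs transmit while the other half receive, simultaneously."""
    half = pairs // 2
    return max(out_bits, in_bits) / (half * bits_per_sec_per_pair)


if __name__ == "__main__":
    rate = 1_000_000_000  # assumed: 1 Gbit/s on each pair
    # Traffic in one direction only: the one way design takes twice as long.
    print(time_shared(8_000_000, 0, 4, rate), time_one_way(8_000_000, 0, 4, rate))
    # Equal traffic in both directions: the two designs tie.
    print(time_shared(8_000_000, 8_000_000, 4, rate),
          time_one_way(8_000_000, 8_000_000, 4, rate))
```

With balanced traffic the two designs finish in the same time, so any gain in clock rate from the simpler one way electronics is pure profit.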

Summary

Typically no system requires all of these features at once. Almost all new technologies use balanced pairs of wires. Otherwise, Intel chose a serial design (not parallel) for PCI Express, while AMD, IBM, and Apple favor HyperTransport, which is a parallel point to point (not serial, not bus) system.

The following table shows some PC serial connections that have been or will be replacing older parallel connections:

Serial          Replaces                  Timeframe

USB printer     Parallel Printer cable    2000
Ethernet        [nothing]                 1990s
Firewire        [nothing]                 2000
Serial ATA      80 wire ATA cable         2003
Serial SCSI     various SCSI cables       2004
PCI Express     PCI bus                   2004
ExpressCard     PCCard/Cardbus            2005

Copyright 1998, 2005 PCLT -- Introduction to PC Hardware -- H. Gilbert