Power system users don't always have a clear idea of power system configurations that provide fault tolerance. As a result, users may specify a configuration beyond what's needed or intended, which increases system complexity and cost.
Fault tolerance averts host system down time due to a power system failure. The cost of down time is now too high and too complex to quantify, so fault tolerance needs no more justification for the extra expense that redundancy imposes. Therefore, fault tolerance has become a standard feature in most telecom power systems, and information technology systems — especially for Internet hosting.
Redundant Power Systems
Often called an N+1 redundant configuration, this fault tolerant power system consists of two or more paralleled power modules (SMPS or dc-dc converters) targeted to provide redundant operation. This redundancy requires that a failure of any power system module will not diminish the system's ability to supply full load performance for the host system. Accordingly, if one of the power units fails, a redundant power system must provide the host system with adequate capability to perform to its fullest extent without degradation. Further, the failure of a power module should not cause any disturbance in the dc bus beyond what it can tolerate. Therefore, unless notified (by an electronic signal) that a module has failed, the host system should not detect the event.
However, when one module in a redundant system fails abruptly, it's virtually impossible for the common dc bus to remain free of any disturbance, or “glitch.” Any momentary drop or rise in the dc bus voltage must be small enough and brief enough that it does not upset the host system's performance.
The Figure, on page 69, shows a 1+1 redundant power module system. Here, the host system (the load) needs 1200W for full operation, and the chosen configuration is two paralleled 1200W supplies. Each of these modules must be capable of providing the full 1200W to the host, so that any one-module failure leaves the host system fully energized by the remaining module. If the host system requires 2400W, the power system needs a third paralleled 1200W module. This 2+1 configuration allows any two modules to supply the load.
The cost of configuring the above 1+1 system is more than double that of a single 1200W power supply. The overall cost of this system includes the second module plus the isolation or “ORing” diodes, and a mechanical enclosure that houses the two modules. Photo 1, above, shows a 1200W redundant system made of three 600W modules. Note the added complexity compared with a system containing just one power supply module.
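The module counts in the configurations described above follow a simple rule: find N, the number of modules needed to carry the full load, then add one. The short Python sketch below illustrates that rule. The function names are hypothetical, and the sketch assumes identical modules that share a common dc bus.

```python
import math

def modules_required(load_w, module_w, spares=1):
    """Number of paralleled modules for an N+spares configuration."""
    if load_w <= 0 or module_w <= 0:
        raise ValueError("load and module rating must be positive")
    n = math.ceil(load_w / module_w)  # N: modules needed to carry the full load
    return n + spares                 # plus the redundant module(s)

def survives_single_failure(load_w, module_w, installed):
    """True if the remaining modules can still supply the full load."""
    return (installed - 1) * module_w >= load_w

# The three cases mentioned in the text: 1+1, 2+1 with 1200W modules,
# and a 1200W system built from 600W modules.
for load, rating in [(1200, 1200), (2400, 1200), (1200, 600)]:
    total = modules_required(load, rating)
    ok = survives_single_failure(load, rating, total)
    print(f"{load}W load, {rating}W modules: {total} modules, "
          f"tolerates one failure: {ok}")
```

The check in survives_single_failure is the redundancy requirement itself: with any one module removed, the remaining capacity must still cover the full load.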
Inclusion of the ORing diodes in series with the output is critical to the N+1 configuration. They're not optional if the objective of the paralleling is redundancy. These diodes add cost, voltage drop, and heat, all of which are necessary penalties. The diodes guarantee that no matter what type of failure occurs in one of the power modules, it won't drag the dc bus down. Such a failure can be any component failing open or shorted (on the primary or secondary side), including shorted output capacitors in the module. No truly redundant power system can be without ORing diodes.
Schottky diodes are advisable for the ORing function because they have a lower voltage drop than conventional power diodes. However, even this drop may be too high for a 5V or 3.3V output with current over 10A. In that case, use a FET with low on-resistance instead. A 2.5mΩ FET drops only 250mV at 100A, in contrast with 750mV for a Schottky diode. The power saving is 50W, or 10% of the output power of a 5V, 100A supply, and that much extra heat would otherwise be a potential problem in high-density packages.
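The drop and dissipation figures for the ORing device follow directly from Ohm's law. The sketch below works through the arithmetic, assuming a 5V, 100A output and the 750mV Schottky drop quoted above; the function names are hypothetical. Note that a 250mV drop at 100A corresponds to an effective on-resistance of 2.5mΩ.

```python
def fet_drop_v(r_on_ohm, i_a):
    """Voltage drop across a conducting FET: V = I * R."""
    return r_on_ohm * i_a

def dissipation_w(drop_v, i_a):
    """Power dissipated in the ORing device: P = V * I."""
    return drop_v * i_a

def r_on_for_drop(drop_v, i_a):
    """Effective on-resistance needed to hold a given drop at a given current."""
    return drop_v / i_a

V_OUT = 5.0            # output voltage (assumed)
I_OUT = 100.0          # output current, from the text
SCHOTTKY_DROP = 0.75   # Schottky forward drop, from the text
R_EFFECTIVE = 0.0025   # on-resistance that gives 250mV at 100A

fet_drop = fet_drop_v(R_EFFECTIVE, I_OUT)
saving_w = dissipation_w(SCHOTTKY_DROP, I_OUT) - dissipation_w(fet_drop, I_OUT)
saving_fraction = saving_w / (V_OUT * I_OUT)

print(f"FET drop: {fet_drop * 1000:.0f} mV")
print(f"Saving:   {saving_w:.0f} W ({saving_fraction:.0%} of output power)")
```

Because the FET drop scales with current while its dissipation scales with current squared, it's worth rerunning this arithmetic at the actual maximum load rather than at a nominal figure.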
Use of an ORing FET can be tricky, due to its ability to conduct in both directions — especially if the module employs synchronous rectification. Here, the functioning modules in the N+1 system can pump current into the failed power module as long as the ORing FET permits such reverse current flow. The synchronous rectifier in this mode acts as a booster to pump energy from the common output dc bus to the primary side of the failed module. As a result, the extra current drain on the remaining redundant system modules may cause sagging or overheating.
You must ensure that the ORing FET receives no gate voltage once its module has failed, because any remaining gate drive would let it conduct in reverse. To avoid this problem, use an arrangement that produces the gate voltage for the ORing FET from a dedicated secondary of the power transformer. If the module fails, shuts down, or is willfully inhibited, then the gate can't receive any drive and the FET will be turned off.
Related to system fault tolerance is hot swap, which refers to a power conversion system in which you can manually insert and extract power supply modules while the system is energized. In this case the “hot” implies “under power,” but it does not mean that the module inserted or extracted has to be in the ON mode during the process. Actually, most power supply manufacturers require that the ON/OFF switch in that module be in the OFF mode during insertion or extraction. You should turn the switch on only after full engagement of the connectors at insertion. You should turn it off before attempting to extract the module. This avoids sparking of the connector pins at the moment of contact or separation.
Hot swap systems are not limited to a specific number of modules. The concept is mechanical in nature and does not by itself imply redundancy, although most redundant power systems are by necessity hot swap in design. In hot swap, the attendant can extract a faulty module and swap it with a good one while the system is running, without interrupting the host system.
Hot swap configurations require power supply modules to have special input/output connectors. Thus, the hot swap enclosure needs mating connectors to accept the module and firmly mate with it. You may need one, two, or more connectors to facilitate this connection of all inputs and outputs. The connectors must have some mechanical “float” for alignment into the mating part. The “float” should be adequate for all the possible mechanical tolerances that exist in the module and enclosure. Photo 2 shows a hot swap system with four power modules.
In today's telecom and data processing systems, the current drawn from low-voltage outputs such as 5V and 3.3V (or lower) ranges from 30A to a few hundred amps. Therefore, the connector used in the hot swap system must be rated properly and conservatively. Hot swap power systems are usually equipped with mechanical guiding mechanisms that permit blind insertion of the module into its slot, and smooth travel to the full engagement position. Once engaged, captive screws or latches should secure the module in place. No hot swap module should rely only on the connector's mating force to maintain engagement. Front panel mechanical fasteners must secure it in place and guarantee engagement to the fullest extent. Often, the captive hardware requires no tools to fasten the module's front panel to the frame of the hot swap system.
Finally, to facilitate easy extraction, it's wise to include handles in the front panel of the module. These handles in some cases are rather elaborate, containing latches or microswitches.
As a minimum, the power supply module's front panel in a hot swap system should contain an ON/OFF switch (or circuit breaker) and an “output good” lamp indicator (or LED). Other alarms or indicators, such as overload, overtemperature, and overvoltage, should also exist. On occasion, there can also be an audible alarm or LED display. These features permit the user to recognize a faulty module at a glance.
Because these hot swap systems are usually left unattended, it's important that any fault within the power system be reported immediately via remote communication. Therefore, each module should provide a “power good” signal (TTL compatible, open collector, relay contact, or any other signaling method) indicating its status. Transferred via RS232 or GPIB to the host system that monitors activity, the signal remotely alerts the host to the failure, identifies the failed module, and triggers an immediate corrective action sequence.
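The decision the monitoring host has to make from these signals can be sketched in a few lines of Python. This is only an illustration: the module names, the function name, and the dictionary of “power good” states are assumptions, and reading the signals over RS232 or GPIB is deliberately left out.

```python
def assess(power_good, modules_needed):
    """Summarize system status from per-module 'power good' flags.

    power_good: mapping of module name -> True (good) or False (failed).
    modules_needed: N, the number of modules required to carry the full load.
    """
    failed = sorted(name for name, ok in power_good.items() if not ok)
    healthy = len(power_good) - len(failed)
    return {
        "failed": failed,                               # which modules to swap
        "load_supported": healthy >= modules_needed,    # host still fully powered
        "redundancy_intact": healthy > modules_needed,  # a spare is still available
    }

# Hypothetical 2+1 system in which module B has reported a fault.
status = assess({"A": True, "B": False, "C": True}, modules_needed=2)
print(status)
```

In this example the host is still fully powered, but the system is no longer redundant, which is exactly the condition that calls for an immediate module swap.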
Sometimes the user requests an “inhibit” feature that makes it possible to turn any module within the hot swap system ON or OFF. This inhibit signal is useful for testing the system for redundancy by inhibiting one module at a time and for turning the entire system ON and OFF. As an option, hot swap systems often provide an “enable” pin in the connector. This pin is short and therefore engages last — i.e., after the power pins and the output pins make contact. When the enable pin touches its mating pin, it contacts ground, enabling the module, and turning it ON. This prevents sparking of power contacts when the unit is inserted or extracted, because it guarantees that the unit is OFF at insertion. Because the enable pin will lose contact before the power pins, it also guarantees that the unit is turned off prior to power pins disengaging during extraction.
Use of various length pins in hot swap systems is common and useful to facilitate proper sequence engagement and to avoid sparks at the time of contact or disconnect. Most important is that the safety ground pin of the module is the longest, thus engaging first and disengaging last.
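The effect of staggered pin lengths can be illustrated with a small Python sketch: the longest pin mates first and separates last. The pin names and lengths below are hypothetical values chosen only to show the ordering, not dimensions of any real connector.

```python
# Hypothetical pin lengths in millimeters; only their relative order matters.
PIN_LENGTHS_MM = {
    "safety_ground": 8.0,  # longest: first to mate, last to break
    "input_power": 6.5,
    "dc_output": 6.5,
    "enable": 5.0,         # shortest: last to mate, first to break
}

def mating_order(pins):
    """Order in which pins make contact during insertion (longest first)."""
    return sorted(pins, key=pins.get, reverse=True)

def unmating_order(pins):
    """Order in which pins lose contact during extraction (shortest first)."""
    return sorted(pins, key=pins.get)

print("Insertion: ", " -> ".join(mating_order(PIN_LENGTHS_MM)))
print("Extraction:", " -> ".join(unmating_order(PIN_LENGTHS_MM)))
```

With these lengths, the module is grounded before anything else connects, the power and output pins are seated before the enable pin turns the module ON, and the sequence reverses on extraction.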
Users can confuse paralleled and redundant power supply configurations. Remember that all redundant power systems include power modules in parallel with series ORing diodes in the output. If one power module fails, the rest will fully supply the host system.
You may parallel power supply modules to increase output current and power. For example, a 5V, 240A output may come from one power supply, but it can also come from two modules of 120A each, or three of 80A each. Paralleling power supplies therefore does not automatically imply redundancy: there may be no ORing diodes, and the loss of any one module in the paralleled combination may drag the system down.
Power system users may employ multiple small power units in parallel instead of one more powerful model because of mechanical, electrical, or cost considerations. One large power supply may not fit in the host system enclosure, while multiple smaller ones may fit. Further, units in parallel may provide a degree of redundancy, although not as absolute and definitive as in redundant systems. Depending on the mode of failure, a failed module in the string of parallel modules will not necessarily drag the dc bus down. For example, if the failure is on the primary side, the input fuse may blow. The unit is then dead in place, but it will not drag the others down. That way the system provides a limited degree of redundancy.
Further, it's common to parallel several encapsulated dc-dc converter modules to get the necessary power. These devices are small and fit on the pc card, whereas room may not be available for a larger unit.
The total cost of multiple modules may sometimes be lower than that of one module, because of a quantity price break or standardization. The big module may be expensive and available from only one or two manufacturers. On the other hand, smaller power modules may be available from many sources, or may be produced by the user company itself as a building block for all of its products.
Regardless of which module is chosen for the parallel combination, one thing is crucial: the degree of current sharing between the modules. In the ideal situation, there's perfect sharing and all modules contribute equally to the load. On the other hand, if poor sharing prevails, some modules may conduct much more than others.
Adjusting individual power supplies to exactly the same output voltage should, in theory, cause them to share current equally when paralleled. However, the output impedance of each power supply is extremely low, and even a small difference in output voltage between the parallel modules will cause the one that is a few millivolts higher to hog all the current. The lower the output voltage of the module, the more severe this problem. Therefore, when paralleling 5V and 3.3V high current modules, the chance for a reasonable degree of natural sharing is minimal. Natural sharing between 28V and 48V units, however, is feasible, so active current sharing is not mandatory for them.
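A simple model shows why a few millivolts matter. The Python sketch below treats each supply as an ideal voltage source in series with a small output resistance, feeding a constant-current load. The 1mΩ resistance, the set points, and the function name are illustrative assumptions, and the model assumes a supply cannot sink current (as when ORing diodes are present).

```python
def natural_share(v1, v2, r1, r2, i_load):
    """Currents from two paralleled supplies with no sharing circuitry.

    Each supply is modeled as a voltage source (v) in series with an
    output resistance (r). A supply is assumed unable to sink current.
    """
    g1, g2 = 1.0 / r1, 1.0 / r2
    # Node equation at the common bus: (v1 - vb)*g1 + (v2 - vb)*g2 = i_load
    v_bus = (v1 * g1 + v2 * g2 - i_load) / (g1 + g2)
    i1 = (v1 - v_bus) * g1
    i2 = (v2 - v_bus) * g2
    if i1 < 0:               # supply 1 is pushed out of conduction
        return 0.0, i_load
    if i2 < 0:               # supply 2 is pushed out of conduction
        return i_load, 0.0
    return i1, i2

R_OUT = 0.001  # 1 milliohm per supply (assumed)
for delta_mv in (0, 10, 50):
    i1, i2 = natural_share(5.0 + delta_mv / 1000.0, 5.0, R_OUT, R_OUT, 40.0)
    print(f"{delta_mv:2d} mV mismatch: {i1:.1f} A / {i2:.1f} A")
```

With equal output resistances the imbalance is the voltage difference divided by the sum of the two resistances, so in this model a 10mV mismatch shifts 5A from one supply to the other, and a 50mV mismatch leaves one supply carrying the entire 40A load.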
Also, current sharing is enhanced when using ORing diodes, because their forward drop rises with current and so tends to offset small voltage differences between the power modules.
Further enhancement of the sharing is possible using the “droop method,” which makes the load regulation of each module a little sloppy by purposely reducing the loop gain. This causes the output voltage to drop (more than usual) with increased output current, thus the name: “droop method.” When paralleling two or more modules with the droop feature, the one that tends to supply high current will realize a higher drop in its output voltage. As a result, the other modules with lower share and higher output will tend to pump more current to the load. A certain equilibrium point is then reached where a high degree of sharing (say 60:40 or 70:30) can be realized. Table 1 shows current sharing between two power supplies that have slightly different output voltages. Table 2 lists the current sharing characteristics with the power supplies adjusted to the same voltage before they were paralleled.
Note that in the extreme case, as any module reaches its capacity limit, its current limit circuitry activates, causing its output voltage to drop and increase current flow from the other modules to the load.
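The droop behavior and the current-limit case can both be explored with a small numerical model. In the Python sketch below, each module's output falls linearly with its current (the droop slope) until the module reaches its current limit, after which it supplies a constant current. The set points, slopes, limits, and function names are illustrative assumptions, not measured data, and the load is treated as a constant current.

```python
def module_current(v_set, slope, i_limit, v_bus):
    """Current a module delivers at a given bus voltage.

    v_set: no-load set point (V); slope: droop in V per A;
    i_limit: current limit (A). A module cannot sink current.
    """
    i = (v_set - v_bus) / slope
    return min(max(i, 0.0), i_limit)

def solve_sharing(modules, i_load, iterations=60):
    """Find the bus voltage at which the module currents sum to i_load."""
    if i_load >= sum(m[2] for m in modules):
        raise ValueError("load exceeds the combined current limit")
    lo, hi = 0.0, max(m[0] for m in modules)
    for _ in range(iterations):      # bisection on the bus voltage
        mid = (lo + hi) / 2.0
        total = sum(module_current(*m, mid) for m in modules)
        if total > i_load:
            lo = mid                 # bus too low: modules over-supply
        else:
            hi = mid
    v_bus = (lo + hi) / 2.0
    return v_bus, [module_current(*m, v_bus) for m in modules]

# Two 5V modules set 20mV apart, 60A limit each, 80A load.
for slope in (0.002, 0.0002):        # with droop vs. a much stiffer output
    mods = [(5.02, slope, 60.0), (5.00, slope, 60.0)]
    v_bus, currents = solve_sharing(mods, 80.0)
    print(f"slope {slope * 1000:.1f} mV/A: bus {v_bus:.3f} V, "
          f"currents {currents[0]:.1f} A / {currents[1]:.1f} A")
```

In this model the 2mV/A droop yields a 45A/35A split, while the stiffer 0.2mV/A output drives the higher module into its 60A limit and leaves the other to supply the remaining 20A.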
Without active current sharing, the behavior of two paralleled SMPS is unpredictable. Sharing then depends on component choice, temperature drift, and how closely the cable drops from each module to the load are matched, which makes “natural sharing” undesirable. A better situation arises when active, high-precision circuitry within the modules takes responsibility for equalizing the current supplied by the modules in parallel or in redundancy. This is the concept of “active current sharing.”
Active Current Sharing
You can add a relatively simple circuit in each module to sense that module's output current and to equalize it to the currents provided by the other modules. The circuit may be built from discrete components or around a dedicated IC such as the UC3907. In both arrangements, a single wire connecting the modules' “current share” pins is sufficient to provide a high degree of current sharing.
To a large extent, active current sharing guarantees current equalization between the power supply modules. Holding the module currents to within 10% of each other is feasible and practical.
Current sharing circuits work by slightly raising or lowering a module's output voltage, causing that module to supply more or less current into the common dc bus. If the circuit lowers a module's output voltage, that module supplies less current and the other modules supply more. If it raises the module's output voltage, that module's output current increases, and that of the others in parallel drops.
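This adjustment process can be sketched as a simple iteration. The Python model below is a sketch under stated assumptions: the share bus is taken to carry the average module current, each module nudges its own set point toward that average within a limited trim range, and all modules have the same output resistance. Dedicated ICs may implement sharing differently, and every name and value here is hypothetical.

```python
def currents(v_set, r_out, i_load):
    """Module currents for equal output resistances and a constant-current load."""
    n = len(v_set)
    v_mean = sum(v_set) / n
    return [i_load / n + (v - v_mean) / r_out for v in v_set]

def active_share(v_nominal, r_out, i_load, trim_range, steps=50):
    """Iteratively trim each module's set point toward equal current."""
    gain = 0.5 * r_out                 # halves the current error on each step
    trims = [0.0] * len(v_nominal)
    for _ in range(steps):
        i = currents([v + t for v, t in zip(v_nominal, trims)], r_out, i_load)
        share_bus = sum(i) / len(i)    # assumed: bus carries the average current
        trims = [
            # Raise the set point if below average, lower it if above,
            # but never beyond the allowed correction range.
            max(-trim_range, min(trim_range, t + gain * (share_bus - ik)))
            for t, ik in zip(trims, i)
        ]
    final = currents([v + t for v, t in zip(v_nominal, trims)], r_out, i_load)
    return final, trims

# Two 28V modules adjusted 50mV apart, 10 milliohm outputs, 40A load.
for trim_range in (0.14, 0.01):
    i, trims = active_share([28.05, 28.00], 0.01, 40.0, trim_range)
    print(f"trim range +/-{trim_range * 1000:.0f} mV: "
          f"{i[0]:.2f} A / {i[1]:.2f} A")
```

With enough trim range the two currents converge to 20A each. When the range is smaller than the initial mismatch requires, the trims saturate and a residual imbalance remains (21.5A and 18.5A in this example).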
It's common to include active current sharing between single output modules in the hot swap or redundant architecture, and in theory, there is no limit to how many units can be paralleled. Photo 3, on page 71, shows a laboratory setup where 20 power supplies of 600W each (containing active sharing circuitry) were paralleled in a non-hot swap fashion to form a 12kW, 28V power system at a small fraction of the cost of one 12kW power supply. This makes for an inexpensive burn-in system that can grow as the need for power grows.
Active current sharing facilitates good paralleling of like units, regardless of whether they have ORing diodes. The major benefit of this feature is that the modules in the paralleled string are equally stressed, with none sitting idle. This improves overall reliability and transient response. The failure of one module in a two-module redundant system creates more disturbance on the common dc bus if that module was carrying all of the current before it failed, which is possible without active current sharing, than if it was carrying only 50% of the load.
Active current sharing imposes some prerequisites that you must satisfy. First, each module must be “informed” of the current level of the other modules in the paralleled string. This requires the connection of one or two wires between all the modules in the string. Therefore, you must dedicate a pin in the I/O connector to this function, and connect these pins in parallel between all the participating modules. The connection, whether by wire or pc board trace, can pick up noise, so keep it short and away from the power bundle. As a recommendation, use a twisted pair, one wire from the current share pin and one from the logic ground, and chain these wires between all the modules. Twisted pair (or a shielded wire) helps reduce the noise pickup.
Also, the active current share circuit should not have a wide correction range. If it does, the modules will oscillate on and off. For example, the modules may have been adjusted individually to a certain output voltage, but the adjustment is not absolutely equal, and there is a voltage deviation from module-to-module. Another consideration concerns the modules in the paralleled string that don't have an equal cable length from their output to the load (where the sense wires are connected). For the current share circuit to work properly, the difference in adjustment of the individual modules should be no more than 0.5% and the length of wires from each module to the load must be equalized. Otherwise, the current share circuit cannot function properly.
The above considerations pertaining to the functional range of the current share circuit imply that the user of the power system should refrain from tampering with the module adjustments. Once one module has been adjusted, the other modules should be individually adjusted under the same conditions; otherwise, the current share circuit loses the range in which it can correct. For this reason, many manufacturers of redundant power systems don't make the module adjustment easy for the user to access.
Active current sharing is not restricted to modules with only one output. In fact, a module with several outputs in parallel with other like modules can have current sharing for each output. Because of the added complexity and cost, however, it makes sense to reserve active current sharing for the high current outputs. For instance, a module that produces +5V at 40A, +3.3V at 30A, +12V at 10A, and -12V at 1A should have active current sharing only for the +5V, +3.3V, and +12V outputs. The low current -12V output does not justify active current sharing and can be without it.