
Article information

  • Title: HP 9000 Series 700 input/output subsystem - Technical
  • Author: Daniel Li
  • Journal: Hewlett-Packard Journal
  • Print ISSN: 0018-1153
  • Year of publication: 1992
  • Volume: August 1992
  • Publisher: Hewlett-Packard Co.

HP 9000 Series 700 input/output subsystem - Technical

Daniel Li

Integrated on a single 8.5-by-11-inch I/O board is hardware support for the SCSI, the Centronics parallel printer interface, two RS-232 ports, the IEEE 802.3 LAN, the HP-HIL, four audio tone generators, and a real-time clock. An application-specific IC serves as I/O subsystem controller.

In today's environment of ever-increasing CPU performance, it is critical that I/O subsystem performance keep up with CPU performance. If I/O subsystem performance cannot keep up with CPU performance, the system will become I/O bound and will not benefit from increased CPU performance. The goal in designing the I/O system for the HP 9000 Series 700 workstations was to design a balanced high-performance system with many built-in features and yet still keep the system cost low.

To increase performance, the core I/O subsystem is attached directly to the high-bandwidth, pipelined system bus. The high-speed I/O devices, such as the SCSI (Small Computer System Interface) and the Ethernet local area network (LAN), perform DMA (direct memory access) data transfers to and from the system memory with very low latency. This not only greatly reduces the chance for LAN and SCSI controllers to overrun their internal buffers, but also minimizes the use of available system bus bandwidth and frees bandwidth for graphic devices and I/O expansion slots.

A low-cost CMOS ASIC (application-specific integrated circuit) chip called the I/O controller implements the logic that controls the interface between the I/O subsystem and the system bus, thereby minimizing the need for interface logic.

I/O System Features

In the HP 9000 Series 700 workstations, a set of I/O functionality is integrated on an 8.5-by-11-inch system I/O board. The following is a list of the built-in I/O system features:

* SCSI with DMA scatter/gather capability

* Parallel interface with DMA capability (bidirectional with HP ScanJet support)

* Two high-performance, asynchronous RS-232 ports

* Ethernet LAN with DMA capability

* HP Human Interface Loop (HP-HIL)

* Four audio tone generators (internal and external capabilities)

* Two 128K x 8-bit ROMs containing self-test, boot, console handler, CPU, and I/O firmware code

* Real-time clock with lithium battery backup

* 8K x 8-bit EEPROM nonvolatile memory.

I/O Subsystem Overview

Fig. 1 is a block diagram of the I/O subsystem. Inside the I/O subsystem there are two buses: a local data bus and an address bus. All functional blocks are located between these buses and use some portion of the local data bus. Depending on its specific requirements, a given block may or may not use some portion of the address bus.

The core I/O subsystem is attached to the system bus. Data communication passes through a 32-bit bidirectional tristate register. I/O addresses pass through a 30-bit bidirectional tristate register, making the I/O subsystem capable of performing master DMA operations.

If necessary, byte addressing during DMA operations is done to get into word alignment or to finish a transfer that does not end on a word boundary. The I/O controller chip automatically handles word alignment.
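The alignment rule above can be sketched as a small calculation. This is only an illustration, assuming a 4-byte word; the function name `split_transfer` is hypothetical, and the sketch says nothing about how the I/O controller chip actually sequences the cycles.

```python
WORD = 4  # bytes per 32-bit word (assumed word size for this sketch)


def split_transfer(start, length):
    """Split a byte range into (leading bytes, whole words, trailing bytes)."""
    if length < 0:
        raise ValueError("length must be non-negative")
    # Bytes needed to reach the next word boundary, capped by the transfer length.
    lead = min((-start) % WORD, length)
    remaining = length - lead
    words = remaining // WORD   # transfers done a full word at a time
    tail = remaining % WORD     # bytes left over after the last whole word
    return lead, words, tail


if __name__ == "__main__":
    # An 11-byte transfer starting two bytes past a word boundary.
    print(split_transfer(0x1002, 11))
```

A transfer that starts and ends on word boundaries yields no leading or trailing bytes, so it runs entirely as word transfers.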

A set of bidirectional tristate buffers (74245s) attached to the local data bus and to the system bus interface data registers makes word assembling and data byte disassembling possible.

Bus Controller Interface

Except for the interrupt request signal, the signals needed by the core I/O subsystem to interface to the memory and system bus controller chip are all defined in the system bus specification. The interrupt is asserted by the I/O controller chip on behalf of the devices inside the core I/O subsystem. The interrupt is deasserted after a CPU read to the I/O controller chip's interrupt register.

Direct Memory Access (DMA)

There are three bus masters that use DMA on the I/O subsystem board: the LAN, the parallel interface, and the SCSI. Inside the NCR53C700 SCSI controller, there is a bus master DMA device which is capable of moving data between disk and system memory at the rate of 27.7 Mbytes/s. This assumes the NCR chip is running at 33 MHz, has a burst size of 24 bytes, and has an arbitration overhead of 13 system bus clock cycles. The Intel 82596 LAN controller also has a built-in high-performance DMA controller. Inside the I/O controller chip, a 32-byte FIFO register and a DMA channel support bidirectional parallel printer interface applications, such as the HP ScanJet.

The I/O controller chip uses the system bus request and bus grant signals on the system bus to gain access to the bus for each device. If multiple bus requests occur simultaneously, the I/O controller chip will arbitrate access for one bus cycle, allowing each requesting device to master the bus according to its priority.

SCSI

The SCSI (Small Computer System Interface) is a system-level interface bus used to connect disk drives, tape drives, and other I/O devices to a computer system. Numerous workstations today support this bus standard, and it is becoming the de facto disk interface standard.

The Series 700 I/O subsystem will support the SCSI II specification. Currently, it supports the 8-bit data bus, running single-ended at 5 Mbytes/s.

The NCR53C700 intelligent SCSI controller chip is used in the core I/O subsystem. On the host bus side, it has an onchip 32-bit DMA engine and a "script processor," which fetches its own commands and performs SCSI transactions with minimal host processor intervention. A small portion of the I/O controller chip is used to implement the logic that controls the interface between the NCR53C700 and the system bus.

A concern about the SCSI compared to device-level interfaces is the amount of latency the SCSI control logic adds to the total subsystem overhead. The Series 700 SCSI subsystem is designed to keep this overhead to a minimum. The script processor inside the NCR53C700 minimizes SCSI I/O start latency; it takes only 500 ns to begin, compared to 2 to 8 ms for a traditional SCSI controller. The NCR53C700 can make decisions based on phase changes on the SCSI bus and compare specific data values. This minimizes the number of interrupts to the processor, which may take more than several hundred microseconds to execute and can be a large source of performance loss.

In the HP-UX* operating system, a disk data transfer may be broken up and data buffers may be scattered throughout system memory. The latency to reinstruct the DMA operation can result in a missed disk revolution. The performance degradation resulting from this data scattering is minimized by the scatter/gather feature of the NCR53C700 chip.

The combination of the fast system bus transfer rate, the fast SCSI bus transfer rate, and the efficient architecture of the NCR53C700 chip enables us to achieve a high disk I/O transfer rate without the need to place a large private buffer between the SCSI controller chip and the system bus interface. This not only lowers the system cost but also avoids the complexity and latency of managing the buffer, thus maximizing the disk I/O throughput.

Local Area Network (LAN)

The Series 700 workstations implement a built-in LAN that conforms to the IEEE 802.3/Ethernet standard. The LAN circuitry consists of the Intel 82596DX-82C501AD chip set, plus a transceiver chip and associated circuitry. The Intel 82596DX is an intelligent, high-performance 32-bit LAN controller. The 82C501AD device provides the electrical interface to the transceiver cable (AUI or built-in Cheapernet MAU), generates a 10-MHz transmit clock for the LAN controller, and performs Manchester encoding and decoding of the transmitted and received frames.

The 82596DX has large on-chip FIFO buffers, 128 bytes for receive and 64 for transmit. It also provides a four-channel DMA controller to communicate directly with the system memory via the high-performance system bus interface. The low memory access latency and the large on-chip FIFO practically eliminate overrun and underrun without using an external FIFO or dedicated packet buffer memory.

The 82596DX bus interface is optimized for the Intel386 microprocessor bus. The similarity between the system bus and the Intel386 bus made the control circuitry to interface the two extremely simple; for the LAN-specific portion of the I/O controller chip, the total gate count is less than 100. Unlike some older-generation controllers, the 82596DX system clock rate is asynchronous to the 10-MHz Ethernet clock so that the controller, the I/O controller chip, and the system bus all run at the same frequency. For DMA operations, the I/O controller chip arbitrates system bus access on the 82596DX's behalf, manages the address valid/ready handshake, and controls the address and data buffers. The I/O controller chip also controls access to the 82596DX's CPU port.

The four-channel DMA controller manages memory structures automatically using command chaining and bidirectional data chaining. This allows autonomous block data transfers and greatly reduces the CPU overhead. The four channels are: CU (transmit header), TXD (transmit data), RU (receive header), and RXD (receive data).

A permanent copy of the LAN station nodal address is kept in location 0 of the nonvolatile RAM (EEPROM). The onboard processor dependent hardware status register has three bits for LAN connector status: one for MAU power/fuse and two for ThinLan/ThickLan selection; it also has three bits for the SPU ID.

The hardware design is IEEE 802.3 compliant and Ethernet revision 2 compatible. A removable jumper selects either the AUI or BNC connectors. The LAN heartbeat is indicated by a front-panel LED to facilitate diagnostics.

Bidirectional Printer Interface

A majority of computers communicate with output devices such as printers and plotters via the Centronics parallel interface. Centronics is an 8-bit parallel, synchronous interface with additional control signals from the host computer and status signals from the peripheral.

The Centronics interface was defined many years ago. With the passage of time, the demand for higher transfer rates motivated some companies to modify the Centronics standard so that it transfers data faster in a stream mode without the overhead of a shared handshake for each data byte transferred. Some products' Centronics interfaces do not have overlapping strobe and busy handshake signals. Yet another variation of the Centronics interface is the bidirectional HP ScanJet interface. In the output mode, this interface is compatible with the Centronics standard. However, in the input mode, the peripheral becomes the master and the busy and strobe signals change direction.

The Series 700 parallel interface is designed to support all these variations of the Centronics interface. Additional functionality is implemented by special logic built into the I/O controller chip, and the Western Digital WD16C552 chip is used at the parallel port as the device driver/receiver interface.

The Series 700 I/O subsystem supports automatic hardware handshaking at the parallel interface, which removes the burden from software and improves performance. However, the software is capable of controlling all handshakes directly.

The Series 700 hardware also adds DMA capability to the parallel interface to increase the performance. Inside the I/O controller chip, a parallel-port DMA controller and a FIFO buffer are implemented. The FIFO buffer supports 32-byte inbound and 32-byte outbound data transfers. The hardware is capable of a 380-kbyte/s data transfer rate.

Serial Channel Communications

There are two RS-232-C serial ports in the I/O subsystem. They are fully compatible with the National NS16550A chip. Host communication to the serial ports and the status and control registers is done through the I/O controller chip. Each serial port uses a 16-byte inbound and a 16-byte outbound FIFO data buffer to increase efficiency.

The interface supports a 19.2-kbaud inbound data transfer rate with no data loss, using only software XON/XOFF flow control. Up to 230.4 kbaud inbound is supported with no data loss using hardware flow control. Hardware flow control is implemented by additional logic and controls the RTS line to the peripheral, thus preventing data overrun errors in the input FIFO buffer or input holding register. This feature is intended for use with high-speed serial devices that are capable of quickly suspending serial data transfers to the host when the RTS line is dropped by the host interface controller hardware.
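To give a feel for why flow control matters at these rates, the following rough calculation estimates how long a 16-byte FIFO takes to fill. It assumes 10 bit times per character (one start bit, eight data bits, one stop bit), which the text does not state; the function name is hypothetical.

```python
def fifo_fill_time_us(baud, fifo_bytes=16, bits_per_char=10):
    """Microseconds needed to fill the FIFO at a steady inbound rate."""
    chars_per_second = baud / bits_per_char
    return fifo_bytes / chars_per_second * 1e6


if __name__ == "__main__":
    for baud in (19200, 230400):
        print(f"{baud} baud: FIFO fills in about {fifo_fill_time_us(baud):.0f} us")
```

Under these assumptions the FIFO fills in roughly 8.3 ms at 19.2 kbaud but in under 0.7 ms at 230.4 kbaud, which is consistent with the higher rate relying on hardware control of the RTS line.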

Processor Dependent Hardware

The processor dependent hardware includes the boot ROM, EEPROM nonvolatile memory, status switches, and status LEDs, as well as the 8042 slave subsystem, which consists of the real-time clock, audio generator, and HP-HIL.

Two sockets are available for boot ROM. Initially, each socket can hold a 128K x 8-bit ROM, but the sockets are wired so that they are also capable of holding 512K-byte parts if necessary. Thus, there are 256K bytes of boot ROM available, expandable to a maximum of 1M bytes.

The I/O subsystem has an 8K x 8-bit EEPROM which may be used for storing system configuration status and any other alterable, nonvolatile information. The manufacturer of the EEPROM guarantees reliability of at least 10 years with a maximum total number of write cycles of 10,000 for any given byte.

The 8042 subsystem is taken from the HP Apollo Series 400 workstations and is composed of several devices: battery-backed real-time clock, system timers, user timers, audio generator, and HP Human Interface Loop (HP-HIL). These devices are controlled via an Intel 8042 slave microprocessor which acts as a server for these devices. Access to the devices is through the 8042 command/data protocol under control of the I/O controller chip. The HP-HIL supports eight input devices (keyboard, mouse, tablet, etc.) with a single system connector.

Porch Board

Since there are so many I/O connectors, a double-height bulkhead is needed to hold them all. The porch board is the means by which the signals get to the top level of the bulkhead. An 80-pin connector carries signals from the system board up to the porch board. The I/O connectors on the system I/O board are SCSI II, LAN AUI, and LAN BNC. The I/O connectors on the porch board are HP-HIL, audio, Centronics, and the two RS-232 ports. Other parts on the porch board are for ESD protection and EMI prevention.

I/O Controller Chip

The I/O controller chip is a 15,000-gate ASIC controller for the I/O subsystem. It interfaces with the memory and system bus controller chip and the system bus. Besides control functions, it contains registers for subsystem interrupts, a DMA channel for the parallel printer interface, scan path logic, some parallel printer interface related logic including a 32-byte bidirectional FIFO buffer, system bus address decoders, and other miscellaneous registers. The I/O controller chip is housed in a 160-pin PQFP (plastic quad flat package) with a large number of power and ground pins to minimize simultaneous switching noise. Two pins on the I/O controller chip are designated for scan-chain testing. The I/O controller chip also provides a tristate test mode, which facilitates board production testing. In the tristate test mode, all of the outputs and bidirectional pins of the I/O controller chip are tristated (in a high-impedance state, instead of being driven high or low). This test mode helps isolate the I/O controller chip while testing other parts on the board.

Fig. 2 shows the I/O controller chip block diagram. The I/O controller chip is the heart of the I/O subsystem. Its main functions are I/O subsystem address decoding, bus arbitration, interrupt control, peripheral device handshaking, and data flow control. The I/O controller chip interfaces with the memory and system bus controller chip, the system I/O controller, and the I/O subsystem devices. The I/O controller chip is designed to work at frequencies from 25 to 33 MHz with no modifications. It provides the following resources and functions:

* System bus interface and arbitration

* Address and data bus transceiver control

* Western Digital serial/parallel chip control

* DMA channel for parallel printer interface

* Intel 82596DX LAN chip control

* NCR53C700 SCSI chip control

* Intel 8042 control (HP-HIL, audio, real-time clock)

* Processor dependent hardware control (ROM, EEPROM, status registers, LEDs)

* Interrupt registers.

Arbitration.

The need for arbitration arises because the address and data buses for SCSI, LAN, and the I/O controller chip are multiplexed together on the I/O system board. The address buses of these devices are connected together directly. For the data buses, the I/O controller chip's 8-bit data bus first goes through the steering logic (74245s) and then connects with the SCSI and LAN data buses. This 8-bit data bus is also multiplexed with the Western Digital chip (WD16C552) and other I/O devices.

System Bus Interface.

The system bus interface block of the I/O controller chip combines the graphics interface and byte packing functions. For master (outgoing) transactions, the system bus interface is responsible for enabling and clocking the external address and data buffers and the I/O controller chip address out drivers, and for managing the system bus address valid/ready protocol. It also handles error conditions.

For slave (incoming) transactions, the system bus interface block detects the start of transactions to the core I/O system and generates the external signal for latching addresses in the external buffers. This is one of the most critical timing paths in the I/O controller chip and the core I/O system. The system bus protocol and tight timing make detecting and latching new valid core I/O addresses difficult. Furthermore, the address must be held steady for the length of the transaction and released in time to look at new addresses. The system bus interface decodes the upper eight bits of the incoming address to determine if the current transaction is to the core I/O system. The system bus interface also controls byte packing in the core-I/O-to-system-bus direction and byte unpacking in the system-bus-to-core-I/O direction for the 8-bit devices in the core I/O system.

Address Decoder.

The address decoder serves two main functions: selecting slave units by decoding incoming system-bus addresses, and synthesizing address bits 1 and 0 for devices that need them.

In general, only as much decoding as necessary is done to determine which unit is being addressed. Consequently, there is much aliasing. The upper eight address bits do not come into the address decoder. These are used instead by the system bus state machine, which generates a master select enable signal, selecten. To prevent select signals from being generated from spurious or invalid addresses, all the decodes are ANDed with the selecten signal. Since the address is latched externally, it is guaranteed valid and stable for the entire time selecten is asserted.
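The gating of partial decodes by selecten can be sketched as follows. This is a behavioral illustration only. The upper-byte value 0xF0 is an assumption inferred from the f08xxxxx addresses used in the sample test program later in the article, and the unit map (names and bit positions) is entirely hypothetical.

```python
CORE_IO_UPPER = 0xF0  # assumed value of the upper eight address bits for core I/O

# Hypothetical unit map: only address bits 18:16 are examined, so many
# different addresses alias to the same unit. Real bit assignments are not given.
UNITS = {0: "unit_a", 1: "unit_b", 2: "unit_c", 3: "unit_d"}


def selecten(addr):
    """Master select enable: true when the upper eight bits match core I/O."""
    return ((addr >> 24) & 0xFF) == CORE_IO_UPPER


def decode(addr):
    """Return the selected unit name, or None if no select is generated."""
    if not selecten(addr):      # every decode is ANDed with selecten
        return None
    return UNITS.get((addr >> 16) & 0x7)


if __name__ == "__main__":
    for addr in (0xF0800008, 0xF0880008, 0x10800008):
        print(hex(addr), decode(addr))
```

In this sketch the first two addresses differ only in a bit the decoder ignores, so both select the same unit, while the third is rejected because selecten is false.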

Interrupt.

Inside the I/O controller chip there are an interrupt request register (IRR) and an interrupt mask register (IMR) similar to the interrupt structure of the PA-RISC architecture.

An interrupt pending register (IPR) is also provided. All of these registers appear to be 32 bits wide and are accessed as such. However, only the 15 least-significant bits are implemented for each register. The remaining bits are not affected by writes and are always read as zeros.

The possible sources of the defined interrupts are:

* NMI from EISA

* 8042 general interrupts

* 8042 high-priority interrupts

* Reserved

* Reserved

* WD16C552 SIO 1

* WD16C552 SIO 2

* WD16C552 parallel printer interface

* LAN

* SCSI

* EISA

* Graphics1

* Graphics2

* SIO

* Domain keyboard.

The interrupt pending register (IPR) is used to latch incoming interrupts and indicate that they are pending. The external interrupts are synchronized in the IPR and an active edge on any synchronized signal causes the corresponding IPR bit to be set to 1.

The interrupt mask register (IMR) is a read/write register used to mask pending interrupts. A 1 in an IMR bit enables the corresponding pending interrupt to create an interrupt request.

External devices must assert interrupts for just over two CPU clock cycles (one I/O subsystem cycle) to be synchronized and detected. The interrupt must also deassert for at least two CPU clock cycles (one I/O subsystem cycle) for the next assertion to be recognized.
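The register behavior described in this section can be modeled in a few lines. This is a sketch under stated assumptions: the active edge is taken to be a rising edge, the request bits are taken to be the bitwise AND of the IPR and the IMR, and clearing of pending bits is not modeled because the text does not describe it. The class and method names are hypothetical.

```python
IMPLEMENTED = (1 << 15) - 1  # only the 15 least-significant bits are implemented


class InterruptRegs:
    def __init__(self):
        self.ipr = 0      # interrupt pending register
        self.imr = 0      # interrupt mask register
        self._last = 0    # previously sampled (synchronized) input levels

    def sample(self, lines):
        """Latch a pending bit for every input that went from 0 to 1 (assumed active edge)."""
        lines &= IMPLEMENTED
        rising = lines & ~self._last
        self.ipr |= rising
        self._last = lines

    def write_imr(self, value):
        """Writes to unimplemented bits have no effect."""
        self.imr = value & IMPLEMENTED

    def read_imr(self):
        """Unimplemented bits always read as zero."""
        return self.imr

    def request(self):
        """Pending interrupts that the mask lets through (assumed to be IPR AND IMR)."""
        return self.ipr & self.imr


if __name__ == "__main__":
    regs = InterruptRegs()
    regs.write_imr(0xFFFFFFFF)
    regs.sample(0b100)
    print(hex(regs.read_imr()), bin(regs.request()))
```

Writing all ones to the mask and reading it back returns only the 15 implemented bits, and a masked interrupt stays latched in the IPR even though it raises no request.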

SCSI.

Because of timing and loading constraints on the system bus, part of the I/O controller chip is used to implement the control logic between the NCR53C700 SCSI controller and the system bus. The SCSI interface block primarily handles the control signals while the system bus interface controls the address and data buffers between the system bus and the NCR chip.

For the address and data path, the address buffers (74543s) are in transparent mode for SCSI master transactions and in latch mode for slave transactions. The data buffers (74646s) are in transparent mode for master and slave write transactions and in latch mode for master and slave read transactions.

LAN.

Part of the I/O controller chip is also used to implement the control logic between the Intel 82596DX LAN controller and the system bus. The LAN subblock controls the Intel 82596DX. The slave subblock controls slave operations (port and channel attention), and the master subblock interfaces to the system bus on master operations.

A slave mode transaction is one that is initiated by the processor. The three slave mode transactions to the 82596DX are channel attention, port select, and hard reset.

When the 82596DX wants the bus for DMA master transactions, it asserts HOLD. The I/O controller chip LIOARB block arbitrates among the core I/O masters (LAN, SCSI, and Centronics DMA) and requests the system bus. The I/O controller chip then asserts HLDA to the 82596DX and lets the 82596DX keep the bus as long as it wants to (until HOLD is deasserted). The I/O controller chip also enables the address buffers from the core I/O system to the system bus.

Parallel Port and DMA.

The Western Digital WD16C552 chip is used at the parallel port as the device driver/receiver interface. The Centronics compatible features and the additional functionality including that required by the HP ScanJet products are controlled through the I/O controller chip.

The parallel port's DMA controller inside the I/O controller chip is really a bidirectional data accumulator that masters transactions on the system bus. Its main purpose is to better match the relatively low-bandwidth parallel port to the high-performance system bus and give the port direct access to memory without affecting CPU performance very much. It buffers data for both the read and write functions of the built-in parallel port. It collects data from the system bus and sends it to the parallel port and vice versa. The system bus connection is a 32-bit-wide bus and the parallel port connection is 8 bits wide, so in addition to dealing with the high-speed system bus interface the DMA controller also does byte packing.
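The byte packing performed between the 8-bit parallel port and the 32-bit system bus can be illustrated as below. The sketch assumes big-endian ordering (first byte in the most significant position) and handles only whole words; the function names are hypothetical, and the byte-lane ordering actually used by the hardware is not stated in the text.

```python
def pack_bytes(data):
    """Pack a byte string into 32-bit words, first byte most significant (assumed order)."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 in this sketch")
    return [int.from_bytes(data[i:i + 4], "big") for i in range(0, len(data), 4)]


def unpack_words(words):
    """Inverse of pack_bytes: split 32-bit words back into a byte string."""
    return b"".join(w.to_bytes(4, "big") for w in words)


if __name__ == "__main__":
    words = pack_bytes(bytes([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE, 0xF0]))
    print([hex(w) for w in words])
    print(unpack_words(words).hex())
```

Packing four port bytes into one bus word means the system bus is used once for every four parallel-port transfers, which is the bandwidth matching the paragraph describes.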

I/O Verification

The I/O verification strategy was divided into two parts: a stand-alone I/O system test using a tool called the bus exerciser, and a complete system test using the actual CPU, memory and system bus controller chip, and memory models. The stand-alone verification strategy will be discussed here. The full-system verification strategy is covered in the article on page 34.

The bus exerciser is a tool that verifies the correctness of the I/O subsystem design in the behavioral simulation environment. The bus exerciser is written in the MADL description language (see "Simulation Toolset," page 36). The bus exerciser is a kind of substitute for the CPU and the memory and system bus controller chip in that it executes test programs and generates corresponding transactions on the system bus to stimulate the I/O subsystem and check the correctness of the response from the I/O subsystem.

There were three main reasons for using the bus exerciser to verify the I/O subsystem design. First, the CPU, memory and system bus controller chip, and memory models were not available in the early phases of the I/O controller chip design. Using the bus exerciser eliminates the need for stable working models of the CPU and the memory and system bus controller chip before testing the rather independent I/O subsystem. Second, the bus exerciser models the CPU, the memory and system bus controller chip, the EISA card, and the memory subsystem in a simplified fashion. Thus simulation in the bus exerciser environment is about ten times faster than simulation with the CPU and memory and system bus controller chip models. Third, when running with the bus exerciser, we do not see cache misses and TLB misses. All the DMA data setup and result checking are done through the bus exerciser's virtual hardware in one clock cycle rather than using the CPU model running PA-RISC instructions to do data setup and result checking. As a result, we can issue much more intense and simultaneous I/O activities on the system bus, which helps stress the I/O subsystem.

Fig. 3 is a block diagram of the bus exerciser. There are five main sections:

* Instruction array (100 instructions deep)

* Memory (4K bytes)

* System bus controller

* Self-checking logic

* Stress test state machines.

Instruction Array.

To simplify the design, the bus exerciser implements an array that holds up to 100 instructions. The instruction array is initialized through guide vectors before the simulation begins. The bus exerciser can execute 1-, 2-, or 4-byte reads and 1-, 2-, or 4-byte pipelined or nonpipelined writes.

Each instruction in the instruction array is 80 bits wide. A simple programming language was developed to simplify test development. The program contains several fields. The first field specifies whether the operation is a read or a write. The second field specifies the address from which to read or to which to write. The next field specifies the data expected on a read or the data to write on a write operation. A fourth field might be used on a write operation to specify pipelined mode. An example of the programming language is as follows (the lines beginning with # are comments):

##### INTERRUPT TEST (int.inst1)
# Read interrupt pending register
r f0800008 0
# Read interrupt mask register
r f0800004 0
# enable interrupt mask
w f0800004 ffffffff p
# master clear
wb f082f000 0
# read interrupt mask reg
r f0800004 0

This programming language is translated to guide vectors using a simple HP-UX shell script. Each entry in the resulting file contains 80 bits of information:

Bits 31:0 store the data information and bits 63:32 store the address information. Bits 79:76 are the opcode, such as read word or write byte. Bits 75:72 give pipeline information for the write operations. Bits 71:68 give information on whether or not to compare the read data. On a read instruction, if the expected data is known, then that data can be included in the data field (bits 31:0).

The table below further describes the instruction bits:

inst[79:76]

0000 : read word

0001 : write word

0010 : read byte

0011 : write byte

0100 : read 2 bytes

0101 : write 2 bytes

inst[75:72]

0000 : no-pipeline write

0001 : pipeline write

inst[71:68]

0000 : no-read data compare

0001 : read data compare

Listed below is an example of the first five entries of an instruction array corresponding to the sample program above.

input busex/inst[0] 0 %h0010f080000800000000
input busex/inst[1] 0 %h0010f080000400000000
input busex/inst[2] 0 %h1100f0800004ffffffff
input busex/inst[3] 0 %h3100f082f00000000000
input busex/inst[4] 0 %h0010f080000400000000
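The field layout described above can be checked by decoding the listed guide vectors. The following sketch follows the bit positions and opcode table given in the text; the function and dictionary key names are hypothetical, and bits 67:64, which the text does not describe, are ignored.

```python
OPCODES = {
    0: "read word", 1: "write word", 2: "read byte",
    3: "write byte", 4: "read 2 bytes", 5: "write 2 bytes",
}


def decode_instruction(hex_word):
    """Decode one 80-bit bus exerciser instruction given as 20 hex digits."""
    if len(hex_word) != 20:
        raise ValueError("expected 20 hex digits (80 bits)")
    value = int(hex_word, 16)
    return {
        "opcode": OPCODES.get((value >> 76) & 0xF, "unknown"),  # bits 79:76
        "pipelined": ((value >> 72) & 0xF) == 1,                # bits 75:72
        "compare": ((value >> 68) & 0xF) == 1,                  # bits 71:68
        "address": (value >> 32) & 0xFFFFFFFF,                  # bits 63:32
        "data": value & 0xFFFFFFFF,                             # bits 31:0
    }


if __name__ == "__main__":
    for word in ("0010f080000800000000", "1100f0800004ffffffff"):
        print(decode_instruction(word))
```

Decoding the first entry gives a word read of address f0800008 with data comparison enabled, and the third entry gives a pipelined word write of ffffffff to f0800004, matching the first and third instructions of the sample program.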

Memory.

Inside the bus exerciser model is a memory array of 1024 entries by 32 bits. This is provided mainly because the SCSI and LAN models need memory to perform their DMA operations. The memory block shared between the LAN and the CPU is also hardwired into the memory model to minimize the tedious LAN DMA setup.

System Bus Controller.

The system bus controller is the "brain" of the bus exerciser. The bus controller is implemented in a large, complex state machine. The bus controller fetches and executes instructions from the instruction array according to the system bus protocol. The bus exerciser is the bus master and the I/O controller chip is the slave while instructions are being fetched and executed. Instructions are fetched and executed until the instruction array is empty or until an I/O device requests the bus. When an I/O device requests the bus, the bus exerciser becomes the slave and the I/O controller chip is the bus master.

Listed below are the system bus functions supported by the system bus controller. Some functions were described in the "Instruction Array" section above. In this context, "host" is the bus exerciser and "guest" is the I/O system controller.

* One-byte, two-byte, or four-byte host-generated pipelined write

* One-byte, two-byte, or four-byte host-generated nonpipelined read/write

* One-byte, two-byte, or four-byte guest-generated pipelined read/write

* One-byte, two-byte, or four-byte guest-generated nonpipelined read/write

* Interrupt handling

* Error generation and error handling

* Bus arbitration.

Self-Checking Logic.

Extensive self-checking capability was added to the bus exerciser model to minimize human intervention. For instance, the test program can specify the expected data for each read instruction. Upon completion of the read transaction, the bus exerciser will check the returned data against the expected data and raise an error flag if an inconsistency occurs.

For LAN, SCSI, and Centronics DMA operation, the bus exerciser will remember the data transferred from the memory to the I/O devices and later perform self-checking while the previously transferred data is looped back from the I/O device to memory.

Stress Mode.

Normally a test will be written in the bus exerciser programming language, converted to guide vectors, and run on the I/O system model. The tests are relatively short since the instruction array can only hold up to 100 instructions. These short tests are usually used to test a particular device, such as the HP-HIL, LAN, or SCSI.

However, situations arise in which we would like to use the bus exerciser to generate I/O events randomly to stress the I/O subsystem over a long period of time. The bus exerciser uses stress mode to run randomly generated programs that can run for any specified length of time. The goal is for the bus exerciser to create different corner cases randomly to stress the I/O subsystem, and to perform the necessary self-checking without human intervention.

Five state machines were added to the bus exerciser model for the stress tests. Three of the state machines are DMA state machines for the LAN, SCSI, and Centronics devices. The state machines contain blocks of code that can be put into the instruction queue and executed by the bus exerciser. The blocks of code tell the DMA devices in the I/O subsystem to do a DMA read or write. In the case of the SCSI, for example, the bus exerciser's SCSI state machine can send a block of code to the instruction queue which, when executed, causes the SCSI device to perform a DMA read from memory. Once the SCSI device finishes the DMA read, it sends an interrupt back to the bus exerciser. Next, the bus exerciser's SCSI state machine sends a block of code to the instruction queue, which causes the SCSI device to perform a DMA write back to memory. Once the SCSI device finishes the DMA write, it sends another interrupt back to the bus exerciser. Once the round trip DMA has been completed, the bus exerciser's SCSI state machine compares the data read from memory with the data written back to memory. If the comparison is not successful, an error is flagged. This process is repeated in a loop and the program can run for any length of time. During each loop, the DMA state machine gets its DMA transfer sizes and starting addresses randomly from a range of selections. An example of how the bus exerciser's SCSI state machine operates is illustrated in Fig. 4. The LAN and Centronics state machines behave in a similar fashion and all three state machines operate simultaneously but asynchronously. This results in a stressful simulation environment.
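The round-trip check performed by each DMA state machine can be sketched in software. This is a loose analogy rather than a model of the MADL implementation: the device is reduced to a function that returns the words it was given, the interrupts are omitted, and all names and the 16-word size limit are hypothetical. Only the 1024-entry by 32-bit memory size comes from the text.

```python
import random

MEM_WORDS = 1024  # bus exerciser memory: 1024 entries of 32 bits


def round_trip(memory, rng, device=list, max_words=16):
    """One loop: DMA read from memory, DMA write back to memory, then compare."""
    size = rng.randint(1, max_words)                     # random transfer size
    src = rng.randrange(0, MEM_WORDS // 2 - size + 1)    # random starting address
    dst = src + MEM_WORDS // 2       # keep source and destination from overlapping
    sent = memory[src:src + size]            # data the device reads from memory
    memory[dst:dst + size] = device(sent)    # data the device writes back
    return memory[dst:dst + size] == sent    # self-check of the round trip


def stress(loops, device=list, seed=0):
    """Run the round trip repeatedly and return the number of flagged errors."""
    rng = random.Random(seed)
    memory = [rng.getrandbits(32) for _ in range(MEM_WORDS)]
    return sum(0 if round_trip(memory, rng, device) else 1 for _ in range(loops))


if __name__ == "__main__":
    print("errors with a good device:", stress(100))
    print("errors with a bit-flipping device:",
          stress(100, device=lambda words: [w ^ 1 for w in words]))
```

Passing a device that corrupts the data shows the self-check flagging every loop, which is the kind of unattended detection the stress mode is meant to provide.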

In addition to the three DMA state machines there are two other state machines that make the system even busier. These state machines also send blocks of code to the instruction queue, but the instructions access non-DMA devices. The bus exerciser issues a highly mixed pattern of instructions to the I/O subsystem by using this combination of three DMA state machines and two non-DMA state machines.

Although the bus exerciser was used as a debugging tool, it in no way replaced full system verification. It was still necessary to simulate operation with the CPU and the memory and system bus controller chip to catch system-level bugs. However, more than 95% of the I/O controller chip logic bugs were caught in the bus exerciser environment before integration with the CPU and memory and system bus controller chip models. The result was a clean I/O controller chip that worked the first time and required no chip turnarounds.

Acknowledgements

The authors wish to acknowledge the efforts of other I/O subsystem design team members: Wayne Ashby, Ping Hao, Danny Lu, Rob Snyder, Frank Lettang, and project manager Andy De Baets. Additional thanks to Alan Wiemann and his verification team members: Robert Lin, Steve LaMar, Ali Ahi, and Greg Burroughs.

COPYRIGHT 1992 Hewlett Packard Company
COPYRIGHT 2004 Gale Group
