Fast turnaround of a structured custom IC design using advanced design tools and methodology
Rory L. Fisher
Through the use of several new tools and methodologies, a small team of engineers was able to design and verify a 1.7-million-FET chip in eight months. The tools and methodologies used included a set of guidelines and timing constraints that were met by the customer, a data path compiler, a highly tuned custom multiplier cell that was used in 87 locations, and an automated top-level power connection scheme.
The HP IMACC chip was developed to provide image processing capabilities. The initial target application is medical imaging with geological applications as a potential area of expansion. The graphical capabilities of IMACC include spatial filtering, edge detection and enhancement, image pan and zoom, image rotation, and window and level control. IMACC consists of three major components:
* The convolver circuit has a 3 x 3 programmable kernel(*) and can perform low-pass or high-pass spatial filtering, edge enhancement, and other functions.
* The interpolator is an implementation of a 4 x 4 bicubic convolution kernel.(*) The interpolator can be configured to perform pan, zoom, and rotation.
* A RAM-based lookup table is used for windowing and leveling of image pixel intensities.
In support of the various user-selectable operating modes, any or all of the three functional blocks may be active at a time. The order of operations can be changed as desired, with the single limitation that the convolver must precede the interpolator if both modules are in the chain. IMACC is the heart of the image visualization accelerator board. When this board is attached to the HP HCRX graphics subsystem, simultaneous convolution, zoom, rotation, and window and level control of 1024-by-1024-pixel, 16-bit medical images at 40 frames per second are possible. The accelerator can process more than 40 million pixels per second independent of the number or order of internal operations.
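The frame size and frame rate quoted above imply the stated pixel rate. The short Python check below is only an arithmetic sketch. The constant and function names are illustrative and do not come from the IMACC design.

```python
# Pixel throughput implied by the frame size and frame rate quoted in the text.
WIDTH = 1024             # pixels per line
HEIGHT = 1024            # lines per frame
FRAMES_PER_SECOND = 40   # frame rate quoted for the accelerator board


def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Return the number of pixels that must be processed each second."""
    return width * height * fps


rate = pixels_per_second(WIDTH, HEIGHT, FRAMES_PER_SECOND)
print(f"{rate:,} pixels per second")
```

The result, 41,943,040 pixels per second, is consistent with the claim of more than 40 million pixels per second.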
Customer Interaction
As a result of our experience in designing numerous ICs for various customers, our laboratory has developed some practical, informal guidelines for designing ICs. At the beginning of the IMACC project, we met with the customer (another HP laboratory) and discussed these guidelines along with project goals. The guidelines we provided to our customer are as follows:
* The prime directive: Signal groups such as multistate drivers on a single bus, multiple set signals into a flip-flop, or multiple set signals into a multiplexer, may cause drive fights and therefore need to be completely decoded from the current state of the control machine so that one and only one will fire. This requirement must hold even if the chip comes up in a random state. Exceptions to this have caused significant delays in schedule right before tape release.
* Signals require a consistent naming convention.
* Update flip-flops on the falling edge of the clock (single-edge timing).
* When glitchless values are required (Gray code counters, etc.), they must come directly out of flip-flops.
* Resets are typically heavily loaded and will probably cause timing problems. Have each control block latch its own version of the chip reset, then generate its own local reset. This helps timing at the expense of latency.
* Don't design multistate paths. Complete timing analysis of such a path is not possible in any design tool.
* Don't set and dump the same FIFO or RAM location at the same time.
* Keep Synopsys blocks small.
* Keep large register files in the data path.
* Use no clock uncertainty (skew) in Synopsys. It will be there, but is better allowed for by reducing the period.
* Don't allow Synopsys to try to fix hold problems. There should be none by design.
* When setting your timing constraints, allow some slack for RC delays, the local clock generator, and incremental delays that will be introduced when actual routing capacitances are substituted into the timing model. For example, 15 ns might be a good period constraint at 60 MHz.
* As constraints (timing and loading) become more accurate, make sure to continue to update them in your design. Accurate is better than conservative in Synopsys.
* Simulate at the board level as soon as possible.
* Simulate timing between blocks as soon as possible (schematic simulations are fairly accurate).
* Simulate the chip coming up in random states as soon as possible. A proven way to do this is to make sure the chip can come up with unknown values in all memory elements, including flip-flops and registers.
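The prime directive in the list above can be illustrated with a small exhaustive check. The Python sketch below is hypothetical: the two-bit control state, the three bus drivers, and both decode functions are invented for illustration and are not taken from the IMACC design. It enumerates every possible state, including the one the control machine never enters on purpose, because the chip may come up in a random state.

```python
from itertools import product


def partial_decode(s1: bool, s0: bool):
    """Hypothetical decode that assumes state 11 can never occur."""
    return (not s1 and not s0, s0, s1)


def full_decode(s1: bool, s0: bool):
    """Hypothetical fully decoded version: unused state 11 maps to driver 0."""
    unused = s1 and s0
    return ((not s1 and not s0) or unused, (not s1) and s0, s1 and (not s0))


def violations(decode, n_state_bits: int = 2):
    """Return every state in which the number of active enables is not one."""
    bad = []
    for bits in product((False, True), repeat=n_state_bits):
        if sum(decode(*bits)) != 1:
            bad.append(bits)
    return bad


print("partial decode violations:", violations(partial_decode))
print("full decode violations:   ", violations(full_decode))
```

In this sketch the partially decoded version enables two drivers in state 11, which is the kind of drive fight the guideline warns about, while the fully decoded version enables exactly one driver in every state.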
The highest-priority design goal was to have a working IMACC system to demonstrate at an upcoming conference. We created a schedule consistent with this goal incorporating the necessary checkpoints. Two of the most important checkpoints were:
* We were to deliver a top-level, schematic-based Verilog gate model to the customer so they could begin regression tests at the system level. This allowed them to identify design problems early.
* The customer was to freeze the function of major data path blocks by a scheduled date. This allowed us to construct the artwork in a single pass.
Setting a rigorous schedule as the first priority forced a streamlined design. Using a single clock domain made the design less complicated. A large data path block with non-critical functionality was eliminated because it would require too much design time. The die size was determined early and a previously characterized package was used. Timing budgets were kept conservative to ensure that IMACC would run at 45 MHz after parasitic loading was added.
Since all decisions were based on meeting our primary objective, "creeping featurism" was eliminated. The customer was informed of the schedule impact that the addition of a new function implied. In most cases they were not willing to suffer a postponement of the chip release date.
We also shortened the design time by making sure that each new gate model of the chip we sent to the customer included all the latest changes and had no syntax problems. To do this we used several tools (awsim, eval, etc.) to check for connectivity problems. We used an in-house history management system to maintain revision control of these gate models.
Custom Multiplier
One of the key pieces of circuitry on IMACC in terms of design leverage was an integer multiplier. The imaging algorithms that we implemented typically executed a large sum of product terms. As an example, the convolver block in IMACC sweeps a 3 x 3 matrix across the source image (a 2D array of pixels), multiplies the nine coefficients in the matrix by the corresponding pixels, and adds these nine products to compute a single new pixel value for the target image. Therefore, this block alone required nine multipliers and eight adders. With the customer's help we found that we could consolidate most of the multipliers into one design. The result is that in the IMACC design, a single 18 x 18 integer multiplier circuit is used 87 times.
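The operation count for the convolver can be seen in a direct software sketch of one output pixel. The Python below is an illustration only, not the IMACC hardware data path, and the function name and sample values are made up. Like many image-processing descriptions, it applies the kernel without flipping it.

```python
def convolve_pixel(image, kernel, row, col):
    """Compute one output pixel of a 3 x 3 kernel centered at (row, col).

    Returns (value, multiplies, adds) so the operation count is visible.
    """
    # Nine products: one per kernel coefficient.
    products = [
        kernel[i][j] * image[row + i - 1][col + j - 1]
        for i in range(3)
        for j in range(3)
    ]
    # Eight additions combine the nine products into one sum.
    total = products[0]
    adds = 0
    for p in products[1:]:
        total += p
        adds += 1
    return total, len(products), adds


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
low_pass = [[1, 1, 1],
            [1, 1, 1],
            [1, 1, 1]]
high_pass = [[0, -1, 0],
             [-1, 4, -1],
             [0, -1, 0]]

print(convolve_pixel(image, low_pass, 1, 1))
print(convolve_pixel(image, high_pass, 1, 1))
```

Each output pixel costs nine multiplies and eight adds regardless of the coefficient values, which is why the multiplier dominates the area of this block.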
It then became very important to make this multiplier as dense as possible. This problem was attacked on two levels. First, an area-efficient architecture was chosen. Second, the key multiplier cell was painstakingly designed and laid out to reduce its area as much as possible.
The architecture we chose was a radix-4 (two bits at a time) Booth-encoded array.[1] A standard multiplier array in our case consisted of 18 rows, each row representing the result of one of the multiplier bits times the whole multiplicand value. Since the Booth-encoded array looks at two bits of the multiplier times the whole multiplicand in each step, this cut the number of rows in half to nine. It turned out that the increase in the row height resulting from a more complex unit cell was well below the height of two of the standard rows. (However, there is additional circuitry that needs to be added to perform the Booth encoding of all of the sets of two multiplier bits.) As a side benefit, halving the number of multiplier rows also increased the circuit's speed.
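The row-count reduction can be seen in a software model of radix-4 Booth recoding. The Python below is a minimal sketch of the textbook technique, assuming a signed two's-complement 18-bit multiplier. The article does not state the operand format, and the function names are illustrative. Each recoded digit selects 0, plus or minus one times, or plus or minus two times the multiplicand, so an 18-bit multiplier produces nine partial-product rows instead of 18.

```python
def booth_radix4_digits(multiplier: int, bits: int = 18):
    """Recode a signed two's-complement multiplier into radix-4 Booth digits.

    Returns digits in {-2, -1, 0, 1, 2}, least significant first.
    """
    assert bits % 2 == 0
    assert -(1 << (bits - 1)) <= multiplier < (1 << (bits - 1))
    pattern = multiplier & ((1 << bits) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit zero to the right of bit 0
    for i in range(0, bits, 2):
        b0 = (pattern >> i) & 1
        b1 = (pattern >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(multiplicand: int, multiplier: int, bits: int = 18) -> int:
    """Sum one partial-product row per Booth digit."""
    rows = [
        digit * multiplicand * (4 ** i)
        for i, digit in enumerate(booth_radix4_digits(multiplier, bits))
    ]
    return sum(rows)


digits = booth_radix4_digits(-5)
print(len(digits), digits)
print(booth_multiply(1234, -5))
```

In hardware the multiplication by 4 to the power i is a fixed shift of two bit positions per row, and the times-two selection is a one-bit shift of the multiplicand, so no true multiplication is needed to form a row.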
This speed-up benefit helped with our second task of reducing the area of the critical multiplier cell. Our design requirement was to execute the 18 x 18 multiply in a single clock state (22 ns). When the circuit was initially built, it executed the multiply in about two thirds of the required time (about 14 ns). Therefore, it was possible to shrink the critical cell by reducing the FET sizes until the cell delay increased the multiply time to the 22-ns limit. Along with some diligent artwork layout, this produced an extremely dense cell.
Our final 18 x 18 Booth-encoded multiplier is 864.0 µm wide by 307.8 µm high for a total area of 0.266 mm². Using the standard multiplier array, this multiplier would have been 648.0 µm wide by 446.7 µm high for a total area of 0.289 mm². Considering just the areas of the multipliers, the savings per multiplier is 0.023 mm² and the total savings for IMACC is 2.0 mm². However, the aspect ratio of the standard multiplier did not fit well with the IMACC data path blocks (18 bits wide versus 24 bits wide). Taking into consideration the empty spaces that the standard multiplier would have produced, the savings per multiplier increases to 0.123 mm² and the total savings for IMACC increases to 10.7 mm².
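The area figures above can be reproduced from the quoted dimensions. The Python below is only an arithmetic check. The per-multiplier savings of 0.123 mm² that accounts for empty space is taken directly from the text, since the article does not give the dimensions needed to derive it.

```python
UM2_PER_MM2 = 1_000_000  # square micrometers in one square millimeter
INSTANCES = 87           # number of times the multiplier is used on IMACC


def area_mm2(width_um: float, height_um: float) -> float:
    """Area in square millimeters of a rectangle given in micrometers."""
    return width_um * height_um / UM2_PER_MM2


booth = area_mm2(864.0, 307.8)      # Booth-encoded multiplier
standard = area_mm2(648.0, 446.7)   # standard array multiplier
savings_each = standard - booth
savings_total = savings_each * INSTANCES

# Savings per multiplier including wasted space, as quoted in the text.
SAVINGS_WITH_EMPTY_SPACE = 0.123
savings_total_with_space = SAVINGS_WITH_EMPTY_SPACE * INSTANCES

print(round(booth, 3), round(standard, 3))
print(round(savings_each, 3), round(savings_total, 1))
print(round(savings_total_with_space, 1))
```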
The custom multiplier design cut schedule time for two reasons. First, using it 87 times in numerous blocks across the chip saved us the time we would have spent assembling (and possibly designing) multipliers for each specific need in the different blocks. Second, the significant area savings we realized made the job of top-level chip routing much easier and consequently faster, since the top-level blocks were smaller.
Data Path Compiler
We were able to save many hours of artwork layout time through the use of Dpc14,[2] a data path artwork generation tool that places and routes data path blocks. A data path is made up of hierarchical horizontal slices, often called macro cells, that are usually, but not always, bit-wise logic repeated as many times as needed. Examples of macro cells are a two-input, 32-bit register, an 18-bit ripple carry adder, another data path block, or a custom data path block. The program dpc_tiler was used to build many of these macro cells. Macro cells can be placed vertically or horizontally with respect to one another. Routing is done over and between macro cells to connect signals within the data path. The user has the ability to control placement and how signals are routed. The user can assign signals to specific data path tracks or defer routing of signals such that the allow layers for the signals (layers that can be used later in the design to route signals) are placed on unused tracks.
To use Dpc14, we first generated a Dpc14 input file, which was created from the schematic Block Description Language (BDL) file. The Dpc14 input file could also be generated from a Verilog netlist file. The tool bdl2dp was used to generate the input file from the schematic BDL for most of the IMACC data path blocks that used the schematic BDL. A simple input file is shown in Fig. 1.
[Figure 1 ILLUSTRATION OMITTED]
Artwork encapsulation information needed to be extracted for each macro cell used in the data path. The Perl script dpEncapInfo was used to extract information about how to connect to ports within the macro cell. It was necessary to specify to dpEncapInfo any wart cells on the left or right side of the macro cell. For instance, a typical DPLIB14 register has two wart cells on the right side of the macro cell. In this case we used dpEncapInfo -r 2 to build the encap_info file correctly. This specified that there were two wart cells on the right side of the register macro cell.
After the Dpc14 input file was created and encapsulation information had been extracted, the Dpc14 program was used to generate an artwork archive that could be read into our artwork editor. The Dpc14 file could then be edited if the artwork needed to be modified, allowing us to make as many iterations as needed to produce the desired result.
Local Toolbox
A number of relatively simple scripts and programs of our own devising were combined into a local toolbox for the project. The more significant of these are described here.
The mkcntl script uses Synopsys, block place-and-route, and other tools to go from a Synopsys netlist through schematic to artwork with parasitics in one command. Of course, one must iterate on the place-and-route portion to ensure workable size, form factor, and port locations. An iteration can be accomplished with mkcntl -b.
We used a connectivity tracer that reads schematic (scip) BDL and reports the connectivity of the specified instance or net. The trace is limited to one level of hierarchy. For net names, regular expression pattern matching is available.
We automated a lot of the top-level power connection of IMACC using two scripts. Several steps were involved in this methodology. First, two symbolic layers were used, one for VDD and one for GND, to define where the metal-4 power buses would go. Next, the getpwrconn script used trantor to find the intersection of the metal-3 power buses with the symbolic layers and dumped them into an artwork editor archive file. Last, the gen_pg_conn script placed contact-4 contacts in the intersection areas and filled the symbolic layers with metal 4.
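The geometric core of this flow is finding where metal-3 power buses overlap the symbolic metal-4 shapes. The Python below is a simplified sketch of that idea using axis-aligned rectangles for a single net. It does not reproduce trantor, getpwrconn, or gen_pg_conn, whose interfaces are not described here, and the Rect type, function names, and coordinates are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Rect(NamedTuple):
    """Axis-aligned rectangle: lower-left (x0, y0), upper-right (x1, y1)."""
    x0: float
    y0: float
    x1: float
    y1: float


def intersect(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the overlap of two rectangles, or None if they do not overlap."""
    x0, y0 = max(a.x0, b.x0), max(a.y0, b.y0)
    x1, y1 = min(a.x1, b.x1), min(a.y1, b.y1)
    if x0 < x1 and y0 < y1:
        return Rect(x0, y0, x1, y1)
    return None


def contact_regions(metal3_buses: List[Rect], symbolic: List[Rect]) -> List[Rect]:
    """Regions where contacts could join metal-3 buses to planned metal 4."""
    regions = []
    for bus in metal3_buses:
        for shape in symbolic:
            overlap = intersect(bus, shape)
            if overlap is not None:
                regions.append(overlap)
    return regions


# Hypothetical VDD example: two vertical metal-3 buses, one horizontal strap.
vdd_metal3 = [Rect(10, 0, 20, 100), Rect(60, 0, 70, 100)]
vdd_symbolic = [Rect(0, 40, 100, 50)]
print(contact_regions(vdd_metal3, vdd_symbolic))
```

A real flow would run this once per net, so that VDD buses are only matched against the VDD symbolic layer and GND buses against the GND layer.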
Our addallow script selects VDD, GND, and CLK metal-3 shapes by size and copies them to metal3.allow and contact4.allow layers, permitting connection to over-the-cell metal 4.
There are two shieldmaker scripts in our toolbox: addshield2 and addshield3. These scripts fill the unused areas of the intermediate (cross-channel) metal layer in routing channels. These areas are then tied to GND or VDD to provide crosstalk shielding for signals running the length of the channel.
Diode placing software was used to eliminate large numbers of charge collectors. This software finds traces longer than 1000 µm in a routing channel and locates areas where diodes can be added without introducing design rule check errors. This technique worked well for our project, which had a tight schedule, lots of charge collectors, and minimal timing problems.
All of the scripts that add shapes to a source block do so through intermediary block pieces to ease modification or rebuilding of the added function.
Results
The IMACC chip was demonstrated in systems at the Radiological Society of North America conference in Chicago on November 27, 1995.
The IMACC chip contains 1.7 million FETs in the HP BiCMOS14QC process operating at a 45-MHz clock frequency. It is predominantly a data path design with 98 integer multipliers performing 4 billion integer multiplies per second on 18-bit or larger operands. A breakdown by design style is as follows:
Style               Percent of Total   Number of Standard Cell   Thousands of FETs
                    FET Count          Gate Equivalents          per mm²
Data Path           60%                497,000                   22.0
Standard Cell       10%                53,000                    10.8
RAM, FIFO           21%                (>48K bytes)              42.7
Pads, Clock, etc.   9%
Some additional statistics for the IMACC project include: 2342 FETs/day, 8673 FETs/mm², and more than 550,000 standard cell equivalent gates.
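Several of the figures in this section can be cross-checked against one another. The Python below is only an arithmetic sketch. It assumes that every one of the 98 multipliers completes one multiply per clock cycle, which the article implies for the 18 x 18 multiplier but does not state for the others.

```python
MULTIPLIERS = 98          # integer multipliers on the chip
CLOCK_HZ = 45_000_000     # 45-MHz clock

# Assumption: each multiplier completes one multiply per clock cycle.
multiplies_per_second = MULTIPLIERS * CLOCK_HZ

# Values from the breakdown by design style.
fet_share_percent = {
    "Data Path": 60,
    "Standard Cell": 10,
    "RAM, FIFO": 21,
    "Pads, Clock, etc.": 9,
}
gate_equivalents = {"Data Path": 497_000, "Standard Cell": 53_000}

print(f"{multiplies_per_second:,} multiplies per second")
print(sum(fet_share_percent.values()), "percent of FETs accounted for")
print(f"{sum(gate_equivalents.values()):,} standard cell gate equivalents")
```

Under that assumption the multiply rate is about 4.4 billion per second, in line with the quoted 4 billion, and the two gate-equivalent entries add up to the 550,000 total.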
Acknowledgments
We would like to thank Rich Nash for his time spent developing Dpc14 and for his timely responses to our suggestions.
(*) A kernel is a functional unit that can be repeated as needed. A 3 x 3 programmable kernel performs a programmable function on a 3 x 3 array of pixels. A 4 x 4 bicubic convolution kernel performs a bicubic convolution on a 4 x 4 array of pixels.
References
[1] S. Waser and M.J. Flynn, Introduction to Arithmetic for Digital Systems Designers, Holt, Rinehart, and Winston, 1982, pp. 132-137.
[2] R. Nash and R. Martin, "Datapath Requirements for Structured Custom Design," Proceedings of the 1995 HP Design Technology Conference, pp. 411-418.
AUTHORS
Mixed-Signal Test Instruments
Robert A. Witte
An R&D section manager at HP's Electronic Measurements Division, Bob Witte is responsible for managing product development for test and measurement equipment. Previously he was the R&D project manager for the HP 54600 Series oscilloscopes and earlier worked as an R&D engineer in the HP oscilloscope lab. He also was an adjunct professor at Colorado Technical University, where he taught electrical engineering for several semesters. Before that, he was an R&D project manager at HP's Lake Stevens Instrument Division. He is named as an inventor in two patents, one on frequency-domain measurements and the other on HP 54645 acquisition techniques. He has authored two books, one on electronic test instruments and the other on spectrum and network measurements. He has also written numerous articles on electronic measurements and is a senior member of the IEEE. He is currently working towards a Management of Technology degree from the National Technological University. Bob received a BSEE degree in 1978 from Purdue University and an MSEE degree in 1980 from Colorado State University. He joined HP's Loveland Instrument Division in 1978. In his free time, he enjoys amateur radio (KB0CY) and outdoor activities in the Colorado mountains.
Testing a Mixed-Signal Design
Jerald B. Murphy
Jerry Murphy is the product manager for the HP 54600 Series waveform products. Before that he was the product manager for the HP 54120 family of microwave oscilloscopes and TDR products. He is professionally interested in developing marketing communication techniques for engineers using HP electronic instruments. He has written several articles and presented papers on topics ranging from TDR to oscilloscope measurements. Jerry was born in Tyler, Texas and received a BSEE degree from the University of Texas at Arlington in 1964. After graduating he worked in the aerospace industry before joining HP's Southern Sales Region at Richardson, Texas in 1969. He later joined the marketing group at HP's Colorado Springs Division. Jerry is married and he and his wife each have two grown children. They also have two grandchildren. One of Jerry's passions is free-flight aeromodeling. He is a member of a club devoted to this sport and also serves on the U.S. national rules committee for the Academy of Model Aeronautics. He is presently the holder of a U.S. national record for gas-powered free-flight models. He has flown competitively in Australia, Europe, Japan, and New Zealand and was the U.S. national champion in 1991. He also enjoys photography, hiking, and camping in the Colorado mountains and is active in the Boy Scouts.
Mixed Signal Oscilloscope Design
Matthew S. Holcomb
An R&D engineer at HP's Electronic Measurements Division, Matt Holcomb contributed to the acquisition system design of the HP 54645A/D oscilloscopes. Previously he contributed to the acquisition system ICs for the HP 54600 Series of oscilloscopes. He is professionally interested in HDL ASIC design and synthesis. He is named as the inventor in two patents on dithering and autoscale topology. Born in Augsburg, Germany, he received a BSEE degree in 1985 from Colorado State University and an MSEE degree from the California Institute of Technology in 1986. After graduation, he joined HP's Colorado Springs Division. Matt is married and has three children. He enjoys backpacking with his family and growing perennials and trees in his garden.
Stuart O. Hall
Stu Hall is a software development engineer at HP's Electronic Measurements Division, where he is responsible for embedded system software development. He contributed as a software designer on the HP 54645A/D oscilloscope project. Previously he developed software enhancements for the HP 54600 Series oscilloscope family. Earlier, as a manufacturing engineer, he developed a test system for oscilloscope manufacturing and co-authored an HP Journal article on the subject. He is professionally interested in user interface design and is named as an inventor in a pending patent on pan and zoom features. Born in Bryan, Texas, Stu received a BSEE degree from Texas A&M University in 1987. Before joining HP's Colorado Springs Division in 1988, he worked at Texas Instruments developing power supplies and motor controllers. Stu is married to another HP employee. He is a movie buff and a sports fanatic and is a resident consultant on home theater
Warren S. Tustin
As a hardware development engineer at HP's Electronic Measurements Division, Warren Tustin is responsible for the digital design of oscilloscope products. He contributed to the HP 54645A/D acquisition system and board level design. He is named as an inventor in two pending patents on displaying waveforms and on triggering techniques. He received a BSEE degree from Montana State University in 1980. After graduating he joined HP's Colorado Springs Division. Warren is married and has three young children. Both he and his wife teach Sunday school at their church. Warren also serves on the board and is active in the youth clubs. He enjoys family activities and in his free time likes doing small electronics projects.
Patrick J. Burkart
Patrick Burkart is an R&D software engineer at HP's Electronic Measurements Division. He is responsible for firmware design and development for the HP 54600 oscilloscope family and worked on the HP 54645A/D. He was awarded a BSEE degree from Villanova University in 1978 and MSEE and Engineer degrees in 1979 and 1981 from Stanford University. After graduating he worked at Raytheon Company doing microwave tube design, then later at Texas Instruments as a system engineer on an emitter location system and as a development engineer on a radar receiver subsystem. He joined HP's Colorado Memory Division in 1993 and worked on firmware and hardware improvements for tape drives. Patrick was born in Chicago, Illinois, is married, and has four children. In his free time he enjoys hiking and camping.
Steven D. Roach
An R&D engineer at HP's Electronic Measurements Division since 1988, Steve Roach has worked on oscilloscopes and oscilloscope probes. He contributed to the analog design of the HP 54645A/D oscilloscopes. He is professionally interested in analog circuit design and IC design and is named as an inventor in two patents on a precision programmable attenuator and a low-capacitance divider probe. He is a member of the IEEE and author of a chapter on oscilloscope signal conditioning in a book called The Art and Science of Analog Circuit Design. Steve was awarded a BS degree in engineering physics in 1984 from the University of Colorado and an MSEE degree in 1988 from Ohio State University. From 1984 to 1986 he worked as a computer designer at Burroughs Corporation.
Sustained Sample Rate
Steven B. Warntjes
An R&D project manager at HP's Electronic Measurements Division, Steve Warntjes manages hardware and software projects for the HP 54600 Series products. He has written numerous technical articles about emulators, logic analyzers, and oscilloscopes. He joined HP's Logic Systems Division in 1983 and worked on the hardware and software design of in-circuit emulators. He also did logic analysis and was the product line manager at HP's European Marketing Center. Born in Sheldon, Iowa, Steve received a BSEE degree from South Dakota State University. He is married and has three children. He is keenly interested in karate and has earned a fifth-degree black belt.
Acquisition Clock Dithering
Derek E. Toeppen
Derek Toeppen is an R&D engineer at HP's Electronic Measurements Division. He contributed to the hardware design of the HP 54600 Series products and the oscilloscopes reported on in this issue.
Logic Timing Analyzer
Steven B. Warntjes
Author's biography appears elsewhere in this section.
High-Sample-Rate Oscilloscopes
R. Scott Brunton
A software development engineer at HP's Electronic Measurements Division, Scott Brunton is responsible for new product development. He was the principal designer for the DSP subsystem and contributed to the system software architecture for the HP 54615B/16B oscilloscopes. In addition to software development for the HP 545xx oscilloscopes, he has done software development for the HP 165xx logic analyzer and hardware design for the HP 16520/21 pattern generator module for the HP 16500A. A member of the IEEE, he is professionally interested in software reuse and quality improvement and in system-level design and architecture and has authored two articles on these subjects. Born in Toronto, Ontario, Canada, he received a bachelor's degree in 1980 and a master's degree in 1981, both in electrical engineering from the University of Waterloo. He joined HP Laboratories in 1981. In 1989 he received an MBA degree from the University of Colorado at Colorado Springs. He is also a professional engineer in the State of Colorado. Scott is married and has three children. He is a Western Region trainer for the Boy Scouts of America and is the Referee-in-Chief of the Rocky Mountain district of USA Hockey.
Dielectric Spectroscopy
Hideki Wakamatsu
Hideki Wakamatsu is an engineer at HP Japan's Kobe Instrument Division. He was the project leader for the development of the HP E5050A colloid dielectric probe and is named as an inventor in three patents concerning the symmetrical structure of the probe and the correction method. For the last ten years he has worked on the design of impedance meters and has developed new types of impedance bridge circuits, which are implemented in the HP 4278A/79A, HP 4285A, and HP 4291A impedance meters. Born in Fukuoka, Japan, Hideki earned a master's degree in electrical engineering from Yamaguchi University and joined HP in 1982. He loves scientific paradox.
ATM Network Impairment Emulator
Robert W. Dmitroca
Since joining HP's Communications Measurements Division in 1994, Robert Dmitroca has been a hardware design engineer working on the HP Broadband Series Test System (BSTS). He worked on the hardware design for the HP E4219A network impairment emulator. Since then he has become a product marketing engineer for the BSTS. Before coming to HP, he worked at Newbridge Networks where he designed an ATM network interface card for Sun workstations. He also worked at Glenayre Electronics where he designed CPU memory cards, voice storage cards, and clock sources. Born in Edmonton, Alberta, Canada, Robert received a BSEE degree from the University of Calgary in 1987. He is married and has one daughter. Running and hiking are his favorite hobbies.
Susan G. Gibson
A technical writer for HP's Communications Measurements Division (CMD), Susan Gibson is responsible for developing learning products and designing on-line help for CMD's video products. She recently created the learning products for the HP E4219A ATM network impairment emulator and before that for the HP Broadband Series Test System. Professionally interested in user-centered design, she is a member of the Society for Technical Communication and teaches technical writing at Simon Fraser University in British Columbia. She received a BA degree in 1977 in linguistics from the University of British Columbia and the next year did postgraduate studies in linguistics at McGill University. In 1994 she received a diploma in computer information systems from Langara College and was awarded the Governor General's medal for the work she completed in the program. Before joining HP she taught adult education including ESL, literacy, and effective writing and developed ESL and literacy curricula. Susan was born in New Haven, Connecticut, and has two children. Her outside interests include Jungian psychology, fiction writing, gardening, reading, hiking, traveling, and music.
Trevor R. Hill
Trevor Hill is currently a project manager in the digital video test area at HP's Communications Measurements Division. He was the project manager for the HP E4219A ATM network impairment emulator. He is professionally interested in hardware design and is named as an inventor in a patent involving data alignment. He is a member of the Association of Professional Engineers of Alberta. He was born in Edmonton, Alberta, Canada and received a BSEE degree in 1981 and an MBA in 1986, both from the University of Alberta, Canada. Before joining HP, he worked as an integrated circuit designer at Bell Northern Research, then at NovAtel Communications.
COPYRIGHT 1997 Hewlett Packard Company
COPYRIGHT 2004 Gale Group