CMOS Soft Errors and Server Design

Douglas Bossen
IBM Corporation

The occurrence of soft errors in today’s CMOS technology is at odds with the high-availability requirements of server customers. This presentation outlines the design methodology and a variety of fault-masking techniques in use in today’s servers to eliminate the effect of soft errors.

Douglas Bossen

Douglas C. Bossen is a Distinguished Engineer specializing in fault-tolerant computer hardware design. He has designed, and holds numerous patents on, error-correcting codes, error-detecting logic, error detection/fault isolation techniques, and system availability techniques, which are implemented in IBM systems 308X, 3090, ES/9000, and pSeries servers. Since joining IBM he has received two Outstanding Innovation Awards and a Corporate Award for his work on error detection and fault isolation. In his current position in Server Group RAS Architecture, he focuses on Server Group RAS competitive leadership by migrating the most cost-effective techniques within IBM’s extensive portfolio into the pSeries and iSeries brands during product development. He has published 18 peer-reviewed journal articles, has 24 issued US patents, 9 pending and 23 published, and has received IBM’s 10th Invention Plateau. He received the B.S., M.S., and Ph.D. degrees in electrical engineering, all from Northwestern University. He is a Fellow of the IEEE and a member of IBM’s Academy of Technology.