Förderverein Informationstechnik und Gesellschaft


6 April 1999. Thanks to John Ganter. Source: for full report (50K).

                                         Unlimited Release
                                        Printed February 1999

Managing Errors to Reduce Accidents in High Consequence Networked Information Systems

John H. Ganter
Decision Support Systems Software Engineering
Sandia National Laboratories
P. O. Box 5800
Albuquerque, New Mexico 87185

This paper is based on a presentation at the Workshop on Information Assurance and Trustworthy Networks, held by the Cross Industry Working Team (XIWT) and Bellcore in Washington, D.C., 17-18 November 1998.


Computers have always helped to amplify and propagate errors made by people. The emergence of Networked Information Systems (NISs), which allow people and systems to quickly interact worldwide, has made understanding and minimizing human error more critical. This paper applies concepts from system safety to analyze how hazards (from hackers to power disruptions) penetrate NIS defenses (e.g., firewalls and operating systems) to cause accidents. Such events usually result from both active, easily identified failures and more subtle latent conditions that have resided in the system for long periods. Both active failures and latent conditions result from human errors. We classify these into several types (slips, lapses, mistakes, etc.) and provide NIS examples of how they occur. Next we examine error minimization throughout the NIS lifecycle, from design through operation to reengineering. At each stage, steps can be taken to minimize the occurrence and effects of human errors. These include defensive design philosophies, architectural patterns to guide developers, and collaborative design that incorporates operational experiences and surprises into design efforts. We conclude by looking at three aspects of NISs that will cause continuing challenges in error and accident management: immaturity of the industry, limited risk perception, and resource tradeoffs.


Introduction
Concepts for Describing Failures and Accidents in Systems
Some Terms for Describing Human Effects in Systems
System Defenses and Accident Trajectories
Paradoxical Defenses: Defenses that Have the Potential To Be Hazardous
Defenses Throughout the System Lifecycle
  . The design phase
  . The operations phase
  . Maintenance phase

Continuous Safety Management
Challenges in NISs
Conclusions
References