ETM -- A Program Exception and Termination Manager

Paul DuBois
dubois@primate.wisc.edu

Wisconsin Regional Primate Research Center
Revision date: 10 April 1997


Introduction


This document describes the Exception and Termination Manager (ETM), a simple(-minded) library to manage exceptional conditions that arise during program execution, and to provide for orderly program shutdown.

There are at least a couple of approaches one may adopt for handling error conditions within an application:

* Have functions always return a value and have all callers test the return value and respond accordingly.
* Force the program to give up and exit early.

Each approach has strengths and weaknesses. A difficulty with the first is that actions composed of many subsidiary actions, each of which may itself succeed or fail, can easily become very unwieldy when an attempt is made to handle all possible outcomes. On the other hand, such a program can continue in the face of extreme adversity.

An advantage of the second approach is that it is, conceptually at least, simpler to let a program die when a serious error occurs. The difficulty lies in making sure the program cleans up and shuts down properly before it exits. This can be a problem especially when a program uses a number of independent modules which can each encounter exceptional conditions and need to be shut down, and which may know nothing of each other. ETM is designed to alleviate the difficulties of this second approach.

The general architecture assumed for this discussion is that of an application which uses zero or more subsystems which may be more or less independent of each other, and which may each require initialization and/or termination. Other application-specific initialization and/or termination actions, unrelated to those of the subsystems, may also need to be performed: e.g., temporary files created at the beginning of the application need to be removed before final termination, network connections need to be shut down, or terminal state needs to be restored.

Ideally, when an application executes normally, it will initialize, perform the main processing, then shut down in an orderly fashion. This does not always occur. Exceptional conditions may be detected which necessitate a "panic" (an immediate program exit) because processing cannot continue further, or because it is judged too burdensome to try to continue.

An individual subsystem may be easily written such that a panic within itself causes its own shutdown code to be invoked. It is more difficult to arrange for other subsystems to be notified of the panic so that they can shut down as well, since the subsystem in which the panic occurs may not even know about them.

An additional difficulty is that some exceptions may occur for reasons not related to algorithmically detectable conditions. For instance, the user of an application may cause a signal to be delivered to it at any time. This has nothing to do with normal execution and cannot be predicted.

The goals of ETM are thus twofold:

(1) Panics triggered anywhere within an application or any of its subsystems should cause orderly shutdown of all subsystems and the application itself.
(2) Signals that normally terminate a program should be caught and trigger a panic to allow shutdown as per (1).

Processing Model


The model used by ETM is that the application initializes subsystems in the order required by any dependencies among them, and then terminates them in the reverse order. The presumption here is that if subsystem ss2 is dependent upon subsystem ss1, then ss1 should be initialized first and terminated last; given the dependency, it is unlikely to be wise to shut down ss1 before ss2.
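The reverse-order rule can be pictured as a small stack of function pointers. What follows is only a minimal sketch of that idea and not ETM's actual implementation: the names DemoAddShutdownProc, DemoRunShutdownProcs and DemoRun, the fixed-size table, and the log buffer are all hypothetical, and ANSI prototypes are used so the fragment compiles on current compilers.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_PROCS 16             /* hypothetical fixed capacity */

typedef void (*DemoProcPtr) (void);   /* plays the role of ETMProcPtr */

static DemoProcPtr demoProcs[DEMO_MAX_PROCS];
static int demoProcCount = 0;
static char demoLog[64];              /* records the order of the calls */

/* Push a shutdown routine onto the stack. */
void DemoAddShutdownProc (DemoProcPtr p)
{
  assert (p != NULL && demoProcCount < DEMO_MAX_PROCS);
  demoProcs[demoProcCount++] = p;
}

/* Pop and call each routine, most recently registered first. */
void DemoRunShutdownProcs (void)
{
  while (demoProcCount > 0)
    (*demoProcs[--demoProcCount]) ();
}

void SS1End (void) { strcat (demoLog, "SS1End "); }
void SS2End (void) { strcat (demoLog, "SS2End "); }

/* ss2 depends on ss1, so ss1 is set up first and torn down last. */
const char *DemoRun (void)
{
  demoLog[0] = '\0';
  DemoAddShutdownProc (SS1End);   /* as SS1Init () would do */
  DemoAddShutdownProc (SS2End);   /* as SS2Init () would do */
  DemoRunShutdownProcs ();
  return demoLog;
}
```

Because the routines are popped from the top of the stack, the routine for the dependent subsystem runs while the subsystem it relies on is still usable.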

ETM must itself be initialized before any other subsystem which uses it. The initialization call, ETMInit(), takes as an argument a pointer to a routine which performs any application-specific cleanup not related to its subsystems, or NULL if there is no such routine.

Each of the subsystems should then be initialized. A subsystem's initialization routine should call ETMAddShutdownProc() to register its own shutdown routine with ETM, if there is one. (Some subsystems may require no explicit initialization or termination. However, if there is a shutdown routine, you should at least call ETMAddShutdownProc() to register it.)
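From the subsystem's side, the pattern might look like the following sketch. It assumes a hypothetical subsystem SS1 with a single piece of state, and it replaces the real ETM declarations with stand-ins (DemoProcRetType, DemoProcPtr, DemoAddShutdownProc) so that it compiles without etm.h; in a real program the calls would go to ETMAddShutdownProc() instead.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles without etm.h or the ETM library. */
typedef void DemoProcRetType;                    /* role of ETMProcRetType */
typedef DemoProcRetType (*DemoProcPtr) (void);   /* role of ETMProcPtr */

static DemoProcPtr demoRegistered = NULL;

/* Role of ETMAddShutdownProc (); remembers only the latest routine. */
DemoProcRetType DemoAddShutdownProc (DemoProcPtr p)
{
  assert (p != NULL);
  demoRegistered = p;
}

/* Hypothetical subsystem SS1 with one piece of state to release. */
static int ss1Active = 0;

DemoProcRetType SS1End (void)
{
  if (!ss1Active)       /* safe to call more than once */
    return;
  /* ... release whatever SS1Init () acquired ... */
  ss1Active = 0;
}

DemoProcRetType SS1Init (void)
{
  /* ... acquire resources ... */
  ss1Active = 1;
  DemoAddShutdownProc (SS1End);   /* register before returning to the caller */
}
```

Registering inside the initialization routine keeps the application from having to know the name of the shutdown routine at all.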

When the program detects an exceptional condition, it calls ETMPanic() to describe the problem and exit. ETMPanic() is also called automatically when a signal is caught. A message is printed, and all the shutdown routines that have been registered are automatically executed, including the application-specific one.
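How ETM's own signal handler is written is not described here. As a rough illustration of turning a signal into a panic, the sketch below uses a deliberately conservative technique that is not claimed to be ETM's: the handler only records the signal number, and the program checks for it later. The names DemoCatch, DemoCatchSignals and DemoPendingSignal are hypothetical; signal() and sig_atomic_t are standard C.

```c
#include <assert.h>
#include <signal.h>

/* Set by the handler; 0 means no signal has been seen yet. */
static volatile sig_atomic_t demoCaughtSignal = 0;

/* The handler only records the signal number and returns. */
void DemoCatch (int signo)
{
  demoCaughtSignal = signo;
}

/* Install the handler for two signals that normally end a program. */
void DemoCatchSignals (void)
{
  void (*prev) (int);

  prev = signal (SIGINT, DemoCatch);
  assert (prev != SIG_ERR);
  prev = signal (SIGTERM, DemoCatch);
  assert (prev != SIG_ERR);
}

/* Polled from the main loop; nonzero means it is time to panic. */
int DemoPendingSignal (void)
{
  return (int) demoCaughtSignal;
}
```

A main loop would test DemoPendingSignal() between units of work and, on a nonzero result, call ETMPanic() with a message naming the signal.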

ETM is designed to handle shutting down under unusual circumstances, but it also works well for terminating normally. Instead of calling ETMPanic(), the application calls ETMEnd(), which takes care of calling all the shutdown routines that have been registered. This is much like calling ETMPanic(), except that no error message is printed, and ETMEnd() returns to the caller.

It is evident that the functionality provided by ETM is somewhat like that of the atexit() routine provided on some systems. Some differences between the two are:

* atexit() is either built in or not available. ETM can be put on any system to which it can be ported (extent unknown, but includes at least SunOS, Ultrix, Mips RISC/os and THINK C).
* ETM is more suited for handling exceptional conditions.
* ETM shutdown routines can be installed and removed later. atexit() provides only for installation (although you could simulate removal by setting a flag which shutdown routines examine to see whether to execute or not).
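The flag trick mentioned for atexit() could be sketched as follows. Only atexit() itself is a standard library call; the names DemoCleanup, DemoInstallCleanup, DemoRemoveCleanup and the two counters are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

static int demoCleanupEnabled = 1;  /* cleared to "remove" the routine */
static int demoCleanupRuns = 0;     /* counts the times real work was done */

/* Registered with atexit (); does nothing once it has been "removed". */
void DemoCleanup (void)
{
  if (!demoCleanupEnabled)
    return;
  ++demoCleanupRuns;
  /* ... real cleanup work would go here ... */
}

/* atexit () returns nonzero if the routine could not be registered. */
void DemoInstallCleanup (void)
{
  int failed = atexit (DemoCleanup);

  assert (!failed);
}

/* The routine stays registered; the flag just turns it into a no-op. */
void DemoRemoveCleanup (void)
{
  demoCleanupEnabled = 0;
}
```

The routine is still called at exit after DemoRemoveCleanup(); it simply returns at once, which is the simulation of removal that the comparison above refers to.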

Here is a short example of how to set up and shut down using ETM.

   
   main ()
   {
     . . .
     ETMInit (Cleanup);  /* register application-specific cleanup */
     SS1Init ();         /* registers SS1End() for shutdown */
     SS2Init ();         /* registers SS2End() for shutdown */
     SS3Init ();         /* registers SS3End() for shutdown */
     ... main processing here ...
     ETMEnd ();          /* calls SS3End (), SS2End () and SS1End () */
     exit (0);
   }

Subsystems that are themselves built on other subsystems may follow this model, except that they would not call ETMInit() or ETMEnd().

If there is no special initialization or shutdown activity, and you don't care about catching signals, it is not necessary to call ETMInit() and ETMEnd(). The application may still call ETMPanic() to print error messages and terminate. (Even if the application does use ETMInit() and ETMEnd(), it is safe to call ETMPanic() before any initialization has been done, because nothing needs to be shut down at that point yet.)

If ETM itself encounters an exceptional condition (e.g., it cannot allocate memory when it needs to), it will--of course--trigger a panic. This should be rare, but if it occurs, ETM will generate a message indicating what the problem was.

Caveats


Shutdown routines shouldn't call ETMPanic(), since ETMPanic() causes shutdown routines to be executed. ETM detects loops of this sort, but their occurrence indicates a flaw in program logic. Similarly, if you install a print routine to redirect ETM's output somewhere other than stderr, the routine shouldn't call ETM to print any messages.
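One common way to detect such a loop is a re-entrancy flag. The sketch below shows that general technique only; how ETM itself detects and reports the condition is not described here. DemoPanic, BadShutdown and both counters are hypothetical names, and the demo returns where a real panic routine would exit.

```c
#include <assert.h>
#include <stdio.h>

static int demoInPanic = 0;         /* nonzero while a panic is in progress */
static int demoLoopsDetected = 0;

void DemoPanic (const char *msg);

/* A flawed shutdown routine: it panics, which it should not do. */
void BadShutdown (void)
{
  DemoPanic ("trouble during shutdown");
}

void DemoPanic (const char *msg)
{
  assert (msg != NULL);
  if (demoInPanic)
  {
    /* Re-entered from a shutdown routine: report it and back out. */
    ++demoLoopsDetected;
    fprintf (stderr, "panic loop detected: %s\n", msg);
    return;
  }
  demoInPanic = 1;
  fprintf (stderr, "%s\n", msg);
  BadShutdown ();                   /* stands for "run all shutdown routines" */
  demoInPanic = 0;                  /* a real panic would exit here instead */
}
```

The guard keeps the program from recursing forever, but the nested call is still a bug in the shutdown routine and should be fixed there.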

kill -9 (SIGKILL) is uncatchable, so no shutdown routines run when a program is killed that way, and there's nothing you can do about it.

Programming Interface


The ETM library should be installed in /usr/lib/libetm.a or local equivalent, and applications should link in the ETM library with the -letm flag. Source files that use ETM routines should include etm.h. If you use ETM functions in a source file without including etm.h, you will get undefined symbol errors at link time.

The abstract types ETMProcRetType and ETMProcPtr may be used for declaring and passing pointers to functions that are passed to ETM routines. By default these will be void and void(*)(), but on deficient systems with C compilers lacking the void type they will be int and int(*)(), the usual C defaults for functions.

These types make it easier to declare properly typed functions and NULL pointers. For instance, if you don't pass any shutdown routine to ETMInit(), use

   
   ETMInit ((ETMProcPtr) NULL);

If you do, use

   
   ETMProcRetType ShutdownProc () { . . . }
   . . .
   main ()
   {
     . . .
     ETMInit (ShutdownProc);
     . . .
   }

Descriptions of the ETM routines follow.

   
   ETMProcRetType ETMInit (p)
   ETMProcPtr     p;
   
   ETMProcRetType ETMEnd ()
   
   ETMProcRetType ETMPanic (fmt, ...)
   char *fmt;
   
   ETMProcRetType ETMMsg (fmt, ...)
   char *fmt;
   
   ETMProcRetType ETMAddShutdownProc (p)
   ETMProcPtr     p;
   
   ETMProcRetType ETMRemoveShutdownProc (p)
   ETMProcPtr     p;
   
   ETMProcRetType ETMSetSignalProc (signo, p)
   int  signo;
   ETMProcPtr     p;
   
   ETMProcPtr ETMGetSignalProc (signo)
   int  signo;
   
   ETMProcRetType ETMSetPrintProc (p)
   ETMProcPtr     p;
   
   ETMProcPtr ETMGetPrintProc ()
   
   ETMProcRetType ETMSetExitStatus (status)
   int  status;
   
   int ETMGetExitStatus ()
   
   ETMProcRetType ETMSetAbort (val)
   int  val;
   
   int ETMGetAbort ()