UNIX Signals
 

 What is a signal? A signal is a software interrupt.
 A signal is an asynchronous event that the application must deal with. Asynchronous means we do not know when it will occur.
 Signals notify a process that an event has occurred, and they also give processes a simple way to synchronize with one another.
 
 We will describe POSIX reliable signals, which standardize signal handling in UNIX.
 Now we write a program and terminate it with the Ctrl-C key, or with the kill command.
 
 Signal concepts Every signal has a name and a number. All the names start with SIG.
 All the numbers can be found in <signal.h>, or with 'man 7 signal', or check out the table on page 266.
 No signal has the number 0. This is reserved for a null signal.
 
 Signals can be generated in many circumstances. Typing a special key
 Hardware exception
 kill system call
 kill command
 Other conditions that a process should be aware of. For example, one of the process's children terminates.
 
 
 What a process can do when a signal arrives. A process cannot predict when it will receive a signal, since signals are asynchronous. It can, however, tell the kernel in advance which of three things to do.
 Ignore the signal. Two signals can never be ignored or caught -- SIGKILL and SIGSTOP.
 
 Catch the signal. Write a signal handling routine and do whatever you want there.
 
 Let the default action apply. The UNIX kernel has a default action for each signal. Terminate with coredump
 Terminate
 Ignore
 
 Page 266 has a complete list of the default actions.
 Now we test the signal processing using the kill command.
 
 
 A list of important signals
 SIGABRT
 SIGALRM
 SIGBUS
 SIGCHLD
 SIGFPE
 SIGINT
 SIGKILL
 SIGQUIT
 SIGSEGV
 SIGSTOP
 SIGTERM
 SIGUSR1
 SIGUSR2
 
 
 Signal API signal Despite what its name suggests, this routine does not send a signal (kill does). It simply installs (or more formally, registers) a handler routine for a particular signal.
 The signature is complicated. We will use the following "simplification". typedef void Sigfunc(int); A Sigfunc is a function that takes an integer as its only argument and does not return anything.
 
 Sigfunc *signal(int, Sigfunc *); The function 'signal' takes two arguments. The first argument is an integer, the signal number.
 The second argument is a pointer to a function of the type Sigfunc we defined earlier.
 The return value is another pointer to a function of type Sigfunc.
 
 
 Now put all these together. The function "signal" installs a handler (the second argument) for a particular signal (the first argument), and returns the previous handler as the return value.
 
 The <signal.h> header file provides three constants for handler. SIG_DFL Apply the default action.
 
 SIG_IGN Ignore the signal.
 
 SIG_ERR Error flag for "signal" routine.
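 The pieces above can be put together in a small sketch. It installs a handler with signal, checks the result against SIG_ERR, sends the signal to itself with raise, and then puts the previous handler back. The names on_usr1 and catch_usr1_once are made up for this illustration and are not from the textbook.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/* Counter updated by the handler; sig_atomic_t is safe to write there. */
static volatile sig_atomic_t usr1_count = 0;

/* Hypothetical handler: it only counts how often it was invoked. */
static void on_usr1(int signo)
{
    (void)signo;
    usr1_count++;
}

/*
 * Installs on_usr1 for SIGUSR1, raises SIGUSR1 once, then restores
 * the previous handler. Returns the number of times the handler ran,
 * or -1 if signal() reported SIG_ERR.
 */
int catch_usr1_once(void)
{
    void (*previous)(int);

    usr1_count = 0;
    previous = signal(SIGUSR1, on_usr1);
    if (previous == SIG_ERR)
        return -1;

    raise(SIGUSR1);             /* the handler runs before raise returns */

    if (signal(SIGUSR1, previous) == SIG_ERR)
        return -1;
    return (int)usr1_count;
}
```

 Calling catch_usr1_once() from main should report one invocation, assuming SIGUSR1 is not blocked when it is called.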
 
 
 Now we test the textbook example sigusr.c. Note that the pause function suspends a process until a signal arrives.
 The program installs a handler for SIGUSR1 and SIGUSR2, then waits for signals.
 The routine errdump terminates the process with a coredump.
 
 When a process 'exec's another program, all installed user-defined signal handlers are reset to the default action (ignored signals stay ignored). Without the old program text, how could the user-defined handlers keep working?
 
 When a process forks, the child inherits the installed user-defined signal handlers, and they keep working.
 
 
 Terminology and Semantics Generated A signal is generated when the event that causes the signal occurs.
 
 Delivered A signal is delivered to a process when the action for that signal is taken.
 
 Pending A signal is pending from the time it is generated until it is delivered.
 
 Blocking A process can block the delivery of a signal by means of its signal mask. When a signal is blocked, it remains pending.
 A blocked signal stays pending until the process either unblocks it or changes its action to ignore.
 
 If more than one signal is pending, there is no guarantee as to which one will be delivered first.
 If the same signal is generated more than once while it is blocked, is it delivered once or several times? POSIX allows both.
 
 
 kill and raise API kill Send a signal to a process or a group of processes.
 The sender specifies the target with the pid argument. pid > 0 Send to the process with the given id.
 
 pid = 0 Send to all processes with the same process group id as the sender.
 
 pid < 0 Send to all processes whose process group id equals the absolute value of pid. (pid = -1 is a special case: it means every process the sender has permission to signal.)
 
 
 Permission to send signal The superuser can send signals to any process.
 Other users can send a signal only when the real or effective user id of the sender equals the real or effective user id of the receiver.
 Signal number 0 is the null signal. kill performs its normal error checking but sends nothing, so it is often used to determine whether a process still exists.
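 A minimal sketch of the null-signal test follows. The function name process_exists is an assumption made for this example. Note that EPERM also means the process exists: the permission check failed, not the lookup.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>     /* getpid, used by callers of this sketch */

/*
 * Returns 1 if a process with the given id exists, 0 if it does not,
 * and -1 on any other error. Signal 0 asks kill to run its checks
 * without actually sending anything.
 */
int process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;
    if (errno == EPERM)     /* exists, but we may not signal it */
        return 1;
    if (errno == ESRCH)     /* no such process */
        return 0;
    return -1;
}
```

 For example, process_exists(getpid()) returns 1. The answer can be stale by the time it is used, because process ids are recycled after a process terminates.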
 
 
 raise  Send a signal to oneself.
 
 
 alarm and pause API alarm Arrange for a SIGALRM to be delivered to the calling process after a specified number of seconds.
 Only one alarm clock exists per process. If alarm is called before a previous alarm has expired, the new setting replaces the old one and the number of seconds left on the previous alarm is returned.
 
 pause Suspend the process until a signal is caught. pause returns only after the handler for the caught signal returns.
 
 Check out the sleep implementation using alarm (sleep1.c). The sleep routine makes the calling process idle for a given number of seconds.
 If the sleeping process is awakened early by a signal, the number of unslept seconds should be returned.
 There are some problems with this program. If the caller already uses alarm for another purpose, that earlier setting is lost.
 The caller may have installed its own signal handler for SIGALRM, so we need to save it and restore it.
 There is a race condition between the calls to alarm and pause. If the SIGALRM arrives between them, nothing will wake the process up once it is in pause. We need signal masks to solve these problems. See below.
 
 
 
 
 Signal Sets and signal masks Before we talk about signal masks we must be able to group the signals that we want to process into a set.
 Signal set processing API sigemptyset Make an empty set.
 
 sigfillset Make a signal set that has all the signals.
 
 sigaddset Add a signal into a set.
 
 sigdelset Delete a signal from a set.
 
 sigismember Determine whether a signal is in a set or not.
 
 These routines are typically implemented with a bit string, one bit per signal.
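 The set routines can be exercised with a short sketch. The function name build_usr_set is made up for this example. It starts from an empty set, adds two signals, and removes one again.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/*
 * Builds a set holding SIGUSR1 and SIGUSR2, then deletes SIGUSR2,
 * so only SIGUSR1 is left. Returns 0 on success, -1 if a call fails.
 */
int build_usr_set(sigset_t *set)
{
    if (sigemptyset(set) == -1)         /* always initialize first */
        return -1;
    if (sigaddset(set, SIGUSR1) == -1)
        return -1;
    if (sigaddset(set, SIGUSR2) == -1)
        return -1;
    if (sigdelset(set, SIGUSR2) == -1)
        return -1;
    return 0;
}
```

 After the call, sigismember reports 1 for SIGUSR1 and 0 for SIGUSR2. A sigset_t should always be initialized with sigemptyset or sigfillset before any other routine touches it.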
 
 Signal mask API sigprocmask This function modifies the signal mask of a process.
 There are three operation modes (indicated by the "how" argument). SIG_BLOCK Block the signals in the given signal set.
 
 SIG_UNBLOCK Unblock the signals in the given signal set.
 
 SIG_SETMASK Set the mask to a given value.
 
 
 This function also outputs the original mask through a pointer.
 Refer to the pr_mask routine for an example.
 
 sigpending This function returns (through a pointer) the set of signals that are blocked and currently pending.
 
 Now check out the example on page 295 (critical.c). The program first installs a signal handler for SIGQUIT.
 Then the program constructs a signal set with SIGQUIT in it, and uses sigprocmask to block SIGQUIT.
 The program then sleeps for 5 seconds. During this period SIGQUIT is not delivered (it is blocked), so it stays pending.
 Then the program calls sigpending and prints a message if SIGQUIT is pending.
 Finally the mask is restored, and as soon as this happens the signal is delivered.
 Note that the signal handler is set to the default action within sig_quit.
 
 
 sigaction This function is similar to "signal", but it can also examine the current signal handler without replacing it.
 The arguments are the following. An integer to indicate the signal number
 An input sigaction structure describing the action that we want to install.
 An output sigaction structure that receives the previous action.
 
 The interface consists of a special structure called sigaction, which has the following: The address of the handler
 A signal mask This mask will be automatically added when the signal handler is invoked, and removed when the handler finishes.
 The signal that invokes the handler will be automatically added into the mask. As a result this mechanism guarantees that the signal handler will not be interrupted again by the same signal.
 
 
 Now check out the implementation of "signal" using "sigaction" (page 298).
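 A sketch of "signal" built on top of sigaction is shown below. It follows the idea described in these notes (restart interrupted system calls for every signal except SIGALRM) but it is not a copy of the textbook listing, and the name my_signal is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

typedef void Sigfunc(int);

/*
 * Installs func as the handler for signo. Returns the previous
 * handler, or SIG_ERR if sigaction fails.
 */
Sigfunc *my_signal(int signo, Sigfunc *func)
{
    struct sigaction act, old_act;

    act.sa_handler = func;
    sigemptyset(&act.sa_mask);  /* block nothing extra in the handler */
    act.sa_flags = 0;
#ifdef SA_RESTART
    if (signo != SIGALRM)       /* SIGALRM is used to time out calls */
        act.sa_flags |= SA_RESTART;
#endif
    if (sigaction(signo, &act, &old_act) < 0)
        return SIG_ERR;
    return old_act.sa_handler;
}
```

 Usage mirrors signal: my_signal(SIGUSR1, SIG_IGN) returns the handler that was in place before, so the caller can reinstall it later.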
 
 sigsuspend This function sets the mask to a given value and waits for a signal. These two operations are done atomically.
 The purpose of this function (see the example on page 303) is to provide a way to unblock a signal and wait for it in a single step.  This could be used to protect a critical section.
 When sigsuspend returns (after the handler returns), the signal mask is set back to its value before the sigsuspend call.
 Test the textbook example (suspend1.c). First the program constructs a mask containing SIGINT, which is set to be the new mask.
 Then the program unblocks and waits for the signal.
 The output from pr_mask indicates that in the critical section, SIGINT is blocked.
 In the SIGINT handler, SIGINT remains blocked, since the signal being delivered is automatically added to the mask while its handler runs.
 Finally we restore the old mask.
 
 Use sigsuspend to synchronize processes (page 307). The synchronization involves three functions. TELL_WAIT Two signals are used in the synchronization process -- SIGUSR1 and SIGUSR2.
 This function initializes the signal handlers. Note that both signals use the same signal handler.
 The initialization blocks both SIGUSR1 and SIGUSR2, and stores the original mask in the variable old_mask.
 
 TELL_PARENT and TELL_CHILD These functions send SIGUSR2 and SIGUSR1 to the parent and child respectively.
 
 WAIT_CHILD and WAIT_PARENT These functions wait for the signals from child and parent respectively.
 These functions loop on sigsuspend until the signal handler sets the global variable sigflag to non-zero.
 If the sender executes first, the signal stays blocked (pending) until the receiver is ready.
 If the receiver executes first, it waits in sigsuspend (with the zero mask) until the sender's signal arrives. After the handler returns, the global variable sigflag must be reset to 0.
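 A compact sketch of the five routines and one round trip between parent and child is given below. The lowercase names (tell_wait, wait_signal, sync_demo and so on) are assumptions for this example, and one wait_signal routine plays the role of both WAIT_PARENT and WAIT_CHILD. Unlike the textbook version, this sketch keeps both signals blocked until the whole exchange is over.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t sigflag = 0;   /* set by the handler */
static sigset_t new_mask, old_mask, zero_mask;

static void sig_usr(int signo)      /* one handler for both signals */
{
    (void)signo;
    sigflag = 1;
}

static int tell_wait(void)
{
    struct sigaction act;

    act.sa_handler = sig_usr;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    if (sigaction(SIGUSR1, &act, NULL) == -1 ||
        sigaction(SIGUSR2, &act, NULL) == -1)
        return -1;
    sigemptyset(&zero_mask);
    sigemptyset(&new_mask);
    sigaddset(&new_mask, SIGUSR1);
    sigaddset(&new_mask, SIGUSR2);
    /* Block both signals and remember the original mask. */
    return sigprocmask(SIG_BLOCK, &new_mask, &old_mask);
}

static void tell_parent(pid_t pid) { kill(pid, SIGUSR2); }
static void tell_child(pid_t pid)  { kill(pid, SIGUSR1); }

static void wait_signal(void)       /* WAIT_PARENT and WAIT_CHILD */
{
    while (sigflag == 0)
        sigsuspend(&zero_mask);     /* unblock and wait atomically */
    sigflag = 0;                    /* ready for the next round */
}

/* Parent goes first, child answers. Returns 0 on success. */
int sync_demo(void)
{
    pid_t pid;
    int status;

    if (tell_wait() == -1)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                 /* child */
        wait_signal();              /* wait for the parent */
        tell_parent(getppid());
        _exit(0);
    }
    tell_child(pid);                /* parent */
    wait_signal();                  /* wait for the child */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

 Because the signals are blocked before the fork, it does not matter whether the parent or the child runs first: an early signal simply stays pending until the receiver calls sigsuspend.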
 
 
 
 
 
 A correct implementation of sleep (page 318) Now we are ready to implement sleep.
 sigaction installs our handler and saves the previous one so that it can be restored.
 sigprocmask manipulates the mask so that we can block SIGALRM when we want to.
 sigsuspend unblocks SIGALRM and waits for it in one atomic step.
 The race condition in the previous implementation is resolved.
 The semantics of sleep are as follows: If the requested time has elapsed, the function returns 0. The alarm function returns 0 in this case, just as we want it to.
 
 Otherwise the function returns the number of unslept seconds. The alarm function returns the number of seconds till it is supposed to wake up.
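 The three steps can be combined into the sketch below. It follows the outline given in these notes rather than reproducing the textbook listing, it is named my_sleep to avoid clashing with the library routine, and it does not try to preserve an alarm the caller had already set.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

static void sig_alrm(int signo)
{
    (void)signo;    /* nothing to do: returning wakes up sigsuspend */
}

/* Sleeps for nsecs seconds; returns the number of unslept seconds. */
unsigned int my_sleep(unsigned int nsecs)
{
    struct sigaction new_act, old_act;
    sigset_t new_mask, old_mask, susp_mask;
    unsigned int unslept;

    if (nsecs == 0)             /* alarm(0) would never fire */
        return 0;

    /* Install our handler and save the caller's. */
    new_act.sa_handler = sig_alrm;
    sigemptyset(&new_act.sa_mask);
    new_act.sa_flags = 0;
    sigaction(SIGALRM, &new_act, &old_act);

    /* Block SIGALRM so it cannot arrive before we wait for it. */
    sigemptyset(&new_mask);
    sigaddset(&new_mask, SIGALRM);
    sigprocmask(SIG_BLOCK, &new_mask, &old_mask);

    alarm(nsecs);

    /* Wait with the caller's mask, making sure SIGALRM is unblocked. */
    susp_mask = old_mask;
    sigdelset(&susp_mask, SIGALRM);
    sigsuspend(&susp_mask);     /* returns when any signal is caught */

    unslept = alarm(0);         /* 0 if the alarm already went off */

    sigaction(SIGALRM, &old_act, NULL);
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    return unslept;
}
```

 If another caught signal interrupts the wait early, alarm(0) cancels the timer and reports the seconds that were left, which is exactly the value sleep is supposed to return.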
 
 
 
 Abort function The function is similar to raise, but the signal it sends to the calling process is always SIGABRT.
 This function never returns to its caller, even if the user installs a handler for SIGABRT that returns (required by ANSI C).
 The purpose of letting a process catch SIGABRT is to let it clean up before it terminates.
 Now examine the implementation on page 311. First the program checks the current handler for SIGABRT, and replaces SIG_IGN with SIG_DFL if the user tries to ignore SIGABRT.
 Then the program checks the mask, and unblocks SIGABRT if the mask blocks it.
 Then it sends a SIGABRT to itself.
 Finally it checks whether the handler returned. If it did, the handler is replaced with SIG_DFL and SIGABRT is sent again, which makes sure the second SIGABRT kills the process.
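 The four steps can be sketched as below. This is a simplified illustration, not the textbook listing: it leaves out the flushing of standard I/O streams, and the names my_abort and abort_signal_in_child are assumptions. The second function runs my_abort in a child process so that the effect can be observed without killing the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

void my_abort(void)
{
    struct sigaction action;
    sigset_t abrt_only;

    /* Step 1: the caller may not ignore SIGABRT. */
    sigaction(SIGABRT, NULL, &action);
    if (action.sa_handler == SIG_IGN) {
        action.sa_handler = SIG_DFL;
        sigemptyset(&action.sa_mask);
        action.sa_flags = 0;
        sigaction(SIGABRT, &action, NULL);
    }

    /* Step 2: make sure SIGABRT is not blocked. */
    sigemptyset(&abrt_only);
    sigaddset(&abrt_only, SIGABRT);
    sigprocmask(SIG_UNBLOCK, &abrt_only, NULL);

    /* Step 3: send SIGABRT to ourselves. */
    kill(getpid(), SIGABRT);

    /* Step 4: still alive, so a handler caught it and returned. */
    action.sa_handler = SIG_DFL;
    sigemptyset(&action.sa_mask);
    action.sa_flags = 0;
    sigaction(SIGABRT, &action, NULL);
    kill(getpid(), SIGABRT);
    _exit(1);                       /* should never be reached */
}

/*
 * Runs my_abort in a child that tries to ignore SIGABRT. Returns the
 * number of the signal that terminated the child, or -1.
 */
int abort_signal_in_child(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(SIGABRT, SIG_IGN);   /* my_abort must override this */
        my_abort();
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

 abort_signal_in_child() reports SIGABRT even though the child asked to ignore that signal, which is the behavior the first step is there to guarantee.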
 
 
 System function POSIX requires that the system function ignores SIGINT and SIGQUIT, and blocks SIGCHLD.
 First see the example on page 312. In this example, SIGINT and SIGCHLD are handled by the main program, which launches an ed process by system.
 When the ed child process terminates, a SIGCHLD is sent to the main program. This is not acceptable since the main program may be waiting for other children to return -- it cannot tell whether the child came from the call to system or from its own fork.
 
 When we type the INT character in ed, all the foreground processes receive the signal, including the main program. This should not happen: the main program runs ed through system, so it does not want to be bothered by the signal.
 The solution is for the caller of system to ignore SIGINT and SIGQUIT while the command is running.
 
 
 Now check out the implementation on page 314. SIGCHLD is blocked.
 SIGINT and SIGQUIT are ignored.
 After the fork, the child restores the handlers for SIGINT and SIGQUIT and the mask. Then the child executes the command using sh.
 After the child (now running sh) terminates, the parent restores everything.
 We change the disposition of these signals before the fork to avoid a race condition. If we fork first and then change the disposition, a signal from the child running sh may arrive before the parent is prepared.
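 The sequence above can be sketched as follows. The function is named my_system to avoid clashing with the library routine, it assumes the shell lives at /bin/sh, and it is an illustration of the steps rather than the textbook listing.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs cmd with "sh -c"; returns the wait status, or -1 on error. */
int my_system(const char *cmd)
{
    struct sigaction ignore, save_int, save_quit;
    sigset_t chld_mask, save_mask;
    pid_t pid;
    int status = -1;

    if (cmd == NULL)
        return 1;                   /* a shell is assumed to exist */

    /* Before the fork: ignore SIGINT and SIGQUIT, block SIGCHLD. */
    ignore.sa_handler = SIG_IGN;
    sigemptyset(&ignore.sa_mask);
    ignore.sa_flags = 0;
    if (sigaction(SIGINT, &ignore, &save_int) < 0)
        return -1;
    if (sigaction(SIGQUIT, &ignore, &save_quit) < 0)
        return -1;
    sigemptyset(&chld_mask);
    sigaddset(&chld_mask, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &chld_mask, &save_mask) < 0)
        return -1;

    pid = fork();
    if (pid == 0) {
        /* Child: restore the caller's dispositions and mask. */
        sigaction(SIGINT, &save_int, NULL);
        sigaction(SIGQUIT, &save_quit, NULL);
        sigprocmask(SIG_SETMASK, &save_mask, NULL);
        execl("/bin/sh", "sh", "-c", cmd, (char *)0);
        _exit(127);                 /* exec failed */
    } else if (pid > 0) {
        /* Parent: wait for this child, retrying if interrupted. */
        while (waitpid(pid, &status, 0) < 0) {
            if (errno != EINTR) {
                status = -1;
                break;
            }
        }
    }

    /* Restore everything in the parent. */
    if (sigaction(SIGINT, &save_int, NULL) < 0)
        return -1;
    if (sigaction(SIGQUIT, &save_quit, NULL) < 0)
        return -1;
    if (sigprocmask(SIG_SETMASK, &save_mask, NULL) < 0)
        return -1;
    return status;
}
```

 The return value is a wait status, so the caller decodes it with WIFEXITED and WEXITSTATUS. For example, my_system("exit 3") yields an exit status of 3.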
 
 
 Interrupted system calls A system call could be "slow". For example, it could be waiting for I/O to complete (read or write), or waiting for a child process to terminate (wait and waitpid).
 When a slow system call is interrupted by a caught signal, the system call returns an error and errno is set to EINTR.
 To handle the possible interruption by signals, the programmer needs to code like the case on page 276, which is very tedious and error-prone.
 If the system can restart the interrupted call itself, we do not need to do it in our program. This feature was introduced by 4.2BSD.
 The automatically restarted system calls include ioctl, read, write, writev, wait, and waitpid.
 Figure 10.2 indicates the ability to restart system calls by various signal functions.
 Now we examine the program on page 298. Here we see that if SA_RESTART is defined, we always restart our system calls.
 The function alarm can be used to limit the time a user has to enter input. See the code on page 289. In this code we first set up an alarm of 10 seconds.
 If the user does not do anything, the alarm goes off and the read is interrupted.
 If the user does something before the alarm goes off, alarm(0) turns off the alarm.
 Now it is clear that we do not wish to restart a system call when the signal is SIGALRM.
 Now go back to page 298, and we see exactly what we just described.
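 A read with a time limit can be sketched as below. The handler is installed without SA_RESTART, so the read fails with EINTR when the alarm goes off instead of being restarted. The names on_alarm and timed_read are assumptions for this example, which is not the textbook code.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

static void on_alarm(int signo)
{
    (void)signo;    /* nothing to do: returning interrupts the read */
}

/*
 * Reads up to len bytes from fd, giving up after secs seconds.
 * Returns the result of read; on a timeout that is -1 with errno
 * set to EINTR.
 */
ssize_t timed_read(int fd, void *buf, size_t len, unsigned int secs)
{
    struct sigaction act, old_act;
    ssize_t n;
    int saved_errno;

    act.sa_handler = on_alarm;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;               /* no SA_RESTART on purpose */
    if (sigaction(SIGALRM, &act, &old_act) == -1)
        return -1;

    alarm(secs);
    n = read(fd, buf, len);
    saved_errno = errno;            /* keep the errno set by read */
    alarm(0);                       /* turn the alarm off */
    sigaction(SIGALRM, &old_act, NULL);
    errno = saved_errno;
    return n;
}
```

 Like the approach it illustrates, this sketch has a small race: if the alarm goes off after alarm is called but before read starts, the read is not interrupted. A caller that wants to retry on EINTR for other signals has to loop around the call itself.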
 
 
 Reentrant functions A reentrant function is one that can safely be entered again (for example from a signal handler) while an earlier call to it is still in progress. What is NOT reentrant code. Code that uses static data structures.
 Code that calls malloc
 Code that refers to the standard I/O library
 
 In fact all these reasons are the same -- they all use a single global data structure.
 
 We should not call functions that are not reentrant in a signal handler. The reason? The handler may have interrupted the main program (or another handler instance) in the middle of the same non-reentrant function, and a second call would corrupt the shared data.
 
 We also should save errno on entry to a signal handler and restore it before returning, since the interrupted code might still want to know its value.
 Now check the program on page 280. The main program executes an infinite loop. During each iteration it checks whether the user name it got back is inconsistent.
 The inconsistency will happen when SIGALRM interrupts the main program between the getpwnam and strcmp, since the handler calls getpwnam with a different user name.
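 A handler that follows both rules can be sketched as below. It calls only a reentrant function, touches only a sig_atomic_t flag, and saves and restores errno. The call close(-1) is a stand-in for any call that may fail inside a handler, and the names on_usr1 and errno_preserved are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t handler_ran = 0;

static void on_usr1(int signo)
{
    int saved_errno = errno;    /* save on entry */

    (void)signo;
    handler_ran = 1;
    (void)close(-1);            /* fails and sets errno to EBADF */
    errno = saved_errno;        /* restore before returning */
}

/*
 * Returns 0 if the errno value seen by the interrupted code survived
 * the handler, -1 otherwise.
 */
int errno_preserved(void)
{
    struct sigaction act, old_act;
    int seen;

    act.sa_handler = on_usr1;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    if (sigaction(SIGUSR1, &act, &old_act) == -1)
        return -1;

    handler_ran = 0;
    errno = ENOENT;             /* pretend an earlier call just failed */
    raise(SIGUSR1);             /* the handler runs here */
    seen = errno;

    sigaction(SIGUSR1, &old_act, NULL);
    return (handler_ran == 1 && seen == ENOENT) ? 0 : -1;
}
```

 Without the two lines that save and restore errno, the interrupted code would see EBADF from the handler instead of its own ENOENT.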