2008-06-29 14:20:28

There's no crash like /bin/bash

I recently ran into an ugly bug in the nss_ldap module. The bug was easy to spot: if the nss_ldap.conf file was not readable, any call to getpwent(3) or a similar function caused an immediate segmentation fault (a NULL pointer dereference).

However, the way I learned about the problem was a bit odd. Some of the unprivileged users on my server had bash as their login shell, and I started noticing many bash processes burning CPU time in a tight loop. I attached a tracer to one of them and saw that bash was stuck in an endless loop, invoking its SIGSEGV handler over and over.

Looking closer, I found that bash installs a single signal handler for a number of different signals that are supposed to terminate the process. That handler saves the bash_history file, uninstalls itself as the handler and re-delivers the signal to the process.

A large number of things are wrong with this code. Firstly, nothing prevents the same handler from being entered a second time for a different signal. If the second invocation runs while the first is still in progress, some cleanup functions are executed twice. Worse, on non-BSD systems, calling free() twice on the same buffer can open the door to arbitrary code execution. According to the Bash people, however, this is not a problem because the user would have to exploit his own shell.
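To make the first point concrete, here is a minimal sketch of one way to keep a shared handler from running its cleanup twice. This is not bash's code: the names on_fatal_signal, install_guarded_handlers, in_termination and cleanup_runs are made up for illustration, the "cleanup" is only a counter, and the choice of signals is arbitrary. A real termination handler would go on to end the process instead of returning.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

/* Hypothetical state, only here so the effect can be observed. */
static volatile sig_atomic_t in_termination = 0;
static volatile sig_atomic_t cleanup_runs = 0;

/* One handler shared by several signals, as in the situation described above. */
static void on_fatal_signal(int sig)
{
    (void)sig;
    if (in_termination)
        return;             /* a second signal arrived: do not clean up again */
    in_termination = 1;
    cleanup_runs++;         /* stand-in for the one-time cleanup */
}

/* Install the handler with every signal blocked while it runs, so a
 * different signal cannot interrupt it halfway through. */
static int install_guarded_handlers(void)
{
    static const int sigs[] = { SIGUSR1, SIGUSR2, SIGTERM };
    struct sigaction sa;
    size_t i;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fatal_signal;
    sigfillset(&sa.sa_mask);

    for (i = 0; i < sizeof sigs / sizeof sigs[0]; i++)
        if (sigaction(sigs[i], &sa, NULL) != 0)
            return -1;
    return 0;
}
```

The full sa_mask covers the case of a second signal arriving while the handler is running; the flag covers a second signal arriving after it has finished.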

On top of that, bash also uses the getpwent API inside the signal handler. With nss_ldap.conf unreadable, this made the segmentation fault handler itself segfault, leading to an endless loop of segmentation faults.
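For comparison, a handler that sticks to async-signal-safe calls could look roughly like the sketch below. Again, this is an illustration under my own assumptions, not a patch for bash: note_signal, die_on_fault, install_handlers, take_pending_signal and pending_signal are hypothetical names. Termination requests are merely recorded, so that history saving and anything touching getpwent(3) can happen later in ordinary code, while a fault such as SIGSEGV only writes a fixed message, restores the default action and re-raises.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Set by the handler, read by ordinary code. Hypothetical name. */
static volatile sig_atomic_t pending_signal = 0;

/* Asynchronous requests (SIGTERM, SIGHUP): just record which one arrived. */
static void note_signal(int sig)
{
    pending_signal = sig;
}

/* Synchronous faults (SIGSEGV): no cleanup at all. write(2), signal(2) and
 * raise(3) are on the POSIX list of async-signal-safe functions. */
static void die_on_fault(int sig)
{
    static const char msg[] = "fatal signal received\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
    signal(sig, SIG_DFL);   /* restore the default action ... */
    raise(sig);             /* ... which takes effect once this handler returns */
}

static int install_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sigfillset(&sa.sa_mask);

    sa.sa_handler = note_signal;
    if (sigaction(SIGTERM, &sa, NULL) != 0 || sigaction(SIGHUP, &sa, NULL) != 0)
        return -1;

    sa.sa_handler = die_on_fault;
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Called from ordinary code such as a main loop, where non-reentrant
 * functions are fine. Returns the recorded signal number, or 0 if none. */
static int take_pending_signal(void)
{
    int sig = pending_signal;
    pending_signal = 0;
    return sig;
}
```

A main loop would call take_pending_signal() at a convenient point, do its cleanup there, and then terminate. If the fault handler cannot itself fault, the endless loop described above cannot occur.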

Evidently, even after the emacs X11 input handling problems, the GNU people still haven't learned that signal handlers are not the proper place to call non-reentrant functions.

Posted by Tonnerre Lombard | File under: programming