How to handle SIGSEGV, but also generate a core dump

Recently I ran into this problem. How do you capture SIGSEGV with a signal handler and still generate a core file?

The problem is that once you install your own signal handler for SIGSEGV, Linux no longer performs the default action for the signal, and the default action is what generates the core file. So, once you get SIGSEGV, consider all that useful information about the origin of the exception lost.

Luckily, there’s a solution. Here’s what I did.

You start by registering a signal handler. Once you get the signal, inside the signal handler, set the handler for that signal back to SIG_DFL. Then send yourself the same signal, using the kill() system call. Here’s a short code snippet that demonstrates this little trick in action.

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <signal.h>

void sighandler(int signum)
{
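    printf("Process %d got signal %d\n", getpid(), signum); /* demo only: printf() is not async-signal-safe */
    signal(signum, SIG_DFL);  /* restore the default action (core dump) */
    kill(getpid(), signum);   /* re-send the signal; the default action now applies */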
}

int main()
{
    signal(SIGSEGV, sighandler);
    printf("Process %d waits for someone to send it SIGSEGV\n",
        getpid());
    sleep(1000);

    return 0;
}

Note that this code doesn’t actually cause a segmentation fault. To simulate a segmentation fault, I ran kill -11 <pid> from the command line. This is what happened.

$ ls
sigs.c
$ gcc sigs.c
$ ./a.out
Process 2149 waits for someone to send it SIGSEGV
Process 2149 got signal 11
Segmentation fault (core dumped)
$ ls
a.out*  core  sigs.c

Obviously, without lines 9 and 10 in the code (the signal() and kill() calls in the handler), there would be no core file.

By the way, you can use this technique to handle any core-generating signal – SIGILL, SIGFPE, etc.

26 Comments

  1. [...] case you still want to handle exception signals, read my How to handle SIGSEGV, but also generate a core dump [...]

  2. dam says:

    Thanks Alexander. Nice trick. This is exactly what I was looking for.

  3. Gal says:

    I used this code in a shell I wrote in order to use ctrl-c & ctrl-v on processes the shell runs.
    In some scenarios it caused a segmentation fault.
    For example: a process that was suspended (on the suspended queue) is moved to the foreground (and removed from the queue), and then the handler causes a segfault. Do you know why? It also happens when I don’t remove it from the queue but only move it from suspended to foreground.
    Thanks

  4. pradeep says:

    Hi Alexander,
    I am new to Linux and trying to understand signals. Your articles are very interesting. I have tried the above program, but didn’t see the core dump file. This is what I tried:

    pradeepk@ipglx29> ./a.out
    Process 18123 waits for someone to send it SIGSEGV
    Process 18123 got signal 11
    Segmentation fault

    ls -l
    total 11
    -rwxrwxr-x 1 pradeepk users 9265 2010-09-20 12:15 a.out*
    -rw-rw-r-- 1 pradeepk users 396 2010-09-20 12:15 sig.c
    ~/IPC [ NONE ]

    From another terminal I have tried kill -11 pid

    Thanks for the help.

    Regards
    Pradeep

  5. latanius says:

    Nice, works, thanks Alexander! (btw, without this, catching a (real, null-pointer-dereference) SIGSEGV results in calling the signal handler again and again, presumably because the control returns to the faulting address after the handler runs… so this thing seems to be quite useful.)

    @pradeep:
    core dumps aren’t created by default, you have to explicitly enable them, see for example this article:
    http://aplawrence.com/Linux/limit_core_files.html

  6. @pradeep and @latanius, it seems you guys have found each other.
    Oh btw, I often use “ulimit -c unlimited” – that’s because I usually have no idea what will be the size of the core file.
    Thanks to both of you for your notes and please come again :-)

  7. Chris Eleveld says:

    I see you are re-signalling the same signal. Since the original signal is not guaranteed to be one that produces a core dump, have you considered:
    gcore(), abort(), or { char *cp = 0; *cp = '1'; }
    all of which seem to be fairly short and highly portable. Watch out for the last one if you have SIGBUS or SIGSEGV handlers installed. Also, you may need to use system() or fork/exec to run the gcore utility on Linux, since gcore does not seem to be standard on Linux yet.

  8. @Chris Eleveld
    I didn’t try any of these. What will happen if you open a core file that you generated with *cp = '1'; in gdb? I bet it will point to the line in the signal handler that generated the exception. If you do it the way I suggested, it will pinpoint the place where the exception took place, not the place you resent it from.

  9. Andreas says:

    Why reconfigure the signal handler? SIG_DFL is just a function pointer. You can also call SIG_DFL directly, something like this:

    void sighandler(int signum)
    {
        printf("Process %d got signal %d\n", getpid(), signum);
        SIG_DFL(signum);
    }

  10. @Andreas
    I believe you are right, but I’d still stick to the method I described. This is because the type of SIG_DFL is implementation dependent. No one promises that it is a pointer to a valid function.

  11. 10110111 says:

    You can call abort() from the signal handler. It will also generate a core dump, so there is no need for the magic you described.

  12. Bart says:

    Originally posted by 10110111: You can call abort() from signal handler, it will also generate core dump, so no need in magic you described.

    Using abort() in your signal handler will raise signal SIGABRT.
    Yes, you get a core file…… of your signal handler, not the process that actually crashed.

  13. id says:

    How did you give it signal 11? I executed it the same way but never got signal 11.

  14. Gabe Black says:

    @Andreas

    Yes, SIG_DFL’s type is a function pointer; however, if you look at its value, it is set to 0. So obviously you can’t call SIG_DFL(signum). A signal is actually captured first by the kernel, which forwards it to a custom signal handler if the handler is non-zero. Alexander is right to consider SIG_DFL implementation dependent and not to assume it is a pointer to a valid function (in the current implementation it does not point to one).

  15. @id
    Doesn’t kill -11 from the command line work?

  16. Mansour says:

    If you setup a custom handler using sigaction(), and set sa_flags = SA_RESTART, then after reverting the handler to SIG_DFL, you won’t need to send yourself SIGSEGV as the second attempt to execute the offending instructions will cause a normal segfault.

    Not sure how this is more useful, just a thought.

  17. [...] creation of a core dump, or signal something that a core dump is about to occur using this trick: How to handle SIGSEGV, but also generate a core dump – Alex on Linux But don't call printf like in the example since that's not in the list. [...]

  18. RickS says:

    Hi Alexander:

    I am trying to debug a program that uses the exact same method of re-triggering a SIGSEGV fault in the signal handler as you describe. However, when trying to analyze the resulting core dump, I cannot seem to get a useful backtrace to where the offending instruction occurred:

    (gdb) bt
    #0 0xb7b7d3b1 in kill () from /lib/libc.so.6
    #1 0xb7f8dd2d in CEventHandler::defaultMachineSignalHandler (signo=11) at ../source/Event.cpp:369
    #2 0xb7fc0420 in ?? ()
    #3 0x0000000b in ?? ()
    #4 0x00000033 in ?? ()
    #5 0x00000000 in ?? ()

    The stack contents between the initial SIGSEGV and the handler don’t allow me to see where the original fault occurred. The program is embedded, multi-threaded, uses shared objects and is cross-compiled from another system. The system it is running on is:
    uname -a
    Linux D400 2.6.18.2-ASAT #1 PREEMPT Wed Jul 21 11:40:47 MDT 2010 i686 i686 i386 GNU/Linux

    Do you know of a way to unwind the stack before the core dump is generated, so that I can get info on the original fault?

    Any help you could provide would be appreciated.

  19. Alex says:

    Hi Alex,

    I confirm the trouble reported by RickS.
    Your solution shows a valid call stack for a core generated on Ubuntu, but not on Red Hat Linux. In that case the backtrace shows something unrelated to the real place that caused the signal to be sent.

    So this is probably not a universal method.

  20. engineer says:

    Hi alexander,
    I double confirm the trouble reported by RickS and Alex.

    When I run a multi-threaded program, the core file seems to point to a place completely unrelated to where the segfault occurred. Is there any workaround for this?

    Say there are 3 threads: m (main), t1, t2.
    If the SEGV happened in t2, I see the core file point to some unrelated code in m. I think by the time you catch the signal, do cleanup and generate the core file, the stack has changed.

  21. @engineer
    @Alex
    @RickS
    I think there is a workaround. Instead of sending kill to getpid(), send it to gettid(). See here:
    http://www.alexonlinux.com/how-to-obtain-unique-thread-identifier-on-linux

    In Linux, when you send a signal to a process, it will be handled by an arbitrary thread. So when you kill getpid(), any thread in the process may handle it. So, instead of sending kill to getpid(), send it to the sick thread and you will get the right backtrace.
    I’ll try to confirm that it works some time later.

  22. Libin says:

    nice one.. was searching for something like this..

  23. thomsy says:

    Why can’t we use SA_RESETHAND?

  24. Venu says:

    Why are we re-raising the signal after calling the default handler? The default handler is supposed to generate the core and terminate the application, right?

    Here:
    http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_21.html#SEC353

    There, the default handler is not invoked, hence the signal is re-raised. But here I couldn’t work out the reason.
