How debugger works


Introduction

In this article, I’d like to tell you how a real debugger works: what happens under the hood and why it happens. We’ll even write our own small debugger and see it in action.

I will talk about Linux, although the same principles apply to other operating systems. Also, we’ll talk about the x86 architecture, because it is the most common architecture today. Even if you’re working with another architecture, you will find this article useful because, again, the same principles work everywhere.

Kernel support

Actual debugging requires operating system kernel support, and here’s why. We’re living in a world where one process reading memory that belongs to another process is a serious security vulnerability. Yet when debugging a program, we would like the debugger process to access memory that is part of the debugged process’s (the debuggee’s) memory space. It is a bit of a problem, isn’t it? We could, of course, try to somehow use the same memory space for both debugger and debuggee, but then what if the debuggee itself creates processes? This really complicates things.

Debugger support has to be part of the operating system kernel. The kernel is able to read and write memory that belongs to each and every process in the system. Furthermore, as long as a process is not running, the kernel can see the values of its registers, and the debugger has to be able to know the values of the debuggee’s registers. Otherwise it won’t be able to tell you where the debuggee has stopped (when you press CTRL-C in gdb, for instance).

While discussing where debugger support starts, we already mentioned several of the features that an operating system needs in order to support debugging. We don’t want just any process to be able to debug other processes. Someone has to monitor debuggers and debuggees. Hence the debugger has to tell the kernel that it is going to debug a certain process, and the kernel has to either permit or deny this request. Therefore, we need a way to tell the kernel that a certain process is a debugger and that it is about to debug another process. We also need a way to query and set values in the debuggee’s memory space. And we need a way to query and set the values of the debuggee’s registers when it stops.

And the operating system lets us do all this. Each operating system does it in its own manner, of course. Linux provides a single system call named ptrace() (declared in sys/ptrace.h), which allows us to do all these operations and much more.

ptrace()

ptrace() accepts four arguments. The first is one of the values from enum __ptrace_request, which is defined in sys/ptrace.h. This argument specifies which operation we would like to perform, whether it is reading the debuggee’s registers or altering values in its memory. The second argument specifies the pid of the debuggee process. It’s not very obvious, but a single process can debug several other processes, so we have to tell exactly which process we’re referring to. The last two arguments are an address and a data value; their meaning depends on the operation, and some operations ignore them.
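
To make the four arguments concrete, here is a minimal sketch of a single call wrapped in a helper. The name peek_word() is mine and does not come from the listings; it reads one word from a debuggee with PTRACE_PEEKTEXT. Because this operation returns the word itself, -1 can be a perfectly valid result, so the sketch clears errno before the call and checks it afterwards.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Hypothetical helper: read one word from the memory of a traced process.
 * Returns 0 and stores the word in *out, or -1 with errno set. */
int peek_word(pid_t pid, unsigned long addr, long *out)
{
    long word;

    errno = 0;
    /* Arguments: operation, pid of the debuggee, address inside the
     * debuggee, and a data value that this operation does not use. */
    word = ptrace(PTRACE_PEEKTEXT, pid, (void *)addr, NULL);
    if (word == -1 && errno != 0)
        return -1;      /* real failure, e.g. ESRCH if pid is not our debuggee */
    *out = word;
    return 0;
}
```

Calling peek_word() on a process that we are not debugging fails, which is exactly the kernel denying a request it has not permitted.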

Starting to debug

One of the first things a debugger does to start debugging a certain process is either attaching to it or running it. There is a ptrace() operation for each of these cases.

The first, called PTRACE_TRACEME, tells the kernel that the calling process wants its parent to debug it. That is, me calling ptrace( PTRACE_TRACEME ) means I want my dad to debug me. This comes in handy when you want the debugger process to spawn the debuggee. In this case you call fork() to create a new process, then call ptrace( PTRACE_TRACEME ) in the child, and then call exec() or execve().

The second operation is called PTRACE_ATTACH. It tells the kernel that the calling process should become the debugging parent of the process whose pid is passed to the call. Debugging parent means that the caller acts as the debugger and, as far as wait() and signals are concerned, as the parent of that process.
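
The listings in this article only use PTRACE_TRACEME, so here is a rough sketch of the attach case. The helper names attach_and_wait(), detach() and attach_demo() are made up for illustration. The demo attaches to its own child only because that is the easiest process to get permission for; a real debugger would attach to whatever pid the user gives it, and the kernel may refuse.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Attach to a running process and wait until the kernel reports it stopped.
 * Returns 0 on success, -1 on failure. */
int attach_and_wait(pid_t pid)
{
    int status;

    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -1;
    /* The target gets SIGSTOP; wait() tells us when it has actually stopped. */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSTOPPED(status) ? 0 : -1;
}

/* Stop being the debugger of pid and let it run freely again. */
int detach(pid_t pid)
{
    return ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1 ? -1 : 0;
}

/* Usage sketch: fork an idle child, attach to it, detach, clean up. */
int attach_demo(void)
{
    int rc;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0) {
        sleep(30);                  /* idle long enough to be attached to */
        _exit(0);
    }
    rc = attach_and_wait(child);
    if (rc == 0)
        rc = detach(child);
    kill(child, SIGKILL);           /* the demo no longer needs the child */
    waitpid(child, NULL, 0);
    return rc;
}
```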

Debugger-debuggee synchronization

Alright. Now we have told the operating system that we are going to debug a certain process, and the operating system has made it our child process. Good. This is a great time to have the debuggee stopped while we make preparations before we actually start to debug. We may want to, for instance, analyze the executable that we run and place breakpoints before it starts running. So, how do we stop the debuggee and let the debugger do its thing?

The operating system does that for us using signals. Actually, the operating system notifies us, the debugger, about all kinds of events that occur in the debuggee, and it does all that with signals. This includes the “debuggee is ready to shoot” signal. In particular, if we attach to an existing process, it receives SIGSTOP and we receive SIGCHLD once it actually stops. If we spawn a new process and it has called ptrace( PTRACE_TRACEME ), it will receive SIGTRAP once it calls exec() or execve() successfully. We will be notified about this with SIGCHLD, of course.
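
On the debugger side these events arrive as a status value from wait(). The sketch below shows one way to sort such a value into the cases described above. The enum and the function name classify_status() are invented for this example; the W* macros are the standard ones from sys/wait.h.

```c
#include <sys/wait.h>

/* Hypothetical classification of what wait() reported about a child. */
enum child_event { CHILD_STOPPED, CHILD_KILLED, CHILD_EXITED, CHILD_OTHER };

/* Sort a wait() status into one of the events above. *detail receives the
 * signal number (stopped or killed) or the exit code (exited). */
enum child_event classify_status(int status, int *detail)
{
    if (WIFSTOPPED(status)) {           /* debuggee stopped, e.g. by SIGTRAP */
        *detail = WSTOPSIG(status);
        return CHILD_STOPPED;
    }
    if (WIFSIGNALED(status)) {          /* debuggee was killed by a signal */
        *detail = WTERMSIG(status);
        return CHILD_KILLED;
    }
    if (WIFEXITED(status)) {            /* debuggee called exit() */
        *detail = WEXITSTATUS(status);
        return CHILD_EXITED;
    }
    *detail = 0;
    return CHILD_OTHER;
}
```

A debugger would call something like this after every wait() and decide what to do next: a stop caused by SIGTRAP is its cue to inspect the debuggee, while an exit means the session is over.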

A new debugger was born

Now let’s see code that actually demonstrates that. The excerpts below come from the complete listing, listing1.c.

The debuggee does the following…

.
.
.
    if (ptrace( PTRACE_TRACEME, 0, NULL, NULL ))
    {
        perror( "ptrace" );
        return;
    } 

    execve( "/bin/ls", argv, envp );
.
.
.

Note the ptrace( PTRACE_TRACEME ) followed by execve(). This is what real debuggers do to spawn the process that is going to be debugged. As you know, execve() replaces the executable image and memory of the current process with those of the program being execve()’d. Once the kernel finishes this operation, it sends SIGTRAP to the calling process and SIGCHLD to the debugger. The debugger receives the notification both via the signal and via wait(), which returns. Here is the debugger’s code.

.
.
.
    do {
        child = wait( &status );
        printf( "Debugger exited wait()\n" );
        if (WIFSTOPPED( status ))
        {
            printf( "Child has stopped due to signal %d\n",
                WSTOPSIG( status ) );
        }
        if (WIFSIGNALED( status ))
        {
            printf( "Child %ld received signal %d\n",
                    (long)child,
                    WTERMSIG(status) );
        }
    } while (!WIFEXITED( status ));
.
.
.

Compiling and running listing1.c produces the following output:

In debuggee process 14095
In debugger process 14094
Process 14094 received signal 17
Debugger exited wait()
Child has stopped due to signal 5

Here we can clearly see that the debugger indeed receives a signal and gets notified via wait(). Signal 17 is SIGCHLD, delivered to the debugger, and signal 5 is SIGTRAP, which stopped the debuggee. If we want to place a breakpoint before we start to debug the process, this is our chance. Let’s talk about how we can do something like that.
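
The two excerpts above can be folded into one small function. What follows is a sketch and not the original listing1.c: the name run_traced() and its return convention are my own, and unlike the excerpt it also resumes the debuggee with PTRACE_CONT so that the program runs to completion.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program at `path` as a debuggee. Stores the signal of the first
 * stop in *first_stop and returns the program's exit code, or -1 on failure. */
int run_traced(const char *path, int *first_stop)
{
    int status;
    int sig = 0;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0) {
        /* Debuggee: ask the parent to debug us, then become `path`. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        execl(path, path, (char *)NULL);
        _exit(127);                     /* reached only if execl() failed */
    }

    /* Debugger: the first stop is the SIGTRAP that follows the exec. */
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;
    *first_stop = WSTOPSIG(status);

    /* This is the point where breakpoints could be placed. */
    for (;;) {
        if (ptrace(PTRACE_CONT, child, NULL, (void *)(long)sig) == -1)
            return -1;
        if (waitpid(child, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status))
            return WEXITSTATUS(status);
        if (WIFSIGNALED(status))
            return -1;
        /* Stopped again: pass the signal on unless it is a debug SIGTRAP. */
        sig = (WSTOPSIG(status) == SIGTRAP) ? 0 : WSTOPSIG(status);
    }
}
```

For example, run_traced("/bin/ls", &sig) should list the current directory and leave SIGTRAP in sig.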

The magic behind INT 3

It is time to dig a bit into a subject that is not adored by most programmers, and that is assembly language. I am afraid we don’t have much choice, because breakpoints work at the assembly level.

We have to understand that each compiled program is actually a set of instructions that tell the CPU what to do. Some of our C expressions are translated into a single instruction, while others may be translated into hundreds or even thousands of instructions. Instructions vary in size: on x86_64 an instruction is anywhere from 1 to 15 bytes long.

Debuggers mostly operate at the CPU instruction level. The fact that gdb understands C/C++ code and allows you to place a breakpoint at a certain C/C++ line is only an enhancement over gdb’s basic ability to place a breakpoint at a certain instruction.

There are several ways to place breakpoints. The most widely used is the INT 3 instruction. It is an instruction with a single-byte operation code that, once reached by the CPU, tells it to call a special breakpoint interrupt handler, installed by the operating system during its initialization. Since the INT 3 operation code is only one byte long, it fits over the first byte of any instruction, however short, without spilling into the next one. Once the operating system’s interrupt handler is called, it figures out which process reached a breakpoint and notifies it and its debugging process via signals.
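
You can watch the handler do its job even without a debugger. In the sketch below (the function name is mine) a child process executes INT 3 on its own. Nobody is debugging it, so the SIGTRAP sent by the kernel simply kills it, and the parent reads the signal number from wait(). On non-x86 builds the sketch falls back to raise(), which only imitates the effect.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that hits a breakpoint instruction with no debugger around.
 * Returns the signal that terminated the child, or -1 on failure. */
int signal_from_breakpoint(void)
{
    int status;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0) {
        signal(SIGTRAP, SIG_DFL);       /* make sure the default action applies */
#if defined(__x86_64__) || defined(__i386__)
        __asm__ volatile("int3");       /* assembles to the single byte 0xcc */
#else
        raise(SIGTRAP);                 /* stand-in for other architectures */
#endif
        _exit(0);                       /* not reached: SIGTRAP terminates us */
    }
    if (waitpid(child, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```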

Breakpoints hands on

Let’s return to our debuggee/debugger friends. As we mentioned, the debugger has a chance to place a breakpoint before letting the debuggee process run. Let’s see how this can be done.

Breakpoints are placed with the INT 3 instruction. Before writing the actual 0xcc (the INT 3 operation code), we should figure out where to place it. For the purposes of this article we will do it manually. Real debuggers, in contrast, include complex logic that calculates where and when to place breakpoints. gdb places several breakpoints by itself, without you even knowing about it. And obviously it has functionality that places breakpoints when you ask it to.

In our previous example we had our debuggee process executing ls. That is not suitable for our next demonstration. We need a sample program that lets us easily demonstrate breakpoints in action. Here it is.

#include <stdio.h>

int main()
{
        printf( "~~~~~~~~~~~~> Before breakpoint\n" );
        // The breakpoint
        printf( "~~~~~~~~~~~~> After breakpoint\n" );

        return 0;
}

And here is the disassembly of the main() routine.

0000000000400508 <main>:
  400508:       55                      push   %rbp
  400509:       48 89 e5                mov    %rsp,%rbp
  40050c:       bf 18 06 40 00          mov    $0x400618,%edi
  400511:       e8 12 ff ff ff          callq  400428 <puts@plt>
  400516:       bf 2a 06 40 00          mov    $0x40062a,%edi
  40051b:       e8 08 ff ff ff          callq  400428 <puts@plt>
  400520:       b8 00 00 00 00          mov    $0x0,%eax
  400525:       c9                      leaveq
  400526:       c3                      retq

We can see that if we place a breakpoint at address 0x400516, we will see one printout before reaching the breakpoint and another right after it. For the sake of our demonstration, we will place a breakpoint at this address. Once the debuggee reaches the breakpoint, the debugger will sleep for a while and then let the debuggee continue running. We should see the debuggee producing the first printout, then pausing for a few seconds, and then producing the second printout.

We’ll achieve our goal in several steps.

  1. First of all, we should fork() off the debuggee. We already did something similar.
  2. The next step is to intercept the execve() call in the debuggee. Been there, done that.
  3. Here’s something new. We should change the byte at address 0x400516 from 0xbf to 0xcc, saving the original value (0xbf). This is how we place the breakpoint.
  4. Next, we’re going to wait() for the process. Once it reaches the breakpoint, we’ll be notified.
  5. Once the debuggee reaches the breakpoint, we want to restore the code we broke with our 0xcc to its original state.
  6. In addition, we want to fix the value of the RIP register. This register tells the CPU the memory location of the next instruction to execute. Its value will be 0x400517, one byte after the 0xcc that we placed. We want to set RIP back to 0x400516 because we don’t want the CPU to skip the MOV instruction that we broke with our 0xcc.
  7. Finally, we want to wait five seconds for the sake of demonstration and then let the debuggee continue running.

First things first. Let’s see how we do step 3.

.
.
.
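        addr = 0x400516;

        /* Read the word at addr and keep a copy for restoring it later. */
        data = ptrace( PTRACE_PEEKTEXT, child, (void *)addr, NULL );
        orig_data = data;
        /* Overwrite the lowest byte with 0xcc (INT 3) and write the word back. */
        data = (data & ~0xff) | 0xcc;
        ptrace( PTRACE_POKETEXT, child, (void *)addr, data );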
.
.
.

Again, we can see how ptrace() does the job for us. First we peek 8 (sizeof( long )) bytes from address 0x400516. On some architectures this could cause lots of headaches because of unaligned memory access. Luckily, we’re on x86_64, where unaligned memory accesses are permitted. Next we set the lowest byte to 0xcc, the INT 3 instruction. Because x86_64 is little-endian, the lowest byte of the value is the byte at address 0x400516 itself. Finally, we write the 8 bytes back to their place.
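
The same peek, patch and poke sequence can be packed into a pair of helpers, together with the reverse operation that step 5 needs. This is only a sketch: the four function names are invented here, and unlike the excerpt above it checks for errors, using errno because PTRACE_PEEKTEXT may legitimately return -1 as data.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

#define INT3_OPCODE 0xccUL

/* Replace the lowest byte of a word with the INT 3 operation code. */
unsigned long with_int3(unsigned long word)
{
    return (word & ~0xffUL) | INT3_OPCODE;
}

/* Put the lowest byte of the saved original word back into `current`. */
unsigned long with_original_byte(unsigned long current, unsigned long original)
{
    return (current & ~0xffUL) | (original & 0xffUL);
}

/* Place a breakpoint at addr in a stopped debuggee; the word that was
 * there before is stored in *saved. Returns 0 on success, -1 on failure. */
int set_breakpoint(pid_t pid, unsigned long addr, unsigned long *saved)
{
    long word;

    errno = 0;
    word = ptrace(PTRACE_PEEKTEXT, pid, (void *)addr, NULL);
    if (word == -1 && errno != 0)
        return -1;
    *saved = (unsigned long)word;
    if (ptrace(PTRACE_POKETEXT, pid, (void *)addr,
               (void *)with_int3(*saved)) == -1)
        return -1;
    return 0;
}

/* Remove the breakpoint again (step 5 in the list above). */
int clear_breakpoint(pid_t pid, unsigned long addr, unsigned long saved)
{
    long word;

    errno = 0;
    word = ptrace(PTRACE_PEEKTEXT, pid, (void *)addr, NULL);
    if (word == -1 && errno != 0)
        return -1;
    word = (long)with_original_byte((unsigned long)word, saved);
    return ptrace(PTRACE_POKETEXT, pid, (void *)addr, (void *)word) == -1 ? -1 : 0;
}
```

With the bytes from the disassembly above, the low half of the word at 0x400516 is 0x40062abf, and with_int3() turns it into 0x40062acc.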

We’ve seen how to wait for a certain event in the debuggee. We also know how to restore the original value at address 0x400516: one more PTRACE_POKETEXT, this time with orig_data. So we can skip over steps 4-5 and jump right into step 6. This is something that we haven’t done so far.

What we have to do is read the debuggee’s registers, change them and write them back. Again, ptrace() does the whole job for us.

.
.
.
        struct user_regs_struct regs;
.
.
.
        ptrace( PTRACE_GETREGS, child, NULL, &regs );
        regs.rip = addr;
        ptrace( PTRACE_SETREGS, child, NULL, &regs );
.
.
.

Things are not too well documented here. For instance, the ptrace() documentation never mentions struct user_regs_struct (it is defined in sys/user.h), yet this is what the ptrace() system call expects to receive in the kernel. Once we know what we should use as ptrace() arguments, it is easy. We use the PTRACE_GETREGS operation to obtain the values of the debuggee’s registers, we modify the RIP register and write the registers back with the PTRACE_SETREGS operation. Clear and simple.

Let’s see how things actually work. The complete listing of the debugger process is listing2.c. Compiling and running it produces the following output.

In debuggee process 29843
In debugger process 29842
Process 29842 received signal 17
~~~~~~~~~~~~> Before breakpoint
Process 29842 received signal 17
RIP before resuming child is 400517
Time before debugger falling asleep: 1206346035
Time after debugger falling asleep: 1206346040. Resuming debuggee...
~~~~~~~~~~~~> After breakpoint
Process 29842 received signal 17
Debuggee exited...
Debugger exiting...

You can see that the “Before breakpoint” printout appears 5 seconds before the “After breakpoint” printout: the two timestamps in the output differ by exactly 5. The line “RIP before resuming child is 400517” clearly indicates that the debuggee stopped at address 0x400517, as we expected.
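
The hard-coded address 0x400516 is only valid for the particular binary disassembled above, so here is a self-contained variation that you can try without looking up addresses. It is a sketch under a few assumptions and not the original listing2.c: it runs on x86_64 Linux only, it skips execve() and lets the forked child break on one of its own functions (the address is the same in both processes after fork()), and it drops the five second sleep. The names break_here() and breakpoint_demo() are made up.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* The function we break on; noinline keeps it at a real, callable address. */
__attribute__((noinline)) void break_here(void)
{
    static const char msg[] = "~~~~~~~~~~~~> After breakpoint\n";
    if (write(STDOUT_FILENO, msg, sizeof msg - 1) < 0)
        _exit(1);
}

/* Returns 0 if the debuggee stopped one byte past the breakpoint and then
 * ran to a clean exit, 1 if the architecture is unsupported, -1 on failure. */
int breakpoint_demo(void)
{
#if defined(__x86_64__)
    uintptr_t addr = (uintptr_t)&break_here;
    struct user_regs_struct regs;
    int status;
    long orig;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP);                 /* stop and let the debugger prepare */
        break_here();
        _exit(0);
    }
    /* Steps 1-2: the debuggee exists and is stopped. */
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        goto fail;
    /* Step 3: save the original word and write 0xcc over its first byte. */
    errno = 0;
    orig = ptrace(PTRACE_PEEKTEXT, child, (void *)addr, NULL);
    if (orig == -1 && errno != 0)
        goto fail;
    if (ptrace(PTRACE_POKETEXT, child, (void *)addr,
               (void *)((orig & ~0xffL) | 0xcc)) == -1)
        goto fail;
    /* Step 4: resume and wait until the breakpoint is hit. */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1 ||
        waitpid(child, &status, 0) == -1 ||
        !WIFSTOPPED(status) || WSTOPSIG(status) != SIGTRAP)
        goto fail;
    /* Steps 5-6: restore the code and move RIP back onto the instruction. */
    if (ptrace(PTRACE_GETREGS, child, NULL, &regs) == -1 ||
        regs.rip != addr + 1 ||
        ptrace(PTRACE_POKETEXT, child, (void *)addr, (void *)orig) == -1)
        goto fail;
    regs.rip = addr;
    if (ptrace(PTRACE_SETREGS, child, NULL, &regs) == -1)
        goto fail;
    /* Step 7, without the sleep: let the debuggee finish. */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1)
        goto fail;
    if (waitpid(child, &status, 0) == -1)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
fail:
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return -1;
#else
    return 1;       /* struct user_regs_struct has no rip field here */
#endif
}
```

Breaking on a function of the same program is a shortcut chosen to keep the example short. A debugger that runs an arbitrary executable has to find its addresses by analyzing the executable file, as discussed above.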

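Single steps

After seeing how easy it is to place a breakpoint, you can guess that stepping over one line of C/C++ code is, in the simplest case, a matter of placing a breakpoint on the next line of code. When the current line contains a branch, one breakpoint is not enough: the debugger has to understand the branch instructions and cover every place where execution may continue. This is the basic idea behind stepping over an expression in a debugger such as gdb.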
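
Breakpoints are not the only way to step. ptrace() also has a PTRACE_SINGLESTEP operation that resumes the debuggee for exactly one machine instruction and then stops it with SIGTRAP. The sketch below uses it to execute a given number of instructions. The names step_instructions() and single_step_demo() are made up, and stepping over a whole C/C++ line this way would still require knowing where the line ends.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Execute up to n machine instructions in a stopped debuggee.
 * Returns the number of steps completed, or -1 on error. */
int step_instructions(pid_t pid, int n)
{
    int status;
    int done;

    for (done = 0; done < n; done++) {
        if (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) == -1)
            return -1;
        if (waitpid(pid, &status, 0) == -1)
            return -1;
        if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGTRAP)
            break;              /* the debuggee exited or got another signal */
    }
    return done;
}

/* Usage sketch: fork a debuggee that spins in a loop, step it ten times. */
int single_step_demo(void)
{
    int status, steps;
    volatile unsigned long spin;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP);                     /* wait for the debugger */
        for (spin = 0; spin < 100000000UL; spin++)
            ;                               /* something to step through */
        _exit(0);
    }
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        steps = -1;
    else
        steps = step_instructions(child, 10);
    kill(child, SIGKILL);                   /* the demo is over */
    waitpid(child, NULL, 0);
    return steps;
}
```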

Conclusion

Debuggers and how they work are often associated with some kind of magic.

Debuggers, and gdb as an example, are exceptionally complicated pieces of software. Placing breakpoints and single stepping are only a small fraction of what gdb is able to do. gdb in particular works on dozens of hardware architectures. It supports remote debugging. It is perhaps the most advanced and complicated executable analyzer. It knows when a program loads a dynamic library and analyzes the code of that library automatically. It supports a bunch of programming languages, from C/C++ to Ada. And these are just a few of its features.

On the other hand, we’ve seen how easy it is to start debugging a certain process, place a breakpoint, etc. The basic functionality that allows debugging is in the operating system and in the CPU, waiting for us to use it.


83 Comments

  1. Nishant says:

    Hi,

    This is very good document for understanding GDB…

    If you have more details about this please sent it to me

    Thanks

  2. Alexander Sandler says:

    I don’t think I am going to write anything on the subject any time soon, but if you have questions, don’t hesitate to ask – either here or via alexander.sandler@gmail.com

  3. Prashant says:

    Hi,
    This is very good document.
    Do you have more details on “How CPU (hardware) works while debugging?”

  4. Dipak Dudhabhate says:

    HI,

    The program listing.c is not working correctly on Fedora 8.
    It has gcc 4.3 and kernel 2.6.24.7-92.fc8. What is the issue with Fedora 8?

  5. Alexander Sandler says:

    Please give me few days to check this out.

  6. Eugene Hermann says:

    Thank You for a teaching. It’s a very good explanation.
    The example requires some modification for my target platform, but it was very useful for understanding.
    First I’d replaced
    #include
    with direct include
    #include

    and then I found a real entry point to sleeper (it was 0x80483a4 instead of 0x400516).
    May be Dipak also need to explore his sleeper with objdump ?

    Can You provide a tutorial about how to implement watchpoint on memory writing?
    My finish goal is not to write the own debugger, but to make a some self-debugging code with memory usage monitor. I think it’ll be useful in many ways for developing.

  7. Alexander Sandler says:

    Eugene, thanks for commenting. I think you’re right. Dipak’s problem most likely is that hard-coded address. Anyway, I explained him everything via email.
    As for your request – actually it might be a good idea for an article. I probably will write something about the subject, but it will take time. I’ll let you know once I’ll have something ready :-)

  8. rakesh says:

    it is good .I wanted to learn more how debuggers works through JTAG interface .

  9. kamal says:

    This is really a very good article.This is the first document i saw on debugger internals.

    can you please give more detailed info on same.?

  10. @kamal
    Thanks for a warm comment and for visiting.
    I apologize, but these days I barely have time to breathe :-) But I promise to keep writing on the subject :-)

  11. Sanjay says:

    Hi Alexander: It is indeed a very good article. I could get the idea on the topic very well. I am waiting for more info/article on the same from you. :) ;)
    thanks
    -Sanjay

  12. Daniil says:

    Hi.

    If we set break point in the shared library, why do other processes that use that library doesn’t stop?

    Who debugger or kernel handle that?
    And What mechanism in that case?

    Thanks.

  13. @Daniil
    Kernel handles that. I couldn’t find where exactly in the kernel code it happens – should be somewhere in handle_mm_fault() in mm/memory.c. Anyway here how it works in theory.
    Shared library code is write protected. When someone tries to write to it, the kernel receives a fault (handled by handle_mm_fault()). Once the kernel receives such a fault, it allocates a new memory page, copies the data from the old page to the new one and replaces the old page with the new page for that process only. Then the kernel modifies the newly allocated page as the debugger has requested.
    Hope this answers your question :-)

  14. kamal says:

    Alexander

    Can you plz explain the diff between open/read and fopen/fread other the that “f” func are not sys calls and give formatted output.

    And Do you have any article on trees ?

    Thanks in advance.

  15. kamal says:

    Originally Posted By kamalAlexander

    Can you plz explain the diff between open/read and fopen/fread other than that “f” func are not sys calls and give formatted output.
    and if “f” function internally calls open/read sys call then why we dont directly call them ,if we dont want formatted output?

    And Do you have any article on trees and brk/sbrk ?

    Thanks in advance.

  16. naveen says:

    hi alexander..
    it is a very good article . thanks for sharing it.
    i have been searching for how exactly a debugger works but now have understood it quite a bit.
    i just wanted to know one thing..
    is INT 3 a software interrupt or hardware.

  17. @naveen
    I am glad I could help. Please visit again :-)
    As for your question, INT 3 is a software interrupt: it is raised by an instruction in the program and not by an external device. It is the CPU itself that then calls the interrupt handler, not any other code in the kernel.

  18. Stepping over a line of c/c++ code is not as simple as
    placing breakpoint on address of next line of source code.

    let’s have:

    [100]
    [100] if (condition)
    [101] {
    [102] do_thing(1);
    [103] }
    [104] else
    [105] {
    [106] do_thing(2);
    [107] }

    It is not enough to place breakpoint in [102],
    we should place it in [106] as well.

    Any ideas how to handle it without disassembler?
    Regards,
    MS

  19. @Marcin Sokalski
    Unfortunately you are right. It is one of those things that debugger does for you. It is not very complicated on assembler level – debugger has to understand branch instructions and place a breakpoint on both occasions. To do same thing without disassembling, you will have to analyze C/C++ code. Analyzing dozen or so variations of JXX instruction is one thing. Analyzing nearly endless number of variations of C/C++ if statements is much more complicated.

  20. Salil says:

    A great fresher course on Debugger…I appreciate it

  21. @Salil
    I am happy to hear that you find it useful. Please visit my web-site again :-)

  22. […] on signals to receive events about programs that being debugged (read more about this in my article How Debugger Works). Signals is one of so called IPC – Inter Process Communication mechanisms. IPC used to, as the […]

  23. technochakra says:

    Debugging articles are not very common in the blog world. I often write about debugging (though I try to stay platform neutral). My latest article on software breakpoints digs deep into int3 and 0xcc. You might want to take a look at //www.technochakra.com/software-breakpoints/. Hope you find it interesting.

  24. @technochakra
    It is very interesting indeed. I bookmarked your blog and will put it on my blogroll. Thanks for sharing it.
    I wish you were writing more frequently though.

  25. Andre Goddard Rosa says:

    This is another great article, as all others. I had only theoretical background on this subject(hardware and software breakpoints, specially sw one). It was nice to read all this from your website, you always reply people’s questions before they ask, by questioning yourself first during your own understanding and then putting the answer of all curiosities on your articles.

    Just a nitpick… I noticed that on listing1.c it would be safer to install the signal handler before fork()’ing to the child, because the parent could receive the signal before installing it. Anyway, for the purpose of your example, it does not matter. :)

    Thank you!

  26. Andre Goddard Rosa says:

    In a tangent, here is a related article talking about utrace, which will be probably ptrace future in the kernel:

    http://lwn.net/Articles/224772

    It’s being implemented by Roland McGrath(from glibc) on behalf of Redhat. Probably ptrace() will be a wrapper around utrace functionality in the future. Hope you like the article!

  27. @Andre Goddard Rosa
    Thanks for the compliments and for the link. I’ll take a look at it.

  28. vineet says:

    Dear Alex, do u have something on
    1. watchpoints or conditional breakpoints?
    2. until (as given by gdb)
    m creating own ptrace based debugger.
    pls rply immed

  29. Karan Verma says:

    Hello Alex

    Thanks for the tutorial, it has been a great starting point for me. I too am planning to write a debugger for myself.

    Can you give more insights into how actually we can print values of arrays and structures in the debuggie and change their values.

    Also I want to delve deeper into working of gdb and implement some of the complex logic that gdb uses. Can you please refer me to a good learning resource for that.

    Thanks

    Karan

  30. Originally Posted By vineet
    Dear Alex, do u have something on
    1. watchpoints or conditional breakpoints?
    2. until (as given by gdb)
    m creating own ptrace based debugger.
    pls rply immed

    1. I have nothing on watchpoints or conditional breakpoints.
    2. I have no idea what you mean.
    Goodluck.

  31. Originally Posted By Karan Verma
    Hello Alex

    Thanks for the tutorial, it has been a great starting point for me. I too am planning to write a debugger for myself.

    Can you give more insights into how actually we can print values of arrays and structures in the debuggie and change their values.

    Also I want to delve deeper into working of gdb and implement some of the complex logic that gdb uses. Can you please refer me to a good learning resource for that.

    Thanks

    Karan

    Karan,

    There’s a ptrace() operation that does this (see man page). What seems to be more difficult is to know the size and contents of the data structure that you want to read. To do that you will have to teach your debugger to analyze executable files and in particular the debug information in executable files.

    Alex.

  32. […] Interesting post on how debuggers work. […]

  33. Kesava Chandra says:

    Excellent article. I will recommend this page to my friends

  34. @Kesava Chandra
    Thanks :-) But don’t forget to come along yourself :D

  35. Sreehari says:

    Excellent articles, Please keep writing.

  36. Jiang says:

    Nice. Thanks for writing this great article.

  37. raptomania says:

    Can i trace child threads using ptrace ?

  38. tiryaki says:

    Very nice article, thanks Alex.

  39. Sunil says:

    Its an excellent writeup .. Thanks Alex

  40. Chandru says:

    Very nice document to read. Thank you very much for a detailed description.

    Thanks
    Chandru

  41. Andrew says:

    Excellent article!
    I have a problem about debugger…
    Nowdays,I work on Qtwebkit of WINCE. And I have a very problem.
    If I run my application of release version with Qtwebkit through VS2005, everything works well.But when I run it directly through double click,it works more slowly.
    That’s unbelievable but it is true! Maybe the configuration for my compiler is unright.
    Can I run the debugger directly with my application without VS2005 then my application can works well?
    Any help is appreciated.Pls contact me through my Email.Thanks very much.

  42. Rick C. Hodgin says:

    Very good article. The x86 architecture provides a single step interrupt as well, which can be employed in hardware. I do not know how to activate this in Linux, but in the Task State Segment, if the trap debug bit is set, it will execute one instruction and then signal the interrupt to the kernel, which would trap back to the debugger.

    You can find this on Intel’s IA-32 manual on page 283 here:
    http://www.intel.com/Assets/ja_JP/PDF/manual/253668.pdf

    Also a bit in EFLAGS can be set to enable the single-step trap as well. See page 686 of the same manual for information on the TF (Trap Flag) bit, which signals the CPU to execute only one instruction, then return to the debugger. This allows for single-stepping even on branch instructions without tracing all paths.

  43. Yusuf says:

    Hi Alex,

    Really nice article, even I got inspired by you and started writing from past three days :-)

    Please have a look and give your comments, so that I can also improvise and share the knowledge in better way.

    http://yusufonlinux.blogspot.com/2010/11/raw-socket-in-linux.html

    Thanks,
    Yusuf

  44. @Rick C. Hodgin
    Indeed it is there. I believe gdb uses it when you do stepi and nexti.

  45. Yogesh Chandolia says:

    hi all!!!
    i am going to build a debugger..
    bt i am nt able to put break point on the debuggie process.
    if u hv some idea how to put break point on process dn
    pls tell me or send me mail on the following id
    i m waiting for ur reply..
    thanx
    yogesh.chandolia029@gmail.com

  46. @Yogesh Chandolia
    Why don’t you start with reading this article. And then we will see.

  47. […] How debugger works * Understanding ELF using readelf and objdump * Implementing breakpoints on x86 Linux * NASM manual […]
