Using QEMU and GDB to debug kernel and uspace tasks
Some of the debugging techniques and procedures described in this article are illustrated on an example involving the ia32 architecture. Other architectures should behave in a similar or analogous way, but dealing with potential differences is left to the reader as an exercise.
The combination of QEMU and GDB allows HelenOS to be comfortably debugged either on the assembly or the source code level. For detailed information on low-level debugging, see for example this course on crash dump analysis.
Preparing the build
The default HelenOS build should produce unstripped binaries. If necessary, this can be enforced by making sure the Strip binaries build configuration option is not checked. Unstripped binaries come with symbols, but do not contain any richer debugging information. To get the most out of the debug build, configure HelenOS with the Line debugging information option. Optimization may also impede debugging, so consider changing the optimization level to 0 using the OPTIMIZATION variable in the respective Makefiles (mainline/kernel/Makefile or mainline/uspace/Makefile.common). When everything is set, make sure to rebuild with the new settings.
Starting QEMU
QEMU provides two command line options for debugging with GDB: -s and -S. The former instructs QEMU to listen for GDB connections on localhost:1234 (without waiting for one to be made) and the latter stops the guest CPU at startup so that debugging is possible from the very beginning. When starting emulation using the mainline/tools/ew.py script, one can add these options like this:
$ `tools/ew.py -d` -s -S
Connecting GDB to QEMU
Once QEMU is started with the -s (and optionally also the -S) option, it is possible to connect GDB to it. For our purposes, we will assume that the respective cross-GDB built by the mainline/tools/toolchain.sh script is ready to be used and is installed as /usr/local/cross/ia32/bin/i686-pc-linux-gnu-gdb. Needless to say, the cross-GDB should always match the architecture of the HelenOS guest.
$ /usr/local/cross/ia32/bin/i686-pc-linux-gnu-gdb
GNU gdb (GDB) 7.11
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-pc-linux-gnu --target=i686-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb)
Connect to QEMU using the following command at the (gdb) prompt:
(gdb) target remote :1234
Remote debugging using :1234
0x0000fff0 in ?? ()
Note that if QEMU was not started with the -S option, you will have to manually break into the debugger by pressing Ctrl-C.
Loading symbols
Depending on the subject of our debugging session, we will need some symbols. The easiest case is kernel debugging. In that case, we simply load the kernel symbols from kernel.raw:
(gdb) symbol-file kernel/kernel.raw
Reading symbols from kernel/kernel.raw...done.
Debugging userspace tasks is a little more complicated because unlike the kernel, there will be multiple user processes running at the same time, so setting a mere breakpoint on a user address will probably not do. We will need to get some assistance from the kernel.
Setting and hitting breakpoints
To set a breakpoint, for example on the kernel function that handles invalid memory accesses from userspace, we type the following command:
(gdb) break fault_from_uspace_core
Breakpoint 1 at 0x801237ff: file generic/src/interrupt/interrupt.c, line 169.
Sadly the set breakpoint is not always hit. At this point, the cause of this behavior is unknown.
To continue the emulation, tell GDB to continue:
(gdb) c
Continuing.
When our breakpoint is later hit, for example as a result of executing inside HelenOS:
# tester fault1
we will break into the debugger prompt again:
Breakpoint 1, fault_from_uspace_core (istate=0x86d47fb4, fmt=fmt@entry=0x8016f3ce "Page fault: %p.", args=args@entry=0x86d47ec4 "\004") at generic/src/interrupt/interrupt.c:169
169 {
(gdb)
Note that at any time while the guest is running, you can also break into the debugger by pressing Ctrl-C.
Now that a breakpoint in the kernel was hit, we can inspect the state of the kernel a little bit. Typing bt will show us the stack trace of the current kernel thread:
(gdb) bt
#0 fault_from_uspace_core (istate=0x86d47fb4, fmt=fmt@entry=0x8016f3ce "Page fault: %p.", args=args@entry=0x86d47ec4 "\004") at generic/src/interrupt/interrupt.c:169
#1 0x80123e2a in fault_if_from_uspace (istate=istate@entry=0x86d47fb4, fmt=fmt@entry=0x8016f3ce "Page fault: %p.") at generic/src/interrupt/interrupt.c:206
#2 0x80130bc9 in as_page_fault (address=4, access=PF_ACCESS_WRITE, istate=istate@entry=0x86d47fb4) at generic/src/mm/as.c:1497
#3 0x8010f978 in page_fault (n=14, istate=0x86d47fb4) at arch/ia32/src/mm/page.c:99
#4 0x80123bda in exc_dispatch (n=14, istate=0x86d47fb4) at generic/src/interrupt/interrupt.c:131
#5 0x8010ab62 in int_14 () at arch/ia32/src/asm.S:437
#6 0x0000000e in ?? ()
To see the information about the interrupted context, in this case the user context as it existed when the page fault exception occurred, we can print the istate structure:
(gdb) set radix 16
Input and output radices now set to decimal 16, hex 10, octal 20.
(gdb) p *istate
$2 = {edx = 0x80, ecx = 0x7, ebx = 0x7013cee0, esi = 0x70037f98, edi = 0x7001b784, ebp = 0x7013ce78, eax = 0x4, ebp_frame = 0x0, eip_frame = 0x3da7, gs = 0x30, fs = 0x23, es = 0x23, ds = 0x23, error_word = 0x6, eip = 0x3da7, cs = 0x1b, eflags = 0x10202, esp = 0x7013ce78, ss = 0x23}
The printed eip member is the program counter of the instruction which caused the exception. We will remember this value along with the values of esp and ebp for later.
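Two other members of the printed istate are worth a quick look: error_word and cs. The following Python sketch decodes them according to the standard ia32 conventions (the low three bits of the page fault error code, and the privilege level stored in the low two bits of a segment selector). The function names are hypothetical and are not part of HelenOS or GDB.

```python
def decode_page_fault_error(error_word):
    """Decode the low three bits of an ia32 page fault error code."""
    return {
        # Bit 0: 0 = the page was not present, 1 = protection violation.
        "page_present": bool(error_word & 0x1),
        # Bit 1: 0 = read access, 1 = write access.
        "write_access": bool(error_word & 0x2),
        # Bit 2: 0 = fault in supervisor mode, 1 = fault in user mode.
        "user_mode": bool(error_word & 0x4),
    }


def from_userspace(cs):
    """The low two bits of a code segment selector hold the privilege level."""
    return (cs & 0x3) == 3


# Values taken from the istate printed above.
print(decode_page_fault_error(0x6))
print(from_userspace(0x1b))
```

For the values above, this reports a write to a non-present page from user mode, which agrees with the access=PF_ACCESS_WRITE argument seen in the kernel stack trace.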
Useful macros
When debugging the kernel, it is sometimes useful to find out some information about the current process. For that, we will need to mimic the computation of the address of the THE structure. We will start by defining a macro:
(gdb) macro define the ((the_t *) ((uintptr_t )$esp & ~0x1fff))
From this time on, we can do things like:
(gdb) p the->task->name
$3 = "tester\000it\000\000\000\000\000\000\000\000\000\000"
Note that this will work only when $esp corresponds to the kernel stack.
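The arithmetic behind the macro can be illustrated with a short Python sketch. It assumes what the mask 0x1fff suggests, namely that kernel stacks are 8 KiB in size and aligned to their size, with the THE structure at the lowest address of the stack. The names STACK_SIZE and the_address are hypothetical, and the sample value is the istate pointer from the stack trace above, used here merely as a stand-in for an address that presumably lies on the kernel stack.

```python
STACK_SIZE = 0x2000  # assumption: 8 KiB stacks, inferred from the mask 0x1fff


def the_address(stack_pointer, stack_size=STACK_SIZE):
    """Round a kernel stack pointer down to the base of its stack."""
    return stack_pointer & ~(stack_size - 1)


# Any address within the same stack yields the same base address.
print(hex(the_address(0x86d47fb4)))
print(hex(the_address(0x86d47ec4)))
```

Both calls give the same base address because the two pointers differ only in the low 13 bits. This is also why the macro stops working once $esp is set to a userspace value.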
Switching to the user context
Let us assume that we have somehow found the values of the userspace registers EIP, ESP and EBP (for example by inspecting the istate structure as shown above) and know the name of the process (for example tester from the example above). Before we can add symbol information for this process, we need to find out the load address of its .text section (for some reason, GDB does not use the information provided in the ELF file):
$ objdump -h uspace/app/tester/tester

uspace/app/tester/tester:     file format elf32-i386

Sections:
Idx Name           Size      VMA       LMA       File off  Algn
  0 .init          0000002e  000010b4  000010b4  000000b4  2**0
                   CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .text          0002f5a0  000010e8  000010e8  000000e8  2**3
                   CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .data          00001b08  000316a0  000316a0  0002f6a0  2**5
                   CONTENTS, ALLOC, LOAD, DATA
  3 .tbss          00000048  000331a8  000331a8  000311a8  2**2
                   ALLOC, THREAD_LOCAL
  4 .bss           00000184  000331c0  000331c0  000311a8  2**5
                   ALLOC
  5 .comment       00000011  00000000  00000000  000311a8  2**0
                   CONTENTS, READONLY
  6 .debug_abbrev  0000878f  00000000  00000000  000311b9  2**0
                   CONTENTS, READONLY, DEBUGGING
  7 .debug_aranges 00000db8  00000000  00000000  00039948  2**3
                   CONTENTS, READONLY, DEBUGGING
  8 .debug_info    000293a5  00000000  00000000  0003a700  2**0
                   CONTENTS, READONLY, DEBUGGING
  9 .debug_line    0000cfc7  00000000  00000000  00063aa5  2**0
                   CONTENTS, READONLY, DEBUGGING
 10 .debug_ranges  00000358  00000000  00000000  00070a6c  2**0
                   CONTENTS, READONLY, DEBUGGING
 11 .debug_str     000078fc  00000000  00000000  00070dc4  2**0
                   CONTENTS, READONLY, DEBUGGING
So the .text section gets loaded at 0x10e8 in this case. We will use this address to load our userspace symbols:
(gdb) add-symbol-file uspace/app/tester/tester 0x000010e8
add symbol table from file "uspace/app/tester/tester" at
.text_addr = 0x10e8
(y or n) y
Reading symbols from uspace/app/tester/tester...done.
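When this has to be repeated for several binaries, looking up the .text address by eye becomes tedious. Below is a minimal Python sketch that extracts the VMA of a section from the text printed by objdump -h and builds the corresponding add-symbol-file command. The helper name section_vma is hypothetical, and the sample input is an excerpt of the listing shown above; in practice the text could come from running objdump through the subprocess module.

```python
def section_vma(objdump_output, section=".text"):
    """Return the VMA of a section from the output of 'objdump -h'."""
    for line in objdump_output.splitlines():
        fields = line.split()
        # Section rows have the form: Idx Name Size VMA LMA File-off Algn
        if len(fields) >= 7 and fields[0].isdigit() and fields[1] == section:
            return int(fields[3], 16)
    raise ValueError("section %s not found" % section)


# Excerpt of the objdump listing for uspace/app/tester/tester.
SAMPLE = """\
Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .init         0000002e  000010b4  000010b4  000000b4  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .text         0002f5a0  000010e8  000010e8  000000e8  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
"""

vma = section_vma(SAMPLE)
print("add-symbol-file uspace/app/tester/tester %#x" % vma)
```

The printed line can be pasted at the (gdb) prompt as is.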
The last step before we can print out our userspace stack trace is restoring registers to their userspace contents (as for example captured in the istate structure):
(gdb) set $eip=0x3da7
(gdb) set $esp=0x7013ce78
(gdb) set $ebp=0x7013ce78
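Copying the register values by hand is error-prone, so the step can be sketched in Python: the snippet parses the text printed by p *istate (with the radix set to 16) and generates the three set commands. The helper names parse_struct and restore_commands are hypothetical, and the sketch assumes that every member is printed in the simple "name = 0x..." form seen above.

```python
import re

# Output of 'p *istate' copied from the GDB session above.
ISTATE = (
    "$2 = {edx = 0x80, ecx = 0x7, ebx = 0x7013cee0, esi = 0x70037f98, "
    "edi = 0x7001b784, ebp = 0x7013ce78, eax = 0x4, ebp_frame = 0x0, "
    "eip_frame = 0x3da7, gs = 0x30, fs = 0x23, es = 0x23, ds = 0x23, "
    "error_word = 0x6, eip = 0x3da7, cs = 0x1b, eflags = 0x10202, "
    "esp = 0x7013ce78, ss = 0x23}"
)


def parse_struct(text):
    """Map each 'name = 0x...' pair in GDB's struct output to an integer."""
    pairs = re.findall(r"(\w+) = (0x[0-9a-fA-F]+)", text)
    return {name: int(value, 16) for name, value in pairs}


def restore_commands(regs, names=("eip", "esp", "ebp")):
    """Build the GDB commands that restore the given registers."""
    return ["set $%s=%#x" % (name, regs[name]) for name in names]


for command in restore_commands(parse_struct(ISTATE)):
    print(command)
```

The generated commands match the ones typed manually above.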
We are now ready to do some userspace debugging:
(gdb) bt
#0 0x00003da7 in test_fault1 () at fault/fault1.c:34
#1 0x000010f6 in run_test (test=0x31770 <tests+208>) at tester.c:84
#2 0x00001374 in main (argc=0x43060, argv=0x0) at tester.c:161
#3 0x000134e3 in __main (pcb_ptr=0x7001b784) at generic/libc.c:121
#4 0x000010e2 in __entry () at arch/ia32/src/entry.S:69
#5 0x7001b784 in ?? ()
(gdb) disassemble
Dump of assembler code for function test_fault1:
   0x00003d9f <+0>:  push %ebp
   0x00003da0 <+1>:  mov %esp,%ebp
   0x00003da2 <+3>:  mov $0x4,%eax
=> 0x00003da7 <+8>:  movl $0x0,(%eax)
   0x00003dad <+14>: mov $0x2d973,%eax
   0x00003db2 <+19>: pop %ebp
   0x00003db3 <+20>: ret
End of assembler dump.