#804 closed defect (fixed)
Tracing a task causes it to crash
Reported by: | Jiri Svoboda | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | 0.11.1 |
Component: | helenos/app/trace | Version: | mainline |
Keywords: | udebug | Cc: | |
Blocker for: | Depends on: | ||
See also: |
Description
Running /app/trace +s <command> causes that command to crash (usually with something like a null pointer dereference). After that, the tracer waits indefinitely (until you press Ctrl-Q).
It looks like this is a regression introduced between release 0.4.1 and 0.4.2.
Change History (9)
comment:1 by , 5 years ago
comment:2 by , 5 years ago
An example with /app/tester (disabled shared libraries):
Task /app/tester (74) killed due to an exception at program counter 0x00000000004073a3.
cs =0x0000000000000023 rip=0x00000000004073a3 rfl=0x0000000000210246 err=0x0000000000000004
ss =0x000000000000001b
rax=0x0000000000000000 rbx=0x0000000070018620 rcx=0x00000000004895e0 rdx=0x0000000000000000
rsi=0x000000000042085f rdi=0x0000000000000000 rbp=0x0000000070141d80 rsp=0x0000000070141d50
r8 =0x0000000000000000 r9 =0x0000000000000000 r10=0x0000000000000000 r11=0x0000000000200216
r12=0x0000000000000000 r13=0x00000000700405a0 r14=0x0000000070141d98 r15=0x0000000000000001
0x0000000070141d80: 0x00000000004073a3()
0x0000000070141dc0: 0x0000000000417ab8()
0x0000000070141df0: 0x00000000004062e3()
0x0000000070141e10: 0x0000000000406163()
0x0000000070141e20: 0x00000000004060ca()
Kill message: Page fault: 0x0000000000000000.
[/srv/taskmon(16)] taskmon: Task 74 fault in thread 0xffffffff85dab3d0.
The stack trace translates as:
0x0000000070141d80: 0x00000000004073a3() str_size+3
0x0000000070141dc0: 0x0000000000417ab8() vfs_cwd_set
0x0000000070141df0: 0x00000000004062e3() __libc_main
...
I got the same stack trace with another binary. Here's the disassembly of str_size:
00000000004073a0 <str_size>:
4073a0: 55 push %rbp
4073a1: 31 c0 xor %eax,%eax
4073a3: 80 3f 00 cmpb $0x0,(%rdi)
Here %rdi is zero, hence the fault.
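To make the fault concrete, here is a minimal sketch of a str_size-like routine, assuming it simply scans for the terminating NUL byte as the disassembly suggests. The names sketch_str_size and sketch_cwd_set are hypothetical stand-ins, not the HelenOS functions, and the NULL guard only illustrates where the bad pointer enters; it is not the fix applied in the changeset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for str_size: counts bytes up to the terminating NUL.
 * It reads str[0] right away, so a NULL argument faults on the first
 * comparison, matching the cmpb $0x0,(%rdi) instruction above. */
static size_t sketch_str_size(const char *str)
{
	size_t size = 0;
	while (str[size] != '\0')
		size++;
	return size;
}

/* Hypothetical caller that rejects a missing path instead of faulting.
 * Returns 0 on success and -1 if no path was supplied. */
static int sketch_cwd_set(const char *path, size_t *size_out)
{
	assert(size_out != NULL);
	if (path == NULL)
		return -1;
	*size_out = sketch_str_size(path);
	return 0;
}
```

With a guard like this, a task started without a working directory would get an error return rather than a page fault at address zero.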
comment:3 by , 5 years ago
I suspected the problem could be arch-specific, but I have now reproduced it on ia32, arm32 and ppc32 in addition to amd64.
comment:4 by , 5 years ago
The root cause: /app/trace does not use task_spawnxxx to launch the command it is passed. Instead, it has a function preload_task that mimics task_spawnvf, but the two got out of sync over time.
The only difference should be that it does not actually start the program at the end, giving a chance to connect the debugger to it.
Ideally we'd find a way to deduplicate the code (or at least move it close together) to prevent this from happening again in the future.
It is also not good that this went unnoticed for 9 years. We should add a test for this.
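One possible shape for the deduplication, as a rough sketch only: a single shared setup routine takes a flag saying whether to start the program, and both the spawn path and the preload path are thin wrappers around it. All names here (sketch_task_t, sketch_task_setup, sketch_task_spawn, sketch_task_preload) and the individual steps are illustrative placeholders, not the actual loader API or the change made in the fixing changeset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of which setup steps have been applied to a new task. */
typedef struct {
	const char *path;
	bool cwd_set;
	bool args_set;
	bool loaded;
	bool started;
} sketch_task_t;

/* Single shared setup path (placeholder steps, not the real loader calls).
 * Returns 0 on success and -1 if no program path was supplied. */
static int sketch_task_setup(sketch_task_t *task, const char *path, bool start)
{
	assert(task != NULL);
	if (path == NULL)
		return -1;

	task->path = path;
	task->cwd_set = true;   /* pass the working directory */
	task->args_set = true;  /* pass the arguments */
	task->loaded = true;    /* load the program image */
	task->started = start;  /* the only step a preload may skip */
	return 0;
}

/* Normal spawn: full setup, then start the program. */
static int sketch_task_spawn(sketch_task_t *task, const char *path)
{
	return sketch_task_setup(task, path, true);
}

/* Preload for tracing: identical setup, but leave the program stopped
 * so that a debugger can connect first. */
static int sketch_task_preload(sketch_task_t *task, const char *path)
{
	return sketch_task_setup(task, path, false);
}
```

Because both entry points go through the same routine, a step added for the normal spawn is picked up by the preload automatically. A regression test could then check that the two resulting task states differ only in whether the program was started.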
comment:5 by , 5 years ago
Component: | helenos/kernel/generic → helenos/app/trace |
---|
comment:6 by , 5 years ago
Keywords: | udebug added |
---|
comment:7 by , 5 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Fixed in changeset 2443ad8f535160d516359619d12e8e3971670065.
comment:8 by , 5 years ago
Milestone: | → 0.9.2 |
---|
If I kill taskmon beforehand, the task still faults, but trace will at least exit afterwards.