Opened 15 years ago
Closed 15 years ago
#227 closed defect (fixed)
HelenOS sometimes hard hangs shortly after boot
Reported by: | Jakub Jermář | Owned by: | Jakub Jermář |
---|---|---|---|
Priority: | critical | Milestone: | 0.4.3 |
Component: | helenos/kernel/ia32 | Version: | mainline |
Keywords: | Cc: | ||
Blocker for: | Depends on: | ||
See also: |
Description
This is reproducible for instance in revision 402. The system very rarely hangs after most of or all vc console icons turn red. The system looks entirely hung, no error message is displayed even by the kernel (I modified the kconsole printing code to ignore the silent
variable). I think this happened to me several times on a UP system, 4-processor and 7-processor systems.
Attachments (2)
Change History (14)
comment:1 by , 15 years ago
Component: | unspecified → kernel/ia32 |
---|
comment:2 by , 15 years ago
comment:3 by , 15 years ago
In the following analysis, I am using:
- QEMU PC emulator version 0.12.3, Copyright © 2003-2008 Fabrice Bellard
- C native compiler for building QEMU gcc version 4.4.1 (Ubuntu 4.4.1-4ubuntu9)
- C cross compiler for building HelenOS gcc version 4.4.3 (GCC) (built by toolchain.sh)
- HelenOS, ia32 defaults, head,418
The command to run HelenOS:
- qemu -cdrom image.iso -smp 7 -d int,cpu_reset
After about ten invocations, I got the hang. Note that I got similar hangs (also caused by a triple fault), in a more reproducible way, by running top and then several instances of tasks -c until the system hung. Also note that I reproduced the same hang (likely) also on HelenOS 0.4.1, which should rule out some recent changes.
I am attaching the relevant portion of qemu.log from the point were things seems to go wrong until we hit the triple fault. I am also going to attach the respective kernel.raw from were symbols, addresses and instructions can be recovered by e.g. objdump.
by , 15 years ago
Attachment: | kernel.raw.bz2 added |
---|
Kernel image with symbols matching the attached QEMU trace.
comment:4 by , 15 years ago
From the attached qemu.log, these are the interesting exceptions restricted to the suspicious address space CR3=0x65000:
- 166734
- the processor is executing scheduler_separated_stack() (EIP=0x8011df75) and receives IRQ 0 (v=0x20), i.e. the clock interrupt. More precisely the place location in scheduler_separated_stack() corresponds to the piece of find_best_thread() where the scheduler halts the processor and waits for an interrupt to continue.
166734: v=20 e=0000 i=0 cpl=0 IP=0008:8011df75 pc=8011df75 SP=0010:8001bf8c EAX=00200046 EAX=00200046 EBX=fffff000 ECX=00000000 EDX=00000001 ESI=00000000 EDI=8005f540 EBP=8001bff4 ESP=8001bf8c EIP=8011df75 EFL=00200202 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA] CS =0008 00000000 ffffffff 00cf9800 DPL=0 CS32 [---] SS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA] FS =0023 00000000 ffffffff 00cff300 DPL=3 DS [-WA] GS =0030 00010188 ffffffff 00cff301 DPL=3 DS [-WA] LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT TR =0028 80040000 00000068 80408904 DPL=0 TSS32-avl GDT= 8008b140 00000040 IDT= 80174560 00000200 CR0=80000019 CR2=0000b000 CR3=00065000 CR4=00000610 DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 DR6=ffff0ff0 DR7=00000400 CCS=00000044 CCD=00000000 CCO=EFLAGS
- 166735
- here the processor gets a page fault exception in an attempt to execute code on EIP=0x8011e7ab (i.e. the page with this EIP appears to be currently not mapped), which is the beginning of thread_exists(). This is already very strange because the kernel uses identity mapping for the entire physical memory, including the kernel text.
check_exception old: 0xffffffff new 0xe 166735: v=0e e=0000 i=0 cpl=0 IP=0008:8011e7ab pc=8011e7ab SP=0010:8001be40 CR2=8011e7ab EAX=00000000 EBX=8012e591 ECX=00000000 EDX=800228c0 ESI=80022970 EDI=fffff000 EBP=8001be8c ESP=8001be40 EIP=8011e7ab EFL=00200002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA] CS =0008 00000000 ffffffff 00cf9800 DPL=0 CS32 [---] SS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA] FS =0023 00000000 ffffffff 00cff300 DPL=3 DS [-WA] GS =0030 00010188 ffffffff 00cff301 DPL=3 DS [-WA] LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT TR =0028 80040000 00000068 80408904 DPL=0 TSS32-avl GDT= 8008b140 00000040 IDT= 80174560 00000200 CR0=80000019 CR2=8011e7ab CR3=00065000 CR4=00000610 DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 DR6=ffff0ff0 DR7=00000400 CCS=0000002c CCD=8001be30 CCO=ADDL
- 166750
- the kernel gets a page fault exception when executing the first instruction of as_page_fault() (EIP=0x80126ab2), in an attempt to handle the previous exception (166735). Again, this should not happen because as_page_fault() is normally identity-mapped. Note ESP=0x8001bdc4.
166750: v=0e e=0000 i=0 cpl=0 IP=0008:80126ab2 pc=80126ab2 SP=0010:8001bdc4 CR2=80126ab2 EAX=00000000 EBX=8001be10 ECX=00000000 EDX=fffff000 ESI=8011e7ab EDI=fffff000 EBP=8001bde0 ESP=8001bdc4 EIP=80126ab2 EFL=00200002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA] CS =0008 00000000 ffffffff 00cf9800 DPL=0 CS32 [---] SS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA] FS =0023 00000000 ffffffff 00cff300 DPL=3 DS [-WA] GS =0030 00010188 ffffffff 00cff301 DPL=3 DS [-WA] LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT TR =0028 80040000 00000068 80408904 DPL=0 TSS32-avl GDT= 8008b140 00000040 IDT= 80174560 00000200 CR0=80000019 CR2=80126ab2 CR3=00065000 CR4=00000610 DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 DR6=ffff0ff0 DR7=00000400 CCS=00000010 CCD=00000000 CCO=LOGICL
- 166751 - 166795
- each is an attempt to handle the previous one, analogous to 166750, except that every single one of these exceptions consume a little bit of the stack. During the last one, 166795, the ESP is only 0x8001b034.
check_exception old: 0xffffffff new 0xe 166795: v=0e e=0000 i=0 cpl=0 IP=0008:80126ab2 pc=80126ab2 SP=0010:8001b034 CR2=80126ab2 EAX=00000000 EBX=8001b080 ECX=00000000 EDX=fffff000 ESI=80126ab2 EDI=fffff000 EBP=8001b050 ESP=8001b034 EIP=80126ab2 EFL=00200002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA] CS =0008 00000000 ffffffff 00cf9800 DPL=0 CS32 [---] SS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA] FS =0023 00000000 ffffffff 00cff300 DPL=3 DS [-WA] GS =0030 00010188 ffffffff 00cff301 DPL=3 DS [-WA] LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT TR =0028 80040000 00000068 80408904 DPL=0 TSS32-avl GDT= 8008b140 00000040 IDT= 80174560 00000200 CR0=80000019 CR2=80126ab2 CR3=00065000 CR4=00000610 DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 DR6=ffff0ff0 DR7=00000400 CCS=00000010 CCD=00000000 CCO=LOGICL
- 166796
- here we are in the page fault exception handler (EIP=0x8010a595), trying to push the constant 0xe on the stack. ESP is now 0x8001b000, which means that we are trying to access memory at 0x8001affc, which is already a different page. So thanks to the above recursion in exception handling, we have just overflown the original stack space, which on HelenOS/ia32 is 4KiB. Moreover, we hit a page, which is not mapped, which again is strange thanks to the identity mapping that should be in place.
check_exception old: 0xffffffff new 0xe 166796: v=0e e=0002 i=0 cpl=0 IP=0008:8010a595 pc=8010a595 SP=0010:8001b000 CR2=8001affc EAX=00000010 EBX=8001b080 ECX=00000000 EDX=fffff000 ESI=80126ab2 EDI=fffff000 EBP=00000000 ESP=8001b000 EIP=8010a595 EFL=00200002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA] CS =0008 00000000 ffffffff 00cf9800 DPL=0 CS32 [---] SS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA] FS =0023 00000000 ffffffff 00cff300 DPL=3 DS [-WA] GS =0030 00010188 ffffffff 00cff301 DPL=3 DS [-WA] LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT TR =0028 80040000 00000068 80408904 DPL=0 TSS32-avl GDT= 8008b140 00000040 IDT= 80174560 00000200 CR0=80000019 CR2=8001affc CR3=00065000 CR4=00000610 DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 DR6=ffff0ff0 DR7=00000400 CCS=00000010 CCD=00000000 CCO=LOGICL
- 166797
- here the processor encountered a double fault exception in an attempt to save previous execution state on the privileged stack; this is understandable because from the previous exception we now that the stack is not mapped.
check_exception old: 0xe new 0xe 166797: v=08 e=0000 i=0 cpl=0 IP=0008:8010a595 pc=8010a595 SP=0010:8001b000 EAX=00000010 EAX=00000010 EBX=8001b080 ECX=00000000 EDX=fffff000 ESI=80126ab2 EDI=fffff000 EBP=00000000 ESP=8001b000 EIP=8010a595 EFL=00200002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA] CS =0008 00000000 ffffffff 00cf9800 DPL=0 CS32 [---] SS =0010 00000000 ffffffff 00c09300 DPL=0 DS [-WA] DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS [-WA] FS =0023 00000000 ffffffff 00cff300 DPL=3 DS [-WA] GS =0030 00010188 ffffffff 00cff301 DPL=3 DS [-WA] LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT TR =0028 80040000 00000068 80408904 DPL=0 TSS32-avl GDT= 8008b140 00000040 IDT= 80174560 00000200 CR0=80000019 CR2=8001affc CR3=00065000 CR4=00000610 DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 DR6=ffff0ff0 DR7=00000400 CCS=00000010 CCD=00000000 CCO=LOGICL
The double fault could not have been handled either, so the processor hit a triple fault and performed a reset. This is were the operating system lost control and started to appear hung.
comment:5 by , 15 years ago
We should be asking ourselves why are some parts of the kernel address space (rooted in PTL0 0x65000) not mapped. Had they been mapped as usual, the system would be fine.
comment:6 by , 15 years ago
Owner: | set to |
---|---|
Status: | new → accepted |
I think I know who the culprit is.
The page tables are corrupted, because they were freed by the address space. The problem is that when the last thread within a task exits, the scheduler calls thread_destroy(), which in turn calls task_destroy(), which in turn calls as_destroy(). On architectures with hierarchical page tables, as_destroy() destroys the page tables, but the kernel continues to use them until another address space is installed.
I verified this by printing a debug message after page tables rooted at 0x65000 (this address gets regularly assigned to one of the initial address spaces) are destroyed. When I reproduced the issue, the message appeared at the same time the system appeared hung. qemu.log pointed to 0x65000 too, which can be considered a conclusive evidence.
comment:7 by , 15 years ago
Priority: | major → critical |
---|
follow-up: 9 comment:8 by , 15 years ago
The issue is 100% reproducible with the following change:
=== modified file 'kernel/generic/src/mm/as.c' --- kernel/generic/src/mm/as.c 2010-03-23 14:41:06 +0000 +++ kernel/generic/src/mm/as.c 2010-05-01 09:59:25 +0000 @@ -249,6 +249,7 @@ btree_destroy(&as->as_area_btree); #ifdef AS_PAGE_TABLE + SET_PTL0_ADDRESS(as->genarch.page_table); page_table_destroy(as->genarch.page_table); #else page_table_destroy(NULL);
On ia32, SET_PTL0_ADDRESS has the side effect of invalidating the TLBs so the bug is much more likely to show. With this change, Qemu with one processor reboots early after the kernel banner is printed. Qemu with more CPUs hard hangs at about the same point.
comment:9 by , 15 years ago
Replying to jermar:
The issue is 100% reproducible with the following change:
=== modified file 'kernel/generic/src/mm/as.c' --- kernel/generic/src/mm/as.c 2010-03-23 14:41:06 +0000 +++ kernel/generic/src/mm/as.c 2010-05-01 09:59:25 +0000 @@ -249,6 +249,7 @@ btree_destroy(&as->as_area_btree); #ifdef AS_PAGE_TABLE + SET_PTL0_ADDRESS(as->genarch.page_table); page_table_destroy(as->genarch.page_table); #else page_table_destroy(NULL);On ia32, SET_PTL0_ADDRESS has the side effect of invalidating the TLBs so the bug is much more likely to show. With this change, Qemu with one processor reboots early after the kernel banner is printed. Qemu with more CPUs hard hangs at about the same point.
This is actually a flawed example. Instead, something like the following can be used to reproduce:
btree_destroy(&as->as_area_btree); #ifdef AS_PAGE_TABLE page_table_destroy(as->genarch.page_table); + memsetb((void *)PA2KA(as->genarch.page_table), PAGE_SIZE, 0); + if (as == AS) { + SET_PTL0_ADDRESS(as->genarch.page_table); + } #else page_table_destroy(NULL); #endif
comment:10 by , 15 years ago
There is also another similar scenario, which can lead to the observed triple fault. Imagine a thread running on cpu0. When the thread is preempted or goes to sleep, find_best_thread() until as_switch() will continue to run using its address space and CR3. But if the thread gets migrated to cpu1 and exits there, it's address space will be deallocated. But cpu0 may be still using it.
comment:12 by , 15 years ago
Resolution: | → fixed |
---|---|
Status: | accepted → closed |
According to Qemu logs, it turns out the hang can be caused by a triple fault. Now, this finally something useful. I will make an update when I learn something more.