Opened 12 years ago
Closed 9 years ago
#509 closed defect (duplicate)
Applications crash in malloc() in the recent gta02 builds
Reported by: | Jakub Jermář | Owned by: | Jiri Svoboda |
---|---|---|---|
Priority: | major | Milestone: | 0.7.0 |
Component: | helenos/kernel/arm32 | Version: | mainline |
Keywords: | gta02 | Cc: | |
Blocker for: | Depends on: | ||
See also: | #638 |
Description
As of mainline,1741, but also prior to this revision, applications crash inside malloc() when running on gta02 or similar boards.
Examples of these crashes include the following call paths:
... malloc() async_send_fast() log_msg() ...
or:
area_check() malloc_internal() malloc() fat_idx_get_by_pos() ...
The crash happens on address 0x10, which suggests that a NULL pointer was passed to area_check().
Given that other architectures and other arm machines don't have these problems, this issue looks gta02 specific.
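To illustrate why a fault at address 0x10 points at a NULL pointer, here is a minimal sketch. The structure layout below is hypothetical (it is not taken from the HelenOS sources); it only assumes a 32-bit target where the checked field happens to be the fifth 32-bit word of the structure. Dereferencing such a field through a NULL base pointer would fault at base + offset = 0x10.

```python
import ctypes


class HeapArea32(ctypes.Structure):
    # Hypothetical heap area header as seen by a 32-bit ARM task.
    # Pointers are modelled as 32-bit integers so that the offsets do not
    # depend on the pointer size of the machine running this sketch.
    _fields_ = [
        ("start", ctypes.c_uint32),
        ("end", ctypes.c_uint32),
        ("prev", ctypes.c_uint32),
        ("next", ctypes.c_uint32),
        ("magic", ctypes.c_uint32),
    ]


def fault_address(base, field):
    """Address touched when reading `field` through a pointer equal to `base`."""
    return base + getattr(HeapArea32, field).offset


if __name__ == "__main__":
    # A NULL base pointer plus the offset of the fifth word.
    print(hex(fault_address(0, "magic")))
```

With a layout like this, a check that reads the last field of a NULL structure would produce exactly the page fault address reported in the ticket.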
Attachments (5)
Change History (11)
by , 12 years ago
Attachment: | SAM_1336.JPG added |
---|
comment:1 by , 12 years ago
I thought I might add some perspective gained from a little bisecting of the behaviour of the default mainline. This perspective may or may not be relevant to this specific ticket:
1711 crashes heap [#509] <= the symptom changes [large stack support]
1710 kernel panic (bad trap)
1708 kernel panic (bad trap)
1705 kernel panic (bad trap)
1699 kernel panic (bad trap)
1692 kernel panic (bad trap)
1689 kernel panic (bad trap)
1688 kernel panic (bad trap) <= first bad revision [uspace hash table]
1687 good
1685 good
1670 good, panics upon touching the touchscreen [first noticed on an earlier revision]
1641 decoder panic, reached compositor [the decoder bug fixed in later revisions]
So from the above it follows that #509 started to show with mainline,1711. The tested revisions before that, back to mainline,1688, were consistently panicking in ipc_call_free() on a bad kernel trap. Revisions before mainline,1688 were known to sometimes panic upon a touchscreen event.
Unfortunately, this looks as if there has been a latent bug in the GTA02 support (other arm32 machines do not appear to be susceptible to any of this) which was only exposed by the above-mentioned changes, namely mainline,1688 and, maybe, mainline,1711.
We should probably focus on the areas that are GTA02 specific to find the root cause of these issues.
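The bisecting described in this comment can be sketched as a binary search over the tested revisions. This is only an illustration: the `first_bad` helper and the `observed` mapping are hypothetical names, and the mapping simply encodes the results listed above, treating 1670 as good and every revision from 1688 on as bad (1641 is left out because its decoder panic is a separate, later-fixed bug).

```python
def first_bad(revisions, is_bad):
    """Return the first bad revision.

    Assumes `revisions` is sorted, the first entry is good, the last entry
    is bad, and there is a single good-to-bad transition in between.
    """
    lo, hi = 0, len(revisions) - 1  # revisions[lo] is good, revisions[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi]


# Results reported in this comment: True means the revision misbehaves.
observed = {
    1670: False, 1685: False, 1687: False,
    1688: True, 1689: True, 1692: True, 1699: True,
    1705: True, 1708: True, 1710: True, 1711: True,
}

if __name__ == "__main__":
    revisions = sorted(observed)
    print(first_bad(revisions, lambda rev: observed[rev]))
```

On real hardware, `is_bad` would stand for building the revision, booting it on the device and checking for the panic, which is why the number of probes matters.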
comment:2 by , 11 years ago
As of mainline,2085, only the barebone build boots on my gta02, exhibiting the above symptoms.
comment:3 by , 11 years ago
This is the boot process as of mainline,2094 (barebone), captured using the debug board:
U-Boot 1.3.2-moko12 (May 9 2008 - 10:28:48)
I2C: ready
DRAM: 128 MB
Flash: 2 MB
NAND: 256 MiB
Glamo core device ID: 0x3650, Revision 0x0002
USB: S3C2410 USB Deviced
mtdparts variable not set, see 'help mtdparts'
mtdparts variable not set, see 'help mtdparts'
mtdparts variable not set, see 'help mtdparts'
mtdparts variable not set, see 'help mtdparts'
mtdparts variable not set, see 'help mtdparts'
mtdparts variable not set, see 'help mtdparts'
HelenOS bootloader, release 0.5.0 (Fajtl), revision 2094M (m.lombardi85@gmail.com-20140401090159-xc3ilz3z42u901lq)
Built on 2014-04-16 00:13:48 for arm32
Copyright (c) 2001-2014 HelenOS project
Boot data: 0x30010000 -> 0x30b1b6be
Memory statistics
0x30015000|0x30015000: bootstrap stack
0x30010000|0x30010000: bootstrap page table
0x30015720|0x30015720: boot info structure
0xb0a08000|0x30a08000: kernel entry point
0x30015c24|0x30015c24: kernel image (527624/149104 bytes)
0x3003a294|0x3003a294: ns image (219641/94154 bytes)
0x3005125e|0x3005125e: loader image (217799/93713 bytes)
0x3006806f|0x3006806f: init image (219682/94413 bytes)
0x3007f13c|0x3007f13c: locsrv image (226939/98110 bytes)
0x3009707a|0x3009707a: rd image (217268/93290 bytes)
0x300adce4|0x300adce4: vfs image (234595/101552 bytes)
0x300c6994|0x300c6994: logger image (223590/96005 bytes)
0x300de099|0x300de099: ext4fs image (292525/125222 bytes)
0x300fc9bf|0x300fc9bf: initrd image (29360128/10611967 bytes)
Inflating components ... initrd ext4fs logger vfs rd locsrv init loader ns kernel .
Booting the kernel...
SPARTAN kernel, release 0.5.0 (Fajtl), revision 2094M (m.lombardi85@gmail.com-20140401090159-xc3ilz3z42u901lq)
Built on 2014-04-16 00:13:48 for arm32
Copyright (c) 2001-2014 HelenOS project
Detected 1 CPU(s), 131040 KiB free memory
Kernel console ready (press any key to activate)
Program loader at 0xf0200000
RAM disk at 0x30c52000 (size 29360128 bytes)
ns: HelenOS IPC Naming Service
ns: Accepting connections
init: HelenOS init
loc: HelenOS Location Service
rd: HelenOS RAM disk server
vfs: HelenOS VFS server
logger: HelenOS Logging Service
ext4fs: HelenOS ext4 file system server
loc: Accepting connections
logger: Accepting connections
rd: Found RAM disk at 0x30c52000, 29360128 bytes
rd: Accepting connections
vfs: Accepting connections
ext4fs: Accepting connections
init: Root filesystem mounted on / (ext4fs at bd/initrd)
init: Unable to stat /srv/tmpfs
init: Starting /srv/klog
init: Starting /srv/locfs
[kernel/other] note: Program loader at 0xf0200000
Task klog (9) killed due to an exception at program counter 0x0000acec.
r0 =0x00000000 r1 =0x002360a0 r2 =0x00000000 r3 =0x00000058
r4 =0x002360a0 r5 =0x00000010 r6 =0x00000000 r7 =0x0002e000
r8 =0x00000040 r9 =0x0013204c r10=0x00000000 fp =0x00233ecc
r12=0x00233ed0 sp =0x00233e90 lr =0x0000b198 spsr=0x20000050
0x00233ecc: 0x0000acec()
0x00233efc: 0x0000b728()
0x00233f14: 0x0000bb44()
0x00233f7c: 0x000105dc()
0x00233fbc: 0x00006b48()
0x00233fdc: 0x000010ec()
0x00233ff4: 0x00001b6c()
Kill message: Page fault: 0x00000010.
locfs: HelenOS Device Filesystem
locfs: Accepting connections
init: Unable to stat /srv/taskmon
init: Location service filesystem mounted on /loc (locfs)
init: Temporary filesystem unknown type (tmpfs)
init: Starting /srv/devman
devman: HelenOS Device Manager
devman: Accepting connections.
root: HelenOS root device driver
init: Unable to stat /srv/apic
init: Unable to stat /srv/i8259
[devman] note: The `root' driver was successfully registered as running.
init: Unable to stat /srv/obio
init: Unable to stat /srv/cuda_adb
init: Unable to stat /srv/s3c24xx_uart
Task root (12) killed due to an exception at program counter 0x0001d874.
r0 =0x00000000 r1 =0x002440a0 r2 =0x00000000 r3 =0x00000058
r4 =0x002440a0 r5 =0x00000010 r6 =0x00000000 r7 =0x0003c000
r8 =0x00000040 r9 =0x0003cb3c r10=0x00000000 fp =0x00241dec
r12=0x00241df0 sp =0x00241db0 lr =0x0001dd20 spsr=0x20000050
0x00241dec: 0x0001d874()
0x00241e1c: 0x0001e2b0()
0x00241e34: 0x0001e6cc()
0x00241e9c: 0x00023568()
0x00241edc: 0x00019808()
0x00241ef8: 0x00002dd4()
0x00241f34: 0x00001144()
0x00241fa4: 0x00001ce0()
0x00241fdc: 0x0001fc30()
0x00241ff4: 0x000145d0()
Kill message: Page fault: 0x00000010.
init: Starting /srv/s3c24xx_ts
s3c24xx_ts: S3C24xx touchscreen driver
s3c24xx_ts: device at physical address 0x58000000, inr 31.
s3c24xx_ts: Registered device hid/mouse.
s3c24xx_ts: Accepting connections
init: Unable to stat /srv/loopip
init: Unable to stat /srv/ethip
init: Unable to stat /srv/inetsrv
init: Unable to stat /srv/tcp
init: Unable to stat /srv/udp
init: Unable to stat /srv/dnsrsrv
init: Unable to stat /srv/dhcp
init: Unable to stat /srv/nconfsrv
init: Unable to stat /srv/clipboard
init: Unable to stat /srv/remcons
init: Starting /srv/input
input: HelenOS input service
input: Could not find any suitable input device
######> Kernel panic on cpu0 due to a failed assertion: <######
waitq_sleep_timeout() at generic/src/synch/waitq.c:264:
(!PREEMPTION_DISABLED) || (PARAM_NON_BLOCKING(flags, usec))
THE=0xb0586000: pd=2 thread=0xb0325a00 task=0xb0584000
cpu=0xb02eb400 as=0xb0009294 magic=0xfacefeed
thread="uinit" task="input"
0xb0587c14: generic/src/debug/stacktrace.o:stack_trace()+0x0000001c
0xb0587c44: generic/src/debug/panic.o:panic_common()+0x000001ac
0xb0587c84: generic/src/synch/waitq.o:waitq_sleep_timeout()+0x00000154
0xb0587c94: generic/src/synch/semaphore.o:_semaphore_down_timeout()+0x00000010
0xb0587cdc: generic/src/synch/mutex.o:_mutex_lock_timeout()+0x0000003c
0xb0587d1c: generic/src/mm/as.o:as_page_fault()+0x00000068
0xb0587d4c: arch/arm32/src/mm/page_fault.o:data_abort()+0x00000210
0xb0587d84: generic/src/interrupt/interrupt.o:exc_dispatch()+0x00000104
0xb0587d9c: arch/arm32/src/ras.o:ras_check()+0x00000030
0xb0587e3c: arch/arm32/src/exc_handler.o:data_abort_exception_entry()+0x000000b4
0xb0587e84: generic/src/mm/slab.o:_slab_free()+0x000000f8
0xb0587e9c: generic/src/ipc/ipc.o:ipc_call_free()+0x0000003c
0xb0587ef4: generic/src/ipc/sysipc.o:sys_ipc_wait_for_call()+0x000001b0
0xb0587f3c: generic/src/syscall/syscall.o:syscall_handler()+0x000000cc
0xb0587f64: arch/arm32/src/exception.o:swi_exception()+0x00000034
0xb0587f9c: generic/src/interrupt/interrupt.o:exc_dispatch()+0x00000104
0xb0587fb4: arch/arm32/src/ras.o:ras_check()+0x00000030
cpu0: halted
We can see that the klog and root tasks crashed (binaries to be attached) and the kernel crashed. Non-barebone builds end with:
Inflating components ... initrd initrd: Inflating error -14
comment:4 by , 10 years ago
Milestone: | 0.6.0 → 0.7.0 |
---|
comment:5 by , 10 years ago
Running mtest from the U-Boot prompt reports a memory error for the following addresses:
0x33ED8198 0x33ED819C 0x33ED81A0 0x33ED81A4 0x33ED81A8
This is approximately 62.8 MiB into physical memory, which starts at 0x30000000 on this board. I am going to test whether disabling this page will rid us of at least some of these issues.
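The arithmetic behind that estimate can be sketched as follows. The RAM base of 0x30000000 is an assumption consistent with the addresses in the boot log above, and the 4 KiB page size is likewise assumed; `RAM_BASE`, `PAGE_SIZE` and the helper names are illustrative only.

```python
RAM_BASE = 0x30000000   # assumed start of physical RAM on the gta02
PAGE_SIZE = 4096        # assumed page (frame) size in bytes

# Addresses reported by mtest in this comment.
failing = [0x33ED8198, 0x33ED819C, 0x33ED81A0, 0x33ED81A4, 0x33ED81A8]


def offset_mib(addr):
    """Distance of a physical address from the start of RAM, in MiB."""
    return (addr - RAM_BASE) / (1024 * 1024)


def page_base(addr):
    """Physical address of the page that contains `addr`."""
    return addr & ~(PAGE_SIZE - 1)


if __name__ == "__main__":
    print(f"{offset_mib(failing[0]):.1f} MiB into RAM")
    # Collect the distinct pages touched by the failing addresses.
    print(sorted({hex(page_base(a)) for a in failing}))
```

Under these assumptions all five failing addresses fall into a single page, so excluding that one frame from the allocator would be enough to test the hypothesis.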
comment:6 by , 9 years ago
Resolution: | → duplicate |
---|---|
See also: | → #638 |
Status: | new → closed |
Screenshot depicting one of these crashes, mainline,1741