Simplify the SYS_THREAD_CREATE syscall interface

Removed the beefy uarg structure. Instead, the syscall gets two
parameters: %pc (program counter) and %sp (stack pointer). It starts
a thread with those values in corresponding registers, with no other
fuss whatsoever.

libc initializes threads by storing any other needed arguments on
the stack and retrieving them in thread_entry. Importantly, this
includes the address of the thread_main function, which is now
called indirectly to fix dynamic linking issues on some archs.
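
A minimal sketch of the stack hand-off described above, not the actual
libc code. All names here (push_word, prepare_stack, thread_entry_sketch,
run_demo) are hypothetical, and the "new thread" is simulated by an
ordinary function call, so this only models the argument layout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; this only models the stack hand-off. */
typedef void (*thread_main_fn)(void *arg);

/* Push one pointer-sized word onto a downward-growing stack. */
static uintptr_t *push_word(uintptr_t *sp, uintptr_t value)
{
	sp--;
	*sp = value;
	return sp;
}

/*
 * Creator side: store the entry function and its argument on the new
 * stack and return the value that would be passed as the initial %sp.
 */
static uintptr_t *prepare_stack(void *stack_base, size_t stack_size,
    thread_main_fn main_fn, void *arg)
{
	assert(stack_size % sizeof(uintptr_t) == 0);
	assert(stack_size >= 2 * sizeof(uintptr_t));

	uintptr_t *sp = (uintptr_t *) ((char *) stack_base + stack_size);
	sp = push_word(sp, (uintptr_t) arg);
	/* Function pointer stored as an integer; assumes a flat address space. */
	sp = push_word(sp, (uintptr_t) main_fn);
	return sp;
}

/*
 * New-thread side: stand-in for thread_entry. It reads the words back
 * from the stack and calls the main function indirectly.
 */
static void thread_entry_sketch(uintptr_t *sp)
{
	thread_main_fn main_fn = (thread_main_fn) sp[0];
	void *arg = (void *) sp[1];
	main_fn(arg);
}

/* Example thread body: increments the integer it is given. */
static void demo_main(void *arg)
{
	*(int *) arg += 1;
}

/* Runs the hand-off once; returns the counter, or -1 if malloc fails. */
static int run_demo(void)
{
	size_t size = 4096;
	void *stack = malloc(size);
	if (stack == NULL)
		return -1;

	int counter = 41;
	uintptr_t *sp = prepare_stack(stack, size, demo_main, &counter);
	thread_entry_sketch(sp);

	free(stack);
	return counter;
}
```

In the real thing the kernel would load the returned pointer into %sp
and jump to %pc; here run_demo() just calls the entry stand-in directly.
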
There's a bit of weirdness on SPARC and IA-64, because of their
stacked register handling. The current solution is that we require
some space *above* the stack pointer to be available for those
architectures. I think for SPARC, it can be made more normal.
For the remaining ones, we can (probably) just set the initial
%sp to the top edge of the stack. There are some lingering offsets
on some archs just because I didn't want to accidentally break
anything. The initial thread bringup should be functionally
unchanged from the previous state, and no binaries are currently
multithreaded except the thread1 test, so there should be minimal
risk of breakage. Naturally, I tested all available emulator
builds, save for msim.
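
For illustration, the initial %sp placement discussed above could be
computed roughly like this. The helper name and its parameters are
hypothetical, and the reserve and alignment values are placeholders
rather than the numbers any particular architecture uses:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: initial %sp for a downward-growing stack.
 * reserve_above is the number of bytes kept free between %sp and the
 * top edge of the stack (0 means %sp starts at the top edge).
 * align must be a power of two; the result is rounded down to it,
 * which can only make the reserved area larger.
 */
static uintptr_t initial_sp(uintptr_t stack_base, size_t stack_size,
    size_t reserve_above, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	assert(reserve_above <= stack_size);

	uintptr_t sp = stack_base + stack_size - reserve_above;
	return sp & ~((uintptr_t) align - 1);
}
```

With reserve_above set to 0 this yields the top edge of the stack; a
non-zero reserve models the space required above %sp on architectures
with stacked registers.
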