> Threads should already have separate stacks don't they?

There are two types of threads: cooperative and preemptive/parallel. Coroutines with stacks represent cooperative threading. There's only ever one actual hardware thread, and control only changes hands where a thread explicitly yields, so there are never any race conditions or locks. The other model comes in two flavors, depending on whether you have one core with an interrupt-driven kernel switching contexts for you (preemptive), or a multi-core system that's actually running two threads at the same time (parallel). That model protects you against a full application/system stall if one thread stops responding (hi, OS9), and it also allows scalability for parallel tasks.

> As for the calling a function that does the yielding for you, people tend to call that "stackful" coroutines.

People have a million names for it, unfortunately. It's a simple yet immensely powerful concept, but it's not popular, so people keep reinventing it (not knowing it already exists) and coming up with their own names for it. You'll also hear them called cothreads, fibers, green threads, etc.

But at the end of the day, they're all variant names for cooperative threads.

> This imposes some restrictions in how the runtime is implemented.

Getting access to memory for a new stack isn't a problem in practice. Outside of tiny DSPs with dedicated call stacks (e.g. NEC 7725), I've yet to encounter a processor where you couldn't just allocate a new block of heap memory and use it for another thread's stack space. That covers x86, amd64, ppc32, ppc64, arm, mips, sparc, etc.
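
As a rough illustration of the heap-block approach, here's a minimal sketch using the POSIX `ucontext` calls (`getcontext`, `makecontext`, `swapcontext`) as found in glibc. That's just one way to do the switch, not necessarily what any particular cothread library does. The names `worker`, `run_heap_stack_demo`, and the 256K stack size are made up for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define COTHREAD_STACK_SIZE (256 * 1024)  /* arbitrary size for this sketch */

static ucontext_t main_ctx;   /* saved context of the caller */
static ucontext_t co_ctx;     /* saved context of the cooperative thread */
static int steps_run = 0;     /* shared state; no lock needed, only one runs at a time */

/* Body of the cooperative thread. It executes entirely on the heap block. */
static void worker(void) {
    for (int i = 0; i < 3; i++) {
        steps_run++;
        swapcontext(&co_ctx, &main_ctx);  /* yield back to the caller */
    }
}

/* Hypothetical demo entry point: returns the number of steps the worker
   completed, or -1 if setup failed. */
int run_heap_stack_demo(void) {
    void *stack = malloc(COTHREAD_STACK_SIZE);  /* plain heap memory as a stack */
    if (stack == NULL) return -1;

    steps_run = 0;
    if (getcontext(&co_ctx) != 0) { free(stack); return -1; }
    co_ctx.uc_stack.ss_sp = stack;
    co_ctx.uc_stack.ss_size = COTHREAD_STACK_SIZE;
    co_ctx.uc_link = &main_ctx;  /* where control goes if worker() ever returns */
    makecontext(&co_ctx, worker, 0);

    for (int i = 0; i < 3; i++) {
        swapcontext(&main_ctx, &co_ctx);  /* resume the worker until it yields */
    }

    /* The worker is parked inside its last yield and is never resumed again,
       so releasing its stack here is safe. */
    free(stack);
    return steps_run;
}
```

Calling `run_heap_stack_demo()` from `main` should return 3: each resume runs the worker up to its next yield, and the shared counter is touched without any locking because only one context is ever executing.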

Now you can claim that the new memory won't automatically grow at exhaustion. Well, you can mmap things and catch exceptions to expand it. But really, the default these days is 512K - 1M for your main thread's stack anyway in C, and systems generally only reserve a max of 8M short of special compiler flags. Just allocate a large block for your extra threads and you'll be fine.
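
For the mmap route, a sketch along these lines is one possibility, assuming a POSIX system with anonymous mappings (Linux, the BSDs) and a downward-growing stack. It only sets up the mapping with a guard page below it, so an overflow faults instead of silently trampling other memory. The fault handler that would actually expand the stack is not shown. `guarded_stack_alloc` and `guarded_stack_free` are hypothetical helper names.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: maps `usable` bytes of stack plus one inaccessible
   guard page below it. Returns the lowest usable address, or NULL on failure.
   `usable` is assumed to be a multiple of the page size. */
void *guarded_stack_alloc(size_t usable) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return NULL;

    size_t total = usable + (size_t)page;
    char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return NULL;

    /* Assuming the stack grows downward, the lowest page is the one an
       overflow hits first, so make it inaccessible. */
    if (mprotect(base, (size_t)page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }
    return base + page;
}

/* Releases a stack obtained from guarded_stack_alloc, guard page included. */
void guarded_stack_free(void *stack, size_t usable) {
    long page = sysconf(_SC_PAGESIZE);
    munmap((char *)stack - page, usable + (size_t)page);
}
```

The returned pointer and `usable` can go straight into `ss_sp` and `ss_size` of a `ucontext_t`, or whatever your switching code expects. On systems that commit anonymous pages lazily, a generously sized mapping mostly costs address space until it's touched, which is part of why "just allocate a large block" tends to work out.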

I've also heard of some tricks with setjmp/longjmp that involve subdividing the main stack, but there's really no need for that at all.

A runtime can't safely use purely stack-relative addressing anyway, since there's no telling how much of an offset there is due to non-runtime function recursion already when the runtime functions are invoked. Unless it's a very high-level language, in which case there's a myriad of better ways to handle this anyway.