
Creating and Destroying Contexts: mpthd_init and mpthd_dinit

All may sound fine and dandy, until we write a program and want to switch to another thread. What thread? How do threads get started? How do threads get stopped? We need a means of getting threads going.

Lubomir Bic⁵ describes one functionally nice way to initiate parallel processes using a cobegin S₀ | S₁ | ... | Sₙ coend construct which executes n segments Sᵢ of sequential code in parallel. However, this approach may require a major modification in the syntax of ``C'' or a complex function call with a variable number of arguments. More importantly, the forced nesting of the construct lacks the expressive power of other approaches...

A more general (but more complex) approach would be to use fork, join, and quit primitives. In this interface, an execution flow may be forked at a point in the code, or two or more flows later joined; superfluous flows can also be quit (terminated). UNIX provides a variant on this interface with its fork, wait, and exit primitives. The UNIX fork can appear anywhere in a block of code, and after the fork, both processes have access to all local variables currently defined. This is done by making an almost exact copy of the process dataspace (both stack and heap) into an identical but separate address space. This technique does an excellent job of protecting processes from each other, but it makes shared memory extremely difficult, is slow, and requires a lot of special hardware (virtual memory support) not usually found on personal computers.
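As a small illustration of the UNIX variant described above, the following sketch forks once, lets the child change a local variable, and shows that the parent's copy is untouched. It assumes a POSIX system; the function name fork_copy_demo and the values 7 and 99 are made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks once and returns the parent's copy of `value` after the child
   has exited, or -1 if fork or waitpid fails. (Hypothetical demo name.) */
int fork_copy_demo(void)
{
    int value = 7;      /* a local variable both flows can see after the fork */
    int status = 0;
    pid_t pid;

    fflush(stdout);     /* keep buffered output from being duplicated */
    pid = fork();       /* fork: one execution flow becomes two */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: this write lands in the child's private copy only. */
        value = 99;
        _exit(value == 99 ? 0 : 1);   /* quit: end the extra flow */
    }

    /* Parent: join by waiting for the child to finish. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);

    return value;       /* the parent's copy was never touched by the child */
}
```

Calling fork_copy_demo() from a main function should give back the parent's original value, which is exactly the protection (and the obstacle to sharing) discussed above.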

For these reasons, I provide the primitive mpthd_init, which simply creates a thread data object (see mpthd.c) that the caller can manipulate as data. Neither mpthd_init nor its corresponding mpthd_dinit starts or stops the thread. But the caller can, by explicitly switching to the new thread when appropriate. In this way the mpthd_switch routine, which suspends and resumes threads, is also used to start and stop them (indeed, there is no difference).
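The idea that creating a context and running it are separate steps can be tried out with the POSIX ucontext calls. To be clear, this is a stand-in and not the mpthd code: getcontext, makecontext, and swapcontext are real library calls (assumed available, as on Linux with glibc), while thd_body, create_then_switch_demo, and the 64 KB stack size are invented for the sketch.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx;        /* the creating flow */
static ucontext_t thd_ctx;         /* the new context, held as plain data */
static char thd_stack[64 * 1024];  /* assumed stack size for the sketch */
static int steps;                  /* how far the new context has run */

static void thd_body(void)
{
    steps = 1;
    swapcontext(&thd_ctx, &main_ctx);  /* suspend: switch back to the creator */
    steps = 2;
    /* Returning here resumes the context named in uc_link (main_ctx). */
}

/* Returns the final step count, or -1 if the context cannot be set up. */
int create_then_switch_demo(void)
{
    steps = 0;
    if (getcontext(&thd_ctx) != 0)
        return -1;
    thd_ctx.uc_stack.ss_sp = thd_stack;
    thd_ctx.uc_stack.ss_size = sizeof thd_stack;
    thd_ctx.uc_link = &main_ctx;
    makecontext(&thd_ctx, thd_body, 0);

    assert(steps == 0);                /* created, but nothing has run yet */

    swapcontext(&main_ctx, &thd_ctx);  /* the first switch starts it */
    assert(steps == 1);

    swapcontext(&main_ctx, &thd_ctx);  /* the same call later resumes it */
    assert(steps == 2);

    return steps;
}
```

Note that the same switch call both starts and resumes the context, which is the point made above about mpthd_switch.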

Now we've got threads going, but how do we tell them what to do? And how do we pass them arguments? The UNIX fork provides a beautiful way to specify the starting point for a newly forked process (the point of the fork), and allows the forking user to pass arguments to the new process via local variables. However, as mentioned previously, that beauty has a high price: essentially all data needs to be copied with the help of extensive hardware support. Even if we could afford the processing cost, there is no easy way to emulate this on a personal computer, as simply copied pointers will still point into their old dataspace. So, keeping consistent with ``C'', mpthd_init instead identifies the new code to be executed with a function pointer. When the thread is invoked, it calls the function, and as ``C'' is statically scoped we need not worry about duplicating local variables (a function cannot directly access variables in higher scopes). But what do we do when the function returns? We simply require the function to return a new thread id, which we switch to when the function returns; and if that switch ever returns, we just call the function again. (Thus, in this special thread function (THDFN), the return value is quasi-equivalent to a return address; but, unlike a normal function call, this one doesn't have to return.)
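The "return the next thread" convention can be simulated without any real context switching. In the sketch below every name (thd, thdfn, ping_fn, pong_fn, run_thdfn_demo) is hypothetical, the switch is collapsed into a plain loop, and returning NULL to stop is an addition made only so the demo terminates; a real thread function would have its own stack and need never stop.

```c
#include <assert.h>
#include <stddef.h>

typedef struct thd thd;
typedef thd *(*thdfn)(thd *self);  /* a thread function returns the next thread */

struct thd {
    thdfn fn;      /* code this thread runs each time it is entered */
    int   calls;   /* how many times fn has been called */
};

static thd ping, pong;

static thd *ping_fn(thd *self)
{
    self->calls++;
    return &pong;                  /* "return address" is another thread */
}

static thd *pong_fn(thd *self)
{
    self->calls++;
    return self->calls < 3 ? &ping : NULL;  /* NULL ends the simulation */
}

/* Runs the two threads in turn and returns the total number of calls. */
int run_thdfn_demo(void)
{
    thd *cur;

    ping.fn = ping_fn;  ping.calls = 0;
    pong.fn = pong_fn;  pong.calls = 0;

    /* Stand-in for: call the function, switch to what it returned, and
       when control comes back to a thread, call its function again. */
    cur = &ping;
    while (cur != NULL)
        cur = cur->fn(cur);

    return ping.calls + pong.calls;
}
```

Each thread's function is simply called again whenever control comes back to it, mirroring the rule in the text.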

And what about arguments? We give each thread a little heap which lives in the space left over by its stack. In this local heap, a creating thread can store arguments for the new thread, and after a thread has executed, it can leave results in its heap for other threads to read. Admittedly, this technique is a bit awkward, but it is a lot more efficient than copying the entire data space.
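One way such a local heap might be laid out is sketched here: a small header at the bottom of the caller's buffer, a heap growing upward from it, and the top of the buffer reserved for the stack. This is an assumed layout, not the one in mpthd.c; ctx_hdr, ctx_setup, ctx_heap_alloc, and add_args are hypothetical names, and the "thread" is run as an ordinary call to keep the example short. It needs a C11 compiler for _Alignof and _Alignas.

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to the strictest fundamental alignment. */
#define ALIGN_UP(x) \
    (((x) + _Alignof(max_align_t) - 1) & ~(_Alignof(max_align_t) - 1))

typedef struct {
    size_t heap_used;  /* bytes handed out from the local heap so far */
    size_t stack_low;  /* offset where the space reserved for the stack begins */
} ctx_hdr;

static unsigned char *heap_base(ctx_hdr *h)
{
    return (unsigned char *)h + ALIGN_UP(sizeof *h);
}

/* Lay out header, heap, and stack reservation inside the caller's buffer. */
ctx_hdr *ctx_setup(void *buf, size_t size, size_t stack_bytes)
{
    ctx_hdr *h = buf;
    if (buf == NULL || stack_bytes > size ||
        size - stack_bytes < ALIGN_UP(sizeof *h))
        return NULL;
    h->heap_used = 0;
    h->stack_low = size - stack_bytes;
    return h;
}

/* Bump-allocate n bytes from the heap; never grows into the stack area. */
void *ctx_heap_alloc(ctx_hdr *h, size_t n)
{
    size_t start = ALIGN_UP(sizeof *h) + h->heap_used;
    size_t need = ALIGN_UP(n);
    if (need < n || need > h->stack_low - start)
        return NULL;
    h->heap_used += need;
    return (unsigned char *)h + start;
}

typedef struct { int a, b, sum; } add_args;

/* The "thread": reads its arguments from its heap and leaves a result. */
static void adder(ctx_hdr *h)
{
    add_args *args = (add_args *)heap_base(h);
    args->sum = args->a + args->b;
}

int heap_args_demo(void)
{
    static _Alignas(max_align_t) unsigned char buf[1024];
    ctx_hdr *h = ctx_setup(buf, sizeof buf, 512);
    add_args *args;
    void *too_big;

    if (h == NULL)
        return -1;
    args = ctx_heap_alloc(h, sizeof *args);  /* creator stores arguments */
    if (args == NULL)
        return -1;
    args->a = 2;
    args->b = 3;

    adder(h);                                /* stands in for switching to it */

    too_big = ctx_heap_alloc(h, 1024);       /* would collide with the stack */
    assert(too_big == NULL);
    return args->sum;                        /* result left in the heap */
}
```

The creator writes the arguments before the thread runs and reads the result afterwards, so nothing has to be copied between data spaces.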

In the implementation of mpthd_init, there are a few interesting things to note. Notice the caller must also specify the size of the buffer in which it intends to store the thread context. The first use of this information is to figure out where to start the new stack, since on most machines the stack grows downward and must be started from the top of the buffer. But another use of this number is to see if the context will be large enough for any thread to run; as most personal computer operating system calls use the caller's stack to do their work, a reasonably sized stack will be needed for even the simplest "Hello, World!" function.
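The two uses of the buffer size can be sketched as follows. This is not the mpthd_init source: ctx, ctx_init, the 4096-byte minimum, and the 16-byte stack alignment are assumptions chosen for illustration, and a real implementation would pick limits suited to its machine and operating system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CTX_MIN_STACK   4096u  /* assumed floor; a real value is OS-dependent */
#define CTX_STACK_ALIGN 16u    /* assumed stack alignment */

typedef struct {
    void  *stack_top;    /* where a downward-growing stack would start */
    size_t stack_bytes;  /* room between the header and stack_top */
} ctx;

/* Build a context inside the caller's buffer, or return NULL if the
   buffer is too small for any thread to run in. */
ctx *ctx_init(void *buf, size_t size)
{
    ctx *c = buf;
    uintptr_t lo, hi;

    if (buf == NULL || size < sizeof *c)
        return NULL;

    lo = (uintptr_t)buf + sizeof *c;                 /* just past the header */
    hi = ((uintptr_t)buf + size) & ~(uintptr_t)(CTX_STACK_ALIGN - 1);

    if (hi < lo || hi - lo < CTX_MIN_STACK)          /* use 2: size check */
        return NULL;

    c->stack_top = (void *)hi;                       /* use 1: top of buffer */
    c->stack_bytes = (size_t)(hi - lo);
    return c;
}

/* Returns 1 if a small buffer is rejected and a large one accepted. */
int ctx_init_demo(void)
{
    static _Alignas(max_align_t) unsigned char small[256];
    static _Alignas(max_align_t) unsigned char big[8192];
    ctx *c;

    if (ctx_init(small, sizeof small) != NULL)
        return -1;                                   /* should be refused */

    c = ctx_init(big, sizeof big);
    if (c == NULL)
        return -1;

    assert((unsigned char *)c->stack_top <= big + sizeof big);
    assert((uintptr_t)c->stack_top % CTX_STACK_ALIGN == 0);
    return c->stack_bytes >= CTX_MIN_STACK;
}
```

Because the buffers are supplied by the caller (here as static arrays), the routine itself never allocates memory.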

And in the implementation of mpthd_dinit, notice also that the function waits until the processor can temporarily claim the thread. In other words, it waits to be sure no processor is executing the thread: you don't destroy the house while you (or someone else) are in it.
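A minimal sketch of "claim before destroying" is given below, using a C11 atomic_flag as the claim. The mechanism mpthd_dinit actually uses is not shown in this section, so the spin loop and the names thd, thd_setup, thd_try_claim, thd_release, and thd_dinit are all assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag claimed;  /* set while some processor is inside the thread */
    bool        valid;    /* false once the thread has been destroyed */
} thd;

void thd_setup(thd *t)
{
    atomic_flag_clear(&t->claimed);
    t->valid = true;
}

/* Returns true if the caller now holds the thread. */
bool thd_try_claim(thd *t)
{
    return !atomic_flag_test_and_set(&t->claimed);
}

void thd_release(thd *t)
{
    atomic_flag_clear(&t->claimed);
}

/* Wait until nobody is in the house, then tear it down. */
void thd_dinit(thd *t)
{
    while (!thd_try_claim(t))
        ;                 /* spin until the thread can be claimed */
    t->valid = false;
    /* The claim is deliberately kept: nobody may enter a dead thread. */
}

/* Returns 0 when the thread ends up destroyed, 1 if still valid. */
int dinit_demo(void)
{
    thd t;
    bool first, second;

    thd_setup(&t);
    first = thd_try_claim(&t);   /* a processor starts running the thread */
    second = thd_try_claim(&t);  /* nobody else can claim it meanwhile */
    assert(first && !second);

    thd_release(&t);             /* the processor switches away */
    thd_dinit(&t);               /* thread is free, so this returns at once */
    return t.valid ? 1 : 0;
}
```

Had thd_dinit been called while the claim was still held, it would have spun until the holder released it, which is the waiting behavior described above.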

Finally, a last note about the interface of all the initialization and deinitialization functions. Notice they all take and return a pointer to a buffer in which to store the structure, rather than dynamically allocating the structure with, say, a malloc call. The reason is threefold: decreased dependence on the operating system and ``C'' environment; the caller may want to do his own (perhaps static) allocation for speed, or want to store the structure in his own field; and finally, dynamic memory allocation is a shared resource, and we haven't gotten to the proper management of shared resources yet (see upcoming articles).