Threaded IO on the MacOS without Bleeding from the Ears
I'll explain input/output.
I'll explain the MacOS's various IO models.
Synchronous
Asynchronous
I'll explain how to extend the Thread Manager for effective IO
Finally, I'll explain how to couple Asynchronous IO with an extended Thread Manager
to realize Threaded IO
Input/Output (IO) Explained
Whenever you need to move information into and out of your Mac's RAM, you have to
deal with IO issues.
Examples of IO devices
SCSI devices like hard drives and CD-ROM drives
Serial devices like modems, printers and other Macs
ADB devices like mice and keyboards
The MacOS's IO Programming Interface
The MacOS uses chunks of code called drivers to interface with the hardware
Since drivers are rather low-level, Apple usually builds "Managers" --high level interfaces--
on top of the drivers.
The SCSI Manager
The Serial Manager
The File Manager
The MacOS's Serial IO Programming Interface
We'll concentrate on the MacOS's Serial Drivers
Why?
The Serial Drivers have always been Asynchronous
The SCSI Manager (and thus the File Manager) only recently went Asynchronous -- and
only on 68040 Macs and better.
Serial ports are slow, and benefit greatly from Asynchronous IO
The Threaded IO concepts apply to all MacOS IO, including the File Manager
IO Programming Example
To help contrast the differing IO models, we'll code the same task to each model.
We'll attempt to reset the modem attached to the modem port. We'll accomplish this
by writing "ATZ\r" to the modem port.
What does Synchronous IO mean?
When you execute IO synchronously, the MacOS executes the job and loops inside a routine
called vSyncWait until the job is completed.
The effect presented to your program and your user is that the Mac halts until the
job is completed.
Synchronous IO Example
First we'll open the modem port's serial drivers.
There are two drivers for each serial port -- an input driver and output driver. The
output driver should always be opened first.
Next we'll write "ATZ" to the modem port.
Finally we'll close the modem port's serial drivers.
Opening the Modem Port's Serial Drivers
Like files, when you open a driver, a "reference number" is returned. You use this
reference number whenever referring to the driver.
    short inRefNum = 0, outRefNum = 0;
    OSErr err = OpenDriver( "\p.AOut", &outRefNum );
    if( !err )
        err = OpenDriver( "\p.AIn", &inRefNum );
We initially set inRefNum and outRefNum to zero to indicate the reference numbers
are invalid. Bad things happen if you attempt to close a driver passing an invalid
reference number.
Writing to the Modem Port's Output Driver Synchronously
We'll send the reset command
    Str255 resetCmd = "\pATZ\r";
    long count = resetCmd[0];    // FSWrite() takes the byte count by address
    if( !err )
        err = FSWrite( outRefNum, &count, resetCmd + 1 );
    Foo();
Foo() will be called once FSWrite() completes.
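The "\p" prefix is a classic Mac compiler extension that produces a Pascal string: byte 0 holds the length and the characters follow. That is why the example passes resetCmd[0] as the count and resetCmd+1 as the buffer. A minimal portable sketch of the same layout, built by hand (make_pstring, pstring_count and pstring_data are hypothetical helper names, not Toolbox calls):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build a Pascal string (length byte + characters)
   from a C string. Classic Mac compilers do this for "\p..." literals. */
void make_pstring(unsigned char dst[256], const char *src)
{
    size_t len = strlen(src);
    assert(len <= 255);            /* a Str255 holds at most 255 characters */
    dst[0] = (unsigned char)len;   /* byte 0 is the length */
    memcpy(dst + 1, src, len);     /* characters start at byte 1 */
}

/* The byte count that would be handed to the write call. */
long pstring_count(const unsigned char *p)
{
    return p[0];
}

/* The address of the first character to send. */
const unsigned char *pstring_data(const unsigned char *p)
{
    return p + 1;
}
```

For "ATZ\r" the count is 4: three letters plus the carriage return.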
Closing Modem Port's Serial Drivers
Now we'll close both the input and output drivers, input side first.
    if( inRefNum )
        (void) CloseDriver( inRefNum );
    if( outRefNum )
        (void) CloseDriver( outRefNum );
Note how we don't attempt to close the drivers unless the reference numbers are non-zero.
Only a non-zero reference number is valid.
The Summary of Synchronous IO
Advantages
Very easy to program.
Disadvantages
Your computer is effectively frozen while the job completes.
What does Asynchronous IO mean?
When you execute IO asynchronously, the MacOS places your job in a queue and hands
control back to your program immediately.
You supply a callback (called a "completion routine" in MacOS IO parlance) to be called
when the job is completed.
The effect presented to your program is that the Mac executes the job while your code keeps executing.
Explanation of the ParamBlockRec union
Almost every asynchronous routine on the MacOS takes a pointer to a ParamBlockRec
union. This is known as the "parameter block."
    union ParamBlockRec {
        IOParam ioParam;
        FileParam fileParam;
        VolumeParam volumeParam;
        CntrlParam cntrlParam;
        SlotDevParam slotDevParam;
        MultiDevParam multiDevParam;
    };
Explanation of the ParamBlockRec union
The different fields of ParamBlockRec are used by different routines for different
purposes.
The ioParam field is used for transport devices like the serial driver.
The fileParam field is used for the File Manager
The volumeParam is used for managing storage volumes like floppies, hard drives, CDs,
etc.
The cntrlParam is used for controlling the drivers themselves.
We're interested in the ioParam field, and thus the IOParam structure.
Explanation of the IOParam structure
The IOParam structure:
    struct IOParam {
        QElemPtr qLink;
        short qType;
        short ioTrap;
        Ptr ioCmdAddr;
        IOCompletionUPP ioCompletion;   // your callback
        OSErr ioResult;                 // job's status
        StringPtr ioNamePtr;
        short ioVRefNum;
        short ioRefNum;                 // reference number
        SInt8 ioVersNum;
        SInt8 ioPermssn;
        Ptr ioMisc;
        Ptr ioBuffer;                   // the buffer
        long ioReqCount;                // size requested
        long ioActCount;                // size completed
        short ioPosMode;
        long ioPosOffset;
    };
Asynchronous IO Example
    // open driver code here
    ParamBlockRec pb;
    pb.ioParam.ioCompletion = MyCompletion;
    pb.ioParam.ioRefNum = outRefNum;
    pb.ioParam.ioBuffer = (Ptr) resetCmd+1;
    pb.ioParam.ioReqCount = resetCmd[0];
    if( !err )
        err = PBWriteAsync( &pb );
    Foo();
    // close driver code here
Foo() will be called immediately.
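pb.ioParam.ioResult is set to 1 when the job is successfully installed into the driver's queue, and stays at 1 while the job is in progress.
When the job completes, ioResult holds the result code: noErr (0) on success, or a negative error code.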
pb.ioParam.ioCompletion is called when the job completes.
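The queue-then-callback behavior described above can be modeled in portable C. This is only a sketch under stated assumptions: FakeJob, fake_write_async, fake_driver_service and count_completion are hypothetical stand-ins for the parameter block, PBWriteAsync(), the driver's interrupt handler and the completion routine. It does not call the Device Manager.

```c
#include <assert.h>
#include <stddef.h>

typedef struct FakeJob FakeJob;
typedef void (*FakeCompletion)(FakeJob *job);

struct FakeJob {
    FakeJob *next;              /* queue link, like qLink */
    FakeCompletion completion;  /* like ioCompletion */
    volatile short result;      /* like ioResult: 1 = in progress, <= 0 = done */
    long reqCount;              /* like ioReqCount */
    long actCount;              /* like ioActCount */
};

static FakeJob *queueHead = NULL;
static FakeJob *queueTail = NULL;
static int completionsSeen = 0;

/* Queue the job and return at once, the way an asynchronous call does. */
void fake_write_async(FakeJob *job)
{
    job->next = NULL;
    job->actCount = 0;
    job->result = 1;                        /* in progress */
    if (queueTail) queueTail->next = job; else queueHead = job;
    queueTail = job;
}

/* Stand-in for the driver's interrupt handler: finish the oldest job.
   Returns 1 if a job was completed, 0 if the queue was empty. */
int fake_driver_service(void)
{
    FakeJob *job = queueHead;
    if (job == NULL) return 0;
    queueHead = job->next;
    if (queueHead == NULL) queueTail = NULL;
    job->actCount = job->reqCount;
    job->result = 0;                        /* noErr */
    if (job->completion) job->completion(job);
    return 1;
}

/* Example completion routine: just counts how often it ran. */
void count_completion(FakeJob *job)
{
    assert(job->result <= 0);   /* the result is final before the callback runs */
    completionsSeen++;
}
```

Between fake_write_async() and fake_driver_service() the caller is free to run; that gap is where Foo() executes in the example above.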
Asynchronous IO Summary
Advantages
Your program continues to execute while IO completes.
Disadvantages
Uses error-prone parameter blocks.
Completion Routines are executed at interrupt time, and are subject to the restrictions
of interrupt time code (can't use handles, can't use QuickDraw, can't access globals
without extra code, etc).
What do you do while your IO job is executing? Call WaitNextEvent()? Spin the cursor?
Ideally, you'd get some real work done.
The Thread Manager
The Thread Manager implements threads on the MacOS
The Thread Manager is a great way to divide up tasks your program performs. Compiling,
printing, saving and downloading all are great candidates for threading.
The Thread Manager is cooperative -- a thread must specifically yield control to another
thread in order for the model to work. It's like the MacOS's cooperative multitasking.
The Thread States Explained
The Thread Manager defines three mutually exclusive thread states.
Stopped
The thread cannot be selected to run when your thread yields.
Ready
The thread can be selected to run when your thread yields.
Running
The currently executing thread.
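The three states and a cooperative yield can be modeled in a few lines of portable C. This is a sketch, not the Thread Manager's code: SimState, SimScheduler and sim_yield are hypothetical names, and the round-robin search order is an assumption made for illustration.

```c
#include <assert.h>

enum { kMaxThreads = 8 };
typedef enum { kStopped, kReady, kRunning } SimState;

typedef struct {
    SimState state[kMaxThreads];
    int count;     /* number of threads in use */
    int current;   /* index of the running thread */
} SimScheduler;

/* Yield: the running thread becomes ready, and the next ready thread
   (searching round-robin) becomes running. Returns the new running
   index, or -1 if every thread is stopped. */
int sim_yield(SimScheduler *s)
{
    int i;
    assert(s->count > 0 && s->count <= kMaxThreads);
    if (s->state[s->current] == kRunning)
        s->state[s->current] = kReady;
    for (i = 1; i <= s->count; i++) {
        int candidate = (s->current + i) % s->count;
        if (s->state[candidate] == kReady) {
            s->state[candidate] = kRunning;
            s->current = candidate;
            return candidate;
        }
    }
    return -1;
}
```

A stopped thread is skipped on every pass, which is exactly why a thread that is stopped and never readied again stays dead.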
The Threaded IO Model
The goal is to write a function that executes an IO job asynchronously, stopping our
thread.
When the IO job completes, your callback executes and sets your thread's state to
ready.
In this model, your program's other tasks (compiling, downloading, printing, etc.)
continue to execute while your IO job executes.
Ideal Threaded IO Example
    // open driver code here
    // parameter block setup code here
    if( !err )
        err = PBWriteAsync( &pb );
    if( !err )
        err = SetThreadState( thisThread, kStoppedThreadState, kNoThreadID );
    // close driver code here
However, there's a window of death here. Between PBWriteAsync() and stopping our thread,
the IO job can and will complete, executing our callback.
The Window of Death
Ideally, the execution of the job looks like
Asynchronously Write
Stop Thread
Callback is executed, set thread to ready
However, this course of execution is possible
Asynchronously Write
Callback is executed, set thread to ready
Stop Thread
Your thread is stopped, and will never wake up. Your thread is dead!
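The two orderings can be replayed in a tiny portable C model. It is a sketch only: stop_thread, callback_wake and run_sequence are hypothetical names, and the thread state is reduced to a single variable.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { kStopped, kReady, kRunning } SimState;

/* What the thread does after starting the write: stop itself. */
void stop_thread(SimState *state)
{
    assert(state != NULL);
    *state = kStopped;
}

/* What the completion callback does: make the thread ready. */
void callback_wake(SimState *state)
{
    assert(state != NULL);
    *state = kReady;
}

/* Run the two steps in the given order and return the final state. */
SimState run_sequence(int callbackFirst)
{
    SimState state = kRunning;     /* the thread has just queued its write */
    if (callbackFirst) {
        callback_wake(&state);     /* the IO finished before we stopped */
        stop_thread(&state);       /* nothing is left to wake us */
    } else {
        stop_thread(&state);
        callback_wake(&state);
    }
    return state;
}
```

run_sequence(0) ends in kReady; run_sequence(1) ends in kStopped with no callback left to fire.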
Combating the Window of Death (PowerPlant)
PowerPlant, Metrowerks' C++ framework, defers the callback.
The callback checks the state of the IO thread. If the thread is not stopped, PowerPlant
sets a Time Manager task to execute 100 microseconds in the future. Hopefully by
then the thread will be stopped.
This is a good work-around; however, it complicates the callback.
Combating the Window of Death (develop)
develop, Apple's technical journal, had an article on the Thread Manager. They advocated
a dual-thread solution.
There are two threads per IO job -- the IO thread and the waker thread.
The IO job stops the waker thread. The IO thread executes the IO job and stops itself.
The callback readies the waker thread.
The waker thread wakes the IO thread.
This is a poor work-around. You have to manage two threads per job and the scheduling
overhead to run the IO thread is too high.
Combating the Window of Death (Polling)
Using this technique, the thread is never stopped.
After executing the IO job, the thread simply loops, testing the ioResult field in
the parameter block. The thread continually yields until ioResult is less than 1.
This method is as fast as PowerPlant's method and doesn't require a callback, which
makes it the best of the three work-arounds.
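The polling loop can be sketched in portable C. The names are hypothetical: PolledJob stands in for the parameter block, and sim_yield_to_others stands in for YieldToAnyThread() plus whatever progress the driver makes in the meantime.

```c
#include <assert.h>

typedef struct {
    volatile short result;   /* like ioResult: 1 while the job is in progress */
    int ticksUntilDone;      /* simulated driver progress */
} PolledJob;

/* Stand-in for YieldToAnyThread(): other threads run, and meanwhile the
   simulated driver makes progress and eventually finishes the job. */
void sim_yield_to_others(PolledJob *job)
{
    if (job->ticksUntilDone > 0 && --job->ticksUntilDone == 0)
        job->result = 0;     /* noErr */
}

/* Yield until the job leaves the "in progress" state.
   Returns how many times the thread yielded. */
int poll_until_done(PolledJob *job)
{
    int yields = 0;
    /* Guard for this simulation: an unfinished job must have ticks left,
       otherwise the loop below would never end. */
    assert(job->ticksUntilDone > 0 || job->result < 1);
    while (job->result >= 1) {
        sim_yield_to_others(job);
        yields++;
    }
    return yields;
}
```

The thread stays schedulable the whole time, so there is no window of death; the cost is one wasted pass through the scheduler for every yield that finds the job still in progress.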
Problems with the Threaded IO Models
The Thread Manager wasn't designed with IO in mind. Ideally, none of these work-arounds
would be necessary -- we'd use the ideal Threaded IO model.
The latency of the models is high. Imagine you have 25 threads running. You execute
your IO and stop the thread. Even if the IO completes immediately, you have to wait
your turn behind 24 other threads. One of those threads is the event loop, which
may switch out your application.
Enhancing the Thread Manager for Effective IO
Metaphysical question: What does it mean to stop a thread?
Thread Manager's answer: mark it as ineligible for scheduling and schedule another
thread.
My solution: write a routine that simply marks a thread as ineligible for scheduling
-- but doesn't reschedule. This puts your thread into a known state before executing
an IO job.
Extending the Thread Manager
SetXThreadState() is the new routine. It works just like SetThreadState() except it
doesn't reschedule.
This line of execution works:
SetXThreadState( currentThread, kIneligible );
PBWriteAsync( &pb );
YieldToAnyThread();
Callback: SetXThreadState( ioThread, kEligible );
So does this:
SetXThreadState( currentThread, kIneligible );
PBWriteAsync( &pb );
Callback: SetXThreadState( ioThread, kEligible );
YieldToAnyThread();
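Why both orderings work can be checked with a small portable C model. This is a sketch with hypothetical names (SimThread, sim_set_x_state, sim_yield, run_order); the key assumption is that marking a thread never reschedules, so the mark and the yield are independent steps.

```c
#include <assert.h>

typedef enum { kIneligible, kEligible } SimEligibility;

typedef struct {
    SimEligibility eligibility;
    int yielded;      /* has the thread given up the processor yet? */
} SimThread;

/* Like the proposed SetXThreadState(): change the mark, never reschedule. */
void sim_set_x_state(SimThread *t, SimEligibility e)
{
    t->eligibility = e;
}

/* Stand-in for YieldToAnyThread(): only gives up the processor. */
void sim_yield(SimThread *t)
{
    t->yielded = 1;
}

/* Returns 1 if the thread can be scheduled again after all the steps. */
int run_order(int callbackBeforeYield)
{
    SimThread t = { kEligible, 0 };
    sim_set_x_state(&t, kIneligible);       /* known state before the IO starts */
    /* ... the asynchronous write would be queued here ... */
    if (callbackBeforeYield) {
        sim_set_x_state(&t, kEligible);     /* callback fires early */
        sim_yield(&t);
    } else {
        sim_yield(&t);
        sim_set_x_state(&t, kEligible);     /* callback fires later */
    }
    assert(t.yielded);
    return t.eligibility == kEligible;
}
```

Because the ineligible mark is set before the write is queued, the callback's mark is always the last one applied, whichever side of the yield it lands on.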
Prioritizing Threads
However, our latency is still high. We want our thread to be the first thread in line
when the IO job is completed.
Solution: be able to mark a thread as "priority".
A priority thread is the same as an eligible thread -- however the scheduler always
schedules a priority thread first.
When our IO callback executes, it will mark the thread as priority.
How do you do it?
The Thread Manager provides a hook for you to place your own scheduler.
However, the Thread Manager's data structures are completely opaque -- there's no "thread
queue" to access from within the scheduler. You can't even access a reference constant
given a ThreadID!
So, even though we can replace the default scheduler, we don't know what to schedule!
Creating a Thread Queue
There is a way -- create and maintain your own thread queue.
The Thread Manager provides three hooks meant for debugging: DebuggerNotifyNewThread(),
DebuggerNotifyDisposeThread() and DebuggerNotifyScheduler(). We'll plug into these
hooks.
We'll define three thread queues: an ineligible queue, an eligible queue and a priority
queue.
Maintaining the Thread Queue
When our DebuggerNotifyNewThread() is called, we'll add an element to the eligible
queue with the new thread's ID.
When our DebuggerNotifyDisposeThread() is called, we'll remove the element that matches
our disposed thread's ID.
When our DebuggerNotifyScheduler() is called, we'll look at our priority queue. If
there's a priority thread waiting, we'll move it into the eligible queue and schedule
it. Priority status should be fleeting -- otherwise a priority thread would hog the processor.
If no priority thread is waiting, we check our eligible queue for waiting threads.
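The scheduling rule above can be sketched in portable C. Everything here is hypothetical (IdQueue, q_push, q_pop, SimQueues, pick_next), thread IDs are plain numbers, and 0 means "no thread". For brevity the sketch rotates the eligible queue instead of walking it with a mark pointer.

```c
#include <assert.h>

enum { kQueueMax = 16, kNoThread = 0 };

typedef struct {
    unsigned long ids[kQueueMax];
    int count;
} IdQueue;

/* Append an ID to the back of the queue. */
void q_push(IdQueue *q, unsigned long id)
{
    assert(q->count < kQueueMax);
    q->ids[q->count++] = id;
}

/* Remove and return the oldest ID, or kNoThread if the queue is empty. */
unsigned long q_pop(IdQueue *q)
{
    unsigned long id;
    int i;
    if (q->count == 0) return kNoThread;
    id = q->ids[0];
    for (i = 1; i < q->count; i++) q->ids[i - 1] = q->ids[i];
    q->count--;
    return id;
}

typedef struct {
    IdQueue priority;
    IdQueue eligible;
} SimQueues;

/* Pick the next thread: a priority thread wins once and is demoted to the
   eligible queue; otherwise the eligible queue is served round-robin. */
unsigned long pick_next(SimQueues *s)
{
    unsigned long id = q_pop(&s->priority);
    if (id == kNoThread)
        id = q_pop(&s->eligible);
    if (id != kNoThread)
        q_push(&s->eligible, id);   /* back of the line for next time */
    return id;
}
```

A priority thread jumps the line exactly once, then takes its normal turn with everyone else.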
The Thread Queue Code
We'll create the XThreadElem structure to hold individual thread queue elements.
    struct XThreadElem {
        XThreadElemPtr next;
        ThreadID threadID;
    };
We'll store all three queues in one handle as an array of XThreadElem structures.
We'll use the standard MacOS queue utilities to manage the elements.
We'll keep the handle locked because the Thread Queue routines will be called at interrupt
time.
The Thread Queue Code
We'll create an XThreadQueue structure to manage each of our three queues.
    struct XThreadQueue {
        short ignored;
        XThreadElemPtr head;
        XThreadElemPtr tail;
        XThreadElemPtr mark;
    };
This is a standard MacOS queue header except for the mark field. When moving through
the eligible thread queue, the mark field points to the next thread to schedule.
Writing the SetXThreadState() routine
All our SetXThreadState() routine has to do is place the thread in question into the
correct queue. Duck soup.
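Moving a thread between queues can be sketched in portable C with hand-rolled linked lists. The names (SimElem, SimQueue, sim_enqueue, sim_remove, sim_set_x_thread_state) are hypothetical, and the sketch ignores the interrupt-time atomicity a real version would need; on the Mac the standard queue utilities would do the linking.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SimElem SimElem;
struct SimElem {
    SimElem *next;
    unsigned long threadID;
};

typedef struct {
    SimElem *head;
    SimElem *tail;
} SimQueue;

enum { kSimIneligible, kSimEligible, kSimPriority, kSimQueueCount };

/* Append an element to the tail of a queue. */
void sim_enqueue(SimQueue *q, SimElem *e)
{
    e->next = NULL;
    if (q->tail) q->tail->next = e; else q->head = e;
    q->tail = e;
}

/* Unlink the element for threadID; returns it, or NULL if not present. */
SimElem *sim_remove(SimQueue *q, unsigned long threadID)
{
    SimElem *prev = NULL, *cur = q->head;
    while (cur && cur->threadID != threadID) { prev = cur; cur = cur->next; }
    if (cur == NULL) return NULL;
    if (prev) prev->next = cur->next; else q->head = cur->next;
    if (q->tail == cur) q->tail = prev;
    cur->next = NULL;
    return cur;
}

/* Move the thread into the queue for newState. No rescheduling happens.
   Returns 0 on success, -1 if the thread is unknown. */
int sim_set_x_thread_state(SimQueue queues[kSimQueueCount],
                           unsigned long threadID, int newState)
{
    int i;
    assert(newState >= 0 && newState < kSimQueueCount);
    for (i = 0; i < kSimQueueCount; i++) {
        SimElem *e = sim_remove(&queues[i], threadID);
        if (e) { sim_enqueue(&queues[newState], e); return 0; }
    }
    return -1;
}
```

The routine only relinks one element, which keeps it short enough to call from a completion routine.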
Threaded IO (Finally!)
We have extended the Thread Manager to effectively support IO. Now to write the IO
code.
We want two new routines: a threaded read routine and a threaded write routine.
It turns out these routines are very similar; the only difference is the actual Device
Manager call they make: PBRead() or PBWrite().
ThreadedRead()
Here's the ThreadedRead() function prototype:
    OSErr ThreadedRead( short refNum, void *buffer, long *size, long offset, long patience );
The only real new thing here is the patience argument. You can specify how long you're
willing to wait for the IO job to complete.
This function is very similar to FSRead(), except you get the advantages of threaded
IO.
ThreadedRead() Under the Hood
This is what ThreadedRead() does:
Marks itself ineligible using SetXThreadState()
Creates and fills out a parameter block
If a patience was specified, creates a Time Manager task set to kill the IO job if
it doesn't finish in time.
Calls PBRead()
Calls YieldToAnyThread()
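The steps above can be walked through in a portable C simulation. This is a sketch under loud assumptions: SimRead, sim_tick and sim_threaded_read are hypothetical, time is a simple tick counter, a patience of 0 is taken to mean "wait forever", and kSimTimedOut is a made-up error value rather than a real MacOS result code.

```c
#include <assert.h>

enum { kSimNoErr = 0, kSimTimedOut = -1 };   /* hypothetical result codes */

typedef struct {
    volatile short result;  /* 1 = in progress, 0 = done, negative = error */
    int eligible;           /* the calling thread's scheduling mark */
    long ticksNeeded;       /* simulated time the device needs */
    long patience;          /* assumed: 0 means wait forever */
} SimRead;

/* One tick of simulated time: fires the completion or the timeout. */
void sim_tick(SimRead *r, long now)
{
    if (r->result != 1) return;                 /* already finished */
    if (now >= r->ticksNeeded) {
        r->result = kSimNoErr;                  /* completion routine runs */
        r->eligible = 1;
    } else if (r->patience > 0 && now >= r->patience) {
        r->result = kSimTimedOut;               /* timer task kills the job */
        r->eligible = 1;
    }
}

/* Follows the five steps; returns the job's final result code. */
short sim_threaded_read(long ticksNeeded, long patience)
{
    SimRead r;
    long now = 0;
    r.eligible = 0;              /* 1. mark ourselves ineligible first */
    r.ticksNeeded = ticksNeeded; /* 2. fill out the "parameter block" */
    r.patience = patience;       /* 3. arm the timeout if one was given */
    r.result = 1;                /* 4. job queued, now in progress */
    while (!r.eligible)          /* 5. "yield": others run until we are woken */
        sim_tick(&r, ++now);
    assert(r.result <= 0);       /* we only wake once the job is finished */
    return r.result;
}
```

Either the completion or the timeout marks the thread eligible, so the caller always wakes with a final result code in hand.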
Summary of Threaded IO
Easy
Fast
Extends the Thread Manager -- all your current Thread Manager-based code will work.