Coroutines

Front page | perl.perl6.language | Postings from May 2003

From:

John Macdonald

Date:

May 22, 2003 00:35

Subject:

Coroutines

Message ID:

20030521230151.B2457@algate.perlwolf.com

The last weekly perl6 summary listed a discussion
on coroutines, which led me to scan the archive and
to subscribe.

One item that I gather from the discussion is that the
current plan is to have a coroutine be resumed simply
by calling it again.  This, I think, is a mistake.
Coroutines should be identified by an object that
represents the execution state of the coroutine.

One of the most useful applications of coroutines I've
had in the past is for chaining together a data flow.
This is similar to a Unix pipeline, except of
course that it is being done inside a single program
instead of with separate programs.  Each coroutine
would resume its predecessor coroutine whenever it
needed additional data to process, and would resume
its successor coroutine whenever it had prepared
data ready to be passed on.

In either a shell pipeline or a coroutine chain,
you might use the same subprocess more than once.
A pipeline might use the same program more than once
(tr or sed or awk or perl are all likely candidates);
a coroutine chain might use a single subroutine more
than once (perhaps a symmetric encryption routine,
a grep routine, or a file insertion routine).
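
A minimal sketch of one subroutine appearing twice in a chain,
written in Python rather than perl6. Python generators are only a
stand-in here: they can hand control back to whoever resumed them,
not to an arbitrary coroutine. The grep() routine and the sample
lines are made up for illustration.

```python
def grep(lines, needle):
    """Pass through only the lines that contain `needle`."""
    for line in lines:
        if needle in line:
            yield line


source = iter(["apple pie", "banana split", "apple tart", "cherry pie"])

# Two separate coroutine objects made from the same subroutine.
# Each one keeps its own execution state and its own `needle`.
first = grep(source, "apple")
second = grep(first, "pie")

result = list(second)
```

Each stage holds a reference to an object, not to the name grep,
so there is no ambiguity about which instance is being resumed.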

If you resume by simply calling the subroutine again,
which one are you calling?  For that matter, why
should one routine necessarily know which routine
will be the next one?  If a coroutine is passing
lines of text to its successor, it might wish to
insert a coroutine to read the contents of a file
between itself and its successor (maybe it just found
a #include directive) - why should the successor have
to know to call/resume a different function for a
while until the included file is fully read, and then
go back to call/resuming this original predecessor?
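
The #include case might look roughly like this in Python. This is
a sketch under assumptions: the files live in a hypothetical
in-memory dict instead of on disk, and read_lines(),
expand_includes() and number() are illustrative names, not part
of any proposal.

```python
def read_lines(files, name):
    """Yield the lines of the named file (a hypothetical in-memory dict)."""
    for line in files[name]:
        yield line


def expand_includes(files, upstream):
    """Forward lines, splicing in a file reader when '#include' appears."""
    for line in upstream:
        if line.startswith("#include "):
            included = line.split(None, 1)[1]
            # Temporarily put a new coroutine between us and our successor.
            yield from expand_includes(files, read_lines(files, included))
        else:
            yield line


def number(upstream):
    """The successor: it only ever sees one upstream object."""
    for n, line in enumerate(upstream, 1):
        yield f"{n}: {line}"


files = {
    "main.txt": ["first", "#include extra.txt", "last"],
    "extra.txt": ["inserted"],
}
out = list(number(expand_includes(files, read_lines(files, "main.txt"))))
```

The successor never learns that a second reader was active for a
while; it keeps resuming the same object throughout.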

So, using a token (object) to represent the coroutine
both allows the same subroutine to be active in two
coroutines at the same time and means that one
coroutine need not know the actual subroutine that it
is exchanging data with.

(This idea of a pipeline of data massaging routines
also reminds me of the perl5 Filter package.)

Restricting yourself to an iterator viewpoint of
coroutines tends to emphasize routines that generate
a sequence of values while deemphasizing the equally
useful set of routines that consume a sequence
of values.
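
As an illustration of the consuming side, here is a small Python
sketch in which values are pushed into a coroutine instead of
pulled out of it. The averager() routine is a made-up example.

```python
def averager():
    """Consume numbers sent in; hand back the running average each time."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # suspend until the next value arrives
        total += value
        count += 1
        average = total / count


consumer = averager()
next(consumer)                  # run up to the first yield
results = [consumer.send(v) for v in (10, 20, 30)]
```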

The coroutine package I used was a subroutine library
for the B language (the precursor to C).  It used
an explicit routine to create a new coroutine - it
allocated a private stack, remembered the coroutine
that did the creation, and then invoked the top level
subroutine with its initial arguments.

There were two types of relationships between
coroutines managed by the library.  A parent
process created and managed a coroutine or group of
coroutines.  A group of coroutines acted as siblings.

Between siblings, the resume call:

    ret = resume( coroutine, value );

would resume a selected coroutine.  The value
parameter passed in by this routine would be the
ret value returned to the coroutine that was being
resumed.  (In the B subroutine it was a single scalar
value, but in Perl it could be either a scalar or
a list.)

A coroutine often would use the function caller()

    called_from = caller();

to find which coroutine invoked it, so that it would
know which coroutine to resume when it had finished a
unit of processing.
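
A rough Python sketch of resume() and caller() between siblings.
This is not the B library, only an approximation of the behaviour
described above: a small dispatcher (the hypothetical Trampoline
class) transfers control, a coroutine asks for a resume by yielding
a (target, value) pair, and every body starts with a bare yield so
that it can be created before it first receives a value.

```python
class Trampoline:
    """Tiny dispatcher giving generators a symmetric resume(co, value)."""

    def __init__(self):
        self.current = None      # coroutine that is running right now
        self.last_caller = {}    # coroutine -> whoever last resumed it

    def spawn(self, func, *args):
        co = func(self, *args)
        next(co)                 # run up to the body's first bare yield
        return co

    def caller(self):
        return self.last_caller.get(self.current)

    def run(self, co, value=None):
        previous = None
        while True:
            self.last_caller[co] = previous
            self.current = co
            try:
                # The coroutine runs until it asks to resume a sibling.
                target, value = co.send(value)
            except StopIteration as stop:
                return stop.value
            previous, co = co, target


def producer(tr, items, sink):
    yield                        # wait for the first resume
    for item in items:
        yield sink, item         # like: ret = resume(sink, item)
    return "done"


def collector(tr, seen):
    value = yield                # wait for the first resume
    while True:
        seen.append(value)
        # Resume whoever resumed us, without knowing who that is.
        value = yield tr.caller(), None


tr = Trampoline()
seen = []
sink = tr.spawn(collector, seen)
src = tr.spawn(producer, ["a", "b"], sink)
result = tr.run(src)
```

The collector never names the producer; caller() tells it which
coroutine to hand control back to.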

A parent routine used create() and reattach() to manage
its children, while a child used detach() to revert
to its parent.

    ret = create( size, function, args... );
    child = caller();

    ret = reattach( achild, value );

    ret = detach( value );
    parent = caller();		# can do this right away
    ...
    parent = invoker();		# or can do this later
    				# possibly in a resume'd sibling

The routines detach() and reattach() were essentially
the same as resume, but had slightly different
side effects.  detach() didn't take an argument to
specify the coroutine to be resumed but used the
remembered parent.  The reattach() routine would
resume the specified child, setting its remembered
parent to the routine (possibly different from the
original creator) that reattached it.  (The resume()
function, as well as remembering the caller's identity
in case the callee uses the caller() function to find
it, also sets the parent of the callee to be the same
as the parent of the caller.)  The create() function
would allocate a new stack of the specified size (perl
doesn't believe in such fixed limits, of course), set
up the stack frame to be consistent with a normal call
to the specified function with the given arguments,
and then do the same thing as reattach() to that
stack frame.

The function caller() would return the value of
the caller (the last coroutine of any sort to resume
this coroutine in any way - that includes create(),
detach(), or reattach(), as well as resume()).
The function invoker() would return the value of
the parent process that directly or indirectly did
a create() or reattach() of this process.
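
The parent/child bookkeeping described above can be approximated in
Python as follows. This is a sketch under assumptions, not the B
library: the Kernel class and the tuple-shaped requests are
hypothetical, the stack size argument of create() is dropped because
generators need none, and the main program is itself treated as a
coroutine.

```python
class Kernel:
    """Tracks, per coroutine, its parent and whoever last resumed it."""

    def __init__(self):
        self.current = None
        self.parent = {}
        self.last_caller = {}

    def caller(self):
        return self.last_caller[self.current]

    def invoker(self):
        return self.parent[self.current]

    def run(self, main):
        co, value = main, None
        self.parent[co] = self.last_caller[co] = None
        while True:
            self.current = co
            try:
                op, *rest = co.send(value)
            except StopIteration as stop:
                return stop.value
            if op == "create":        # new child, parent is the creator
                func, *args = rest
                target, value = func(self, *args), None
                self.parent[target] = co
            elif op == "reattach":    # resume child, adopting it
                target, value = rest
                self.parent[target] = co
            elif op == "resume":      # sibling inherits our parent
                target, value = rest
                self.parent[target] = self.parent[co]
            elif op == "detach":      # revert to the remembered parent
                (value,) = rest
                target = self.parent[co]
            else:
                raise ValueError(op)
            self.last_caller[target] = co
            co = target


def counter(k, start, parents):
    parents.append(k.invoker())       # the coroutine that created us
    n = start
    while True:
        delta = yield ("detach", n)   # hand n to the parent and wait
        n += delta


def main(k, log, parents):
    first = yield ("create", counter, 10, parents)
    child = k.caller()                # the child that just detached
    log.append(first)
    log.append((yield ("reattach", child, 5)))
    log.append((yield ("reattach", child, 1)))
    return log


kernel = Kernel()
log, parents = [], []
main_co = main(kernel, log, parents)
result = kernel.run(main_co)
```

Here create() returns to the parent only once the child detaches,
which is why caller() can be used right afterwards to obtain the
child's identity.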

Having separate syntax for the creation and resumption
of a coroutine provides much the same advantage
as the C library's separation of fork and exec.
Keeping them separate allows a much more flexible
set of ways of using them.