Usenet.com

Comp Thread Archive from Usenet.com

Re: 1teraflops cell processor possible?



Iain McClatchie <[EMAIL PROTECTED]> wrote:
+---------------
| Nick> My solution to that was an architectural requirement for a 'yield'
| Nick> instruction to be called every (say) M instructions or N memory
| Nick> references, whichever comes first, and to abort the process if it
| Nick> failed to do so.  Dead easy to implement.
| 
| So consider a machine which, every M cycles, inserts a "yield"
| instruction into the instruction stream.  This happens in instruction
| fetch, and there is no PC associated with the yield.
+---------------

See my parallel reply to Nick, where I point out that there are *advantages*
to manual (or, for compiled code, at least explicit) placement of the
yield ops, namely: (1) Code between yields forms an implicit critical
section, thus requiring no other synchronization primitives for inter-task
communication (on a single CPU, at least); (2) the protocol between the
user code and the "scheduler" (what runs when a "yield" needs to actually
do something) can be designed to minimize the state which must be saved
and restored across a yield, speeding up both the user code and the yield
handler.

Having the yield be done at random places in the instruction stream
(random from the point of view of the user or the compiler) destroys
both of these advantages.
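
To make point (1) concrete, here is a minimal sketch using Python generators
as cooperative tasks on a single "CPU". The scheduler, the worker, and all of
the names are hypothetical illustrations, not the system discussed in the
parallel reply. With the yield placed only between updates, the
read-modify-write of the shared counter needs no lock; moving a yield into
the middle of the update (standing in for a yield injected at an arbitrary
point) loses updates.

```python
def run_round_robin(tasks):
    """Hypothetical scheduler: resume each task in turn until all finish."""
    queue = list(tasks)
    while queue:
        task = queue.pop(0)
        try:
            next(task)          # run the task up to its next yield
            queue.append(task)  # still alive, so requeue it
        except StopIteration:
            pass                # task finished


def make_counter_tasks(n_tasks, n_steps, yield_inside_update):
    """Build tasks that each increment a shared counter n_steps times."""
    shared = {"count": 0}

    def worker():
        for _ in range(n_steps):
            tmp = shared["count"]        # read
            if yield_inside_update:
                yield                    # yield lands in mid-update
            shared["count"] = tmp + 1    # write
            yield                        # explicit yield between updates

    return shared, [worker() for _ in range(n_tasks)]


def demo(yield_inside_update):
    """Run 3 tasks of 100 increments each and return the final count."""
    shared, tasks = make_counter_tasks(3, 100, yield_inside_update)
    run_round_robin(tasks)
    return shared["count"]


if __name__ == "__main__":
    print("yield only between updates:", demo(False))
    print("yield inside the update:   ", demo(True))
```

In the first run every increment survives, because no other task can run
between the read and the write. In the second run the tasks overwrite each
other's results and the final count falls short of 300.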

<aside>
  I can't count the number of times I've wanted the converse of a "yield"
  for Unix user processes, that is, a *cheap* way [cheaper than a system
  call] to tell the kernel "don't interrupt or reschedule me for the next
  XXX microseconds". [To avoid bugs or DoS attacks, of course, that
  operation would have to explicitly *allow* an interrupt/reschedule at
  the moment it was executed.]
</aside>

+---------------
| Now you don't have to worry about proving that code paths have length
| M or less and so forth.
+---------------

See my parallel reply to Nick. It's not so bad, for many applications.
In fact, there's quite a bit of existing published literature about
deferring stack overflow checks and deferring garbage collection [and
maybe also automatic profiling or "metering"] that would probably be
applicable to the "yield" problem as well.

+---------------
| My guess is that compilers would end up forced to stick yield
| instructions into nearly every basic block...
+---------------

Not necessarily -- only those blocks which end with a *backwards*
branch. For branches that implement an N-way fork/merge flow path,
the compiler can just keep track of the maximum duration of any
fork, for example. [But also see the above-mentioned papers on
deferring other kinds of tests, which cover the case when one fork
is much more expensive than the others.]
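
The placement rule above can be sketched as a small pass over a structured
program. Everything here is an assumption made for illustration: the program
representation (an int is a straight-line block of that many instructions,
("fork", [...]) runs exactly one of its branches, ("loop", body) may run zero
or more times), the budget parameter, and the function name. It is not taken
from any real compiler, and it does not implement the deferral techniques
from the papers mentioned. It puts a yield before every backward branch,
carries only the worst fork across a merge, and otherwise adds a yield only
when the running worst case would exceed the budget.

```python
YIELD = "yield"


def place_yields(seq, since, budget):
    """Insert yields so no path runs more than `budget` instructions
    without one. `since` is the worst-case count since the last yield on
    entry. Returns (new_seq, worst-case count at exit).
    Assumes each straight-line block is no longer than `budget`."""
    out = []
    for node in seq:
        if node == YIELD:
            out.append(YIELD)
            since = 0
        elif isinstance(node, int):
            # Straight-line block of `node` instructions.
            if since + node > budget:
                out.append(YIELD)
                since = 0
            out.append(node)
            since += node
        elif node[0] == "fork":
            # Exactly one branch runs; carry the worst one past the merge.
            placed = [place_yields(branch, since, budget) for branch in node[1]]
            out.append(("fork", [new_branch for new_branch, _ in placed]))
            since = max(exit_count for _, exit_count in placed)
        elif node[0] == "loop":
            # The first trip enters with `since`; later trips enter with 0,
            # which is never worse, so one pass over the body is enough.
            body, _ = place_yields(node[1], since, budget)
            out.append(("loop", body + [YIELD]))  # yield before the backward branch
            # Zero trips leave `since` unchanged; one or more trips leave 0.
            # Keep the larger of the two, which is `since`.
    return out, since


# Hypothetical example: a block, a two-way fork, another block, then a loop.
example = [3, ("fork", [[2], [6]]), 4, ("loop", [5])]
placed, leftover = place_yields(example, 0, 10)

if __name__ == "__main__":
    print(placed)
    print(leftover)
```

In this example neither fork branch gets a yield of its own. The pass carries
the cost of the more expensive branch forward and inserts a single yield
after the merge, plus the one at the loop's backward branch.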

But even if you did have to...

+---------------
| which would be a bummer.
+---------------

Not necessarily, depending on the cost of the "yield" op. In the case
I cited in my parallel reply, the cost was the same as an ordinary integer
operation.


-Rob

-----
Rob Warnock                     <[EMAIL PROTECTED]>
627 26th Avenue                 <URL:http://rpw3.org/>
San Mateo, CA 94403             (650)572-2607



