Newsgroups: comp.arch
Path: utzoo!henry
From: henry@utzoo.uucp (Henry Spencer)
Subject: Re: Sw vs. Hw BitBlit.
Message-ID: <1988Jul28.173301.7275@utzoo.uucp>
Organization: U of Toronto Zoology
References: <399@ma.diab.se>
Date: Thu, 28 Jul 88 17:33:01 GMT

In article <399@ma.diab.se> pf@ma.UUCP (Per Fogelström) writes:
>... all bus cycles doing anything other than moving data are overhead.
>Agree?  Okay, so even if a 68k is doing a "Blit" in straight code
>it will consume some memory bandwidth. A microprogrammed "hardware" blit
>will be able to use every memory cycle for data accesses, thus having a much
>higher transfer rate. Correct me if I'm wrong...

Almost right, which means "wrong".  Take out the word "much" and I'll go
along with it.  Bulk data movement, like scrolling, can be done with 68k
instructions like MOVEM, which move a couple of dozen words of data for
every instruction fetch.  Yes, avoiding the fetches would speed things up,
but not by nearly as much as you think.  People designing things like
Blitters, DMA interfaces, etc., consistently ignore just how quickly a
modern CPU can move data if the programmer really sits down and thinks
for a while about how to do it.  Most modern CPUs can nearly saturate
their buses with data movement if they really try.
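
To put a rough number on that, here is a back-of-the-envelope sketch
that counts bus words for a copy loop built from a load/store pair of
MOVEMs.  The model is my own simplification, not a cycle-accurate
68k timing: it assumes a 16-bit bus, two instruction words per MOVEM.L
(opcode plus register mask), and it ignores loop control and any
extra cycles the processor spends internally.  The function name is
made up for illustration.

```python
# Simplified bus-traffic model for a MOVEM-based copy loop.
# Assumptions (hypothetical, not measured 68k timings): 16-bit data bus,
# 2 instruction words per MOVEM.L, loop-control overhead ignored.

def copy_bus_efficiency(regs, insn_words_per_movem=2, words_per_reg=2):
    """Fraction of bus words that carry data for one load/store MOVEM pair."""
    data_words = 2 * regs * words_per_reg    # read the registers, then write them
    fetch_words = 2 * insn_words_per_movem   # one MOVEM to load, one to store
    return data_words / (data_words + fetch_words)

for regs in (1, 4, 12):
    eff = copy_bus_efficiency(regs)
    # 1/eff bounds the gain from a mover that needs no instruction fetches
    print(f"{regs:2d} regs/MOVEM: {eff:.1%} data, ideal blitter gain <= {1 / eff:.2f}x")
```

With 12 long registers per MOVEM (two dozen 16-bit words), 48 of
every 52 bus words are data under these assumptions, so a mover that
fetched no instructions at all could gain less than 10%.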

>Assuming we have a fast micro (a 68030 or a NS32535) they would at least be
>supported by their on-chip caches. Even if the hit rate in these caches is
>as low as 50% an external hardware Blitter could use the other 50% ...

You miss an important point:  those caches are not there to free up external
memory cycles, they are there to help slow memory keep up with a fast CPU.
It's not at all inconceivable to get 50% cache hits (which is low for an
instruction cache but good for a tiny data cache like the 030's) *and*
complete saturation of the external memory bandwidth, when one of those
CPUs gets going.
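
A toy model shows why hit rate and bus idle time are not the same
thing.  It assumes a CPU that stalls on every miss and a bus that is
busy for the whole of each miss; the cycle counts are hypothetical
round numbers chosen for illustration, not 68030 or NS32xxx figures.

```python
# Toy model: how busy is the external bus at a given cache hit rate?
# Assumptions (hypothetical): the CPU stalls on each miss, the bus is
# occupied for the full miss time, and nothing else uses the bus.

def bus_utilization(hit_rate, hit_cycles, miss_cycles):
    """Fraction of CPU time during which the external bus is busy."""
    avg_cycles = hit_rate * hit_cycles + (1 - hit_rate) * miss_cycles
    bus_busy = (1 - hit_rate) * miss_cycles
    return bus_busy / avg_cycles

# 50% hits, 1-cycle hit, 4-cycle miss
print(bus_utilization(0.5, 1, 4))
# 50% hits, hits fully overlapped with an outstanding bus cycle
print(bus_utilization(0.5, 0, 4))
```

In this model a 50% hit rate with a 1-cycle hit and a 4-cycle miss
leaves the bus busy 80% of the time, and if hits overlap with bus
activity the bus is busy all the time.  Half the references hitting
does not mean half the bus cycles are free.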

>Someone pointed out that placing characters is the main work for the BitBlit.
>Yes, that is correct in some systems and this is a problem in many cases.
>Placing a character can take from 20 microseconds and up, and the cpu has to
>wait for the blitter to be ready before placing the next character...

This is in fact nearly irrelevant, because there are probably 200us or more
of overhead required before that 20us BitBlt.  Character drawing speeds are
TOTALLY dominated by the overhead of finding the character and deciding
where to put it, so making the BitBlt itself faster buys almost nothing.
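
The arithmetic, using the 200us and 20us figures above (the 200us
is itself only my estimate) and a hypothetical helper name:

```python
# Overall per-character speedup when only the BitBlt part gets faster.
# overhead_us and blit_us are the rough figures from the text above;
# blit_speedup is how many times faster the blit alone becomes.

def char_speedup(overhead_us, blit_us, blit_speedup):
    """Ratio of old to new time per character."""
    before = overhead_us + blit_us
    after = overhead_us + blit_us / blit_speedup
    return before / after

for factor in (2, 10, float("inf")):
    print(f"blit {factor}x faster -> character drawing "
          f"{char_speedup(200, 20, factor):.3f}x faster")
```

Even an infinitely fast blitter takes 220us down to 200us, a gain
of 10%.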
-- 
MSDOS is not dead, it just     |     Henry Spencer at U of Toronto Zoology
smells that way.               | uunet!mnetor!utzoo!henry henry@zoo.toronto.edu
