hngopher.com

       [HN Gopher] The ZedRipper: Part 2
       ___________________________________________________________________
        
       The ZedRipper: Part 2
        
       Author : zdw
       Score  : 26 points
       Date   : 2021-01-02 00:05 UTC (22 hours ago)
        
 (HTM) web link (www.chrisfenton.com)
        
       | fentonc wrote:
       | If anyone's got questions about my extensive upgrades to the
       | world's least useful laptop, fire away
        
         | jhgorrell wrote:
         | Thank you for writing this up - as someone who is getting
         | started with FPGAs, it is an inspiration.
        
           | fentonc wrote:
           | Thanks! The actual logic required for the Z80 core is tiny -
           | you could probably make a dual-core version with something
           | like this: https://www.tindie.com/products/tinyvision_ai/updu
           | ino-v30-lo...
           | 
           | FPGAs are super fun to play around with.
        
             | jhgorrell wrote:
             | I ended up getting an Arty-A7 as I wanted to have an
             | Ethernet port. It arrived a couple of weeks ago. Still
             | getting the build toolchain set up.
             | 
             | My goal is an array of 6502s - it was while researching
             | that idea and dev boards that I found your posts.
        
         | Lerc wrote:
         | What's the minimum workload that can be transferred to another
         | processor for a speed gain?
         | 
         | For instance, can you do little things like floating point x*y
         | + u*v by bumping the subexpressions to separate units and have
         | the parallelism outweigh the communication cost for a net gain?
        
           | fentonc wrote:
           | That's an interesting question that I haven't explored much -
           | the network on the ZedRipper is a unidirectional synchronous
           | ring operating at the full 140 MHz, with a round-trip latency
           | of ~32 clocks or so, but the interface exposed to the Z80 is
           | a sort of re-targetable serial port (you write an 8-bit
           | 'destination' register, and then you push bytes to that
           | node). The current buffer depth on the receive side is only a
           | single byte, so the sender needs to wait until the
           | destination node has read the byte and the credit gets
           | returned. Turbo Pascal uses the 'Real48' format for floating
           | point - 6 bytes per number - and I believe floating point
           | operations take several thousand clock cycles. So in a tight
           | loop on both sides, you might transfer a floating point
           | number to a neighboring node in ~500 cycles.
           | 
           | Especially if I improved the network a bit - deeper receive
           | buffers at a minimum, maybe a simple DMA engine - you could
           | probably get it down to <100 cycles to forward a Real48 to a
           | neighbor. The performance of emulated floating point on an
           | 8-bit CPU is sufficiently bad, and the network performance is
           | sufficiently good, that you probably could get away with some
           | very fine-grained parallelism that way! When I'm back to
           | commuting, I should write an n-body gravity simulator or
           | something for it so that there is lots of numerical work to
           | spread around, and see how much of a speedup I can get.
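
A rough way to see when this kind of fine-grained offload pays off is to count cycles for x*y + u*v. The sketch below is only a back-of-the-envelope model, not a measurement. The 3000-cycle cost per floating point operation is an assumed stand-in for "several thousand", the 500- and 100-cycle transfer costs are the estimates from the comment above, and the function names and the two-node schedule (send u and v, multiply on both nodes, send the product back, add) are hypothetical.

```python
# Back-of-the-envelope cycle model for offloading u*v while computing x*y locally.
# All costs are in Z80 clock cycles and are illustrative assumptions.

FP_OP_CYCLES = 3000  # assumed cost of one software Real48 multiply or add


def serial_cycles(fp_op: int = FP_OP_CYCLES) -> int:
    """One node does everything: two multiplies and one add."""
    return 3 * fp_op


def offloaded_cycles(transfer: int, fp_op: int = FP_OP_CYCLES) -> int:
    """Two nodes: send u and v, multiply in parallel, return the product, add."""
    send_operands = 2 * transfer   # u and v go to the neighbor
    multiply = fp_op               # x*y and u*v overlap on the two nodes
    return_product = transfer      # u*v comes back to the first node
    final_add = fp_op              # first node adds the two products
    return send_operands + multiply + return_product + final_add


def break_even_transfer(fp_op: int = FP_OP_CYCLES) -> float:
    """Per-number transfer cost at which offloading exactly ties serial."""
    # 3*fp_op == 3*transfer + 2*fp_op  ->  transfer == fp_op / 3
    return fp_op / 3


if __name__ == "__main__":
    for transfer in (500, 100):
        parallel = offloaded_cycles(transfer)
        print(transfer, parallel, round(serial_cycles() / parallel, 2))
```

Under these assumptions the serial version takes 9000 cycles, the offloaded one takes 7500 cycles at 500 cycles per transfer and 6300 at 100, and offloading stops helping once a single transfer costs more than a third of a floating point operation.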
        
         | tomcam wrote:
         | How fast does Turbo Pascal feel on this machine? I didn't even
         | know it ran on a Z-80. Also, you are a straight up beast.
        
           | fentonc wrote:
           | I have a real Kaypro 2 computer with a 4MHz Z80 in it, which
           | I also use Turbo Pascal on - on the Kaypro, it's perfectly
           | usable, but you get used to waiting a few seconds when you're
           | loading files, compiling, etc. On the ZedRipper, when things
           | are executing out of RAM everything is instantaneous. I think
           | the CPU core I'm using is close to cycle-accurate, so it
           | probably is ~35x faster than the Kaypro when executing actual
           | code.
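
The ~35x figure is consistent with a simple clock ratio, assuming the Z80 cores run at the same 140 MHz mentioned for the ring network elsewhere in the thread and that a cycle-accurate core does the same work per clock as the Kaypro's 4 MHz part. A minimal sketch of that arithmetic (the function name is made up for illustration):

```python
def clock_speedup(fast_hz: float, slow_hz: float) -> float:
    """Speedup from clock rate alone, assuming identical work per cycle."""
    return fast_hz / slow_hz


# 140 MHz is assumed to be the ZedRipper core clock; 4 MHz is the Kaypro 2.
zedripper_vs_kaypro = clock_speedup(140e6, 4e6)
```

This ignores disk and other I/O, which is where the Kaypro's multi-second waits for loading and compiling would mostly come from.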
        
             | tomcam wrote:
             | Thank you. I absolutely refuse to get into retro computing.
             | I refuse. I'm not going to. So I don't think your amazing
             | work has me at all interested. I have enough hobbies, and
             | my wife knows it.
        
       ___________________________________________________________________
       (page generated 2021-01-02 23:01 UTC)