=======================[ JonSharp.net:70:baremetal_mac... ]===========
       
       I've been tinkering with 68000 assembly off and on (mostly off) over
       the past decade, and one of my favorite projects has been my ongoing
       attempt at bare-metal (non-Toolbox) programming on my Macintosh Plus.
       I've been revisiting the idea recently and really want to push it
       further.  In the meantime, enjoy the following two installments,
       originally posted on my blog:
       
        ## Bare-metal Macintosh Programming - Part 1
       
       The original compact Macintoshes (128k, 512k, and Plus) have long been
       a subject of my collection and study, but after acquiring a Canon Cat,
       I began to see the Mac in a slightly new light. The original Macintosh
       may be a truly unique blend of hardware and software engineering, but
       what if you looked at the Mac only for its hardware?
       
       If you know the Mac, you know that it was innovative in large part
       because of its ROM. The Toolbox functions of the ROM really made the
       Mac what it was. As a result, the Mac OS (System) software is
       inextricably linked to the ROM code, and the two work together in a
       clever balance of pointers and patched code. This cleverness and tight
       engineering allowed the Mac to do a lot with its relatively meager
       resources.
       
       But what if you threw out the ROM and its Toolbox? The Mac might start
       to look not too different from other 68000-based personal computers of
       its time.  What would you do if someone handed you a Macintosh with no
       available software? What if you were tasked with designing an
       operating system for this new (rather limited) 68000-based box? How
       would you start? Would you attempt to emulate windowing systems from
       expensive contemporaries like the Lisa and Alto? Would you take a
       simpler approach? Maybe something more like the DOSes of the day?
       
       And what about today? What features would you include? Maybe a nice,
       lightweight IP stack? (lwIP) How much functionality could you squeeze
       into a replacement ROM? I began to wonder what I could get my old Mac
       to do with the benefit of modern toolchains and open-source stacks
       and kernels. Relying on all of the available documentation, the
       original Macintosh becomes a blank slate for our imagination.
       
       This first article in my bare-metal Macintosh programming series
       describes my first steps down this road.
       
        ### A proof-of-concept demo
       
       In my search for information on the Macintosh boot process, I ran
       across this Gist [1] that outlines a method for booting arbitrary code
       on a 68k Macintosh – that is, without Mac OS. Bingo. This was the
       starting point I needed to begin exploring my thoughts on alternative
       software/firmware for the Mac.
       
       I decided my first step was to develop the simplest demo I could think
       of that would show off some of the Mac’s hardware while compact enough
       to fit into the boot sector (first 1K) of a floppy. So I set out to
       build a simple bare-metal Macintosh demo written in 68k assembly that
       displays my own smiling face on the machine’s 1-bit framebuffer.
       
       I targeted the Macintosh Plus for the extra RAM and because Mini vMac
       emulates it by default, but the code should run fine on the 128k/512k
       as well, now that I’m using the ROM vector to get the framebuffer
       start address. The 68000 of these original Macintosh models makes
       them feel a lot like modern embedded systems.
       
       Here’s the result:
       
 (IMG) It Works!
       
       (I used The Gimp to create my 1-bit bitmap code block)
       
       And on to how it works…
       
        ### The Code
       
       The code listing follows below. (missing only the full bitmap data)
       The necessary boot block header was borrowed from the Emile project
       and is based on the Inside Macintosh documentation.
       
       The startup code in the Macintosh ROM eventually loads the first
       1024 bytes off of the disk and this bootblock is then responsible for
       bootstrapping the OS. (or other arbitrary code such as our
       framebuffer demo)
       
       The code itself is rather simple: basic memory copy loops
       responsible for clearing the entire display (to white) and copying
       the bitmap data onto the center of the screen. You may notice that
       the Mac’s characteristic rounded corners are no longer visible,
       since the demo clears the entire framebuffer, corners included.
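
       For anyone who would rather read C than 68k assembly, here is a
       rough sketch of those two loops. It is not the code from the
       repository: the names (fb_clear, fb_blit_centered) and the
       byte-aligned bitmap layout are assumptions made for illustration,
       and the framebuffer is passed in as a plain pointer so the sketch
       can also be tried on a host machine.

       ```c
       #include <assert.h>
       #include <stdint.h>
       #include <string.h>

       /* Compact Mac display: 512x342 pixels, 1 bit per pixel. */
       enum {
           FB_WIDTH     = 512,
           FB_HEIGHT    = 342,
           FB_ROW_BYTES = FB_WIDTH / 8,             /* 64 bytes per row */
           FB_BYTES     = FB_ROW_BYTES * FB_HEIGHT  /* 21888 bytes      */
       };

       /* A 0 bit shows as white, so zero-filling clears to white. */
       void fb_clear(uint8_t *fb)
       {
           memset(fb, 0x00, FB_BYTES);
       }

       /* Copy a bitmap that is w_bytes wide and h rows tall to the
        * center of the screen, one row at a time. Assumes (hypothetical
        * layout) that the bitmap rows are whole bytes, packed back to
        * back. */
       void fb_blit_centered(uint8_t *fb, const uint8_t *bitmap,
                             int w_bytes, int h)
       {
           assert(w_bytes > 0 && w_bytes <= FB_ROW_BYTES);
           assert(h > 0 && h <= FB_HEIGHT);

           int x0 = (FB_ROW_BYTES - w_bytes) / 2;  /* left margin, bytes */
           int y0 = (FB_HEIGHT - h) / 2;           /* top margin, rows   */

           for (int y = 0; y < h; y++)
               memcpy(fb + (y0 + y) * FB_ROW_BYTES + x0,
                      bitmap + y * w_bytes, (size_t)w_bytes);
       }
       ```

       On the Mac itself, fb would be the framebuffer start address
       rather than a host-side array.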
       
       At the end is an endless loop, producing a subtle animation effect.
       
 (TXT) HappyJonDemoListing
       
        ### A modern toolchain
       
       Since 68k still lives on, it is an architecture well-supported by
       GCC, allowing us to use a familiar toolchain. I’ve tested several
       68k GCC toolchains built using crosstool-ng, across several host
       platforms. (most recently using my venerable C.H.I.P.) [2] A quick way to
       get started on Windows is to get a pre-built toolchain here. [3]
       
       A simple Makefile is responsible for producing the binary output
       that can be placed directly onto a disk, or booted directly using
       an accurate Mac emulator like Mini vMac:
       
        > cd HappyJon
        > make
        > <vMac_path>/Mini\ vMac floppy.img
       
       
        ### Clone, Build, Fork
       
       You can check out the code in my HappyJon public repository. [4]
       Feel free to clone, build and fork my code. Try it out on your own
       vintage Mac. (instructions for creating a working floppy are in
       the README) Fire up Mini vMac and see what interesting code you
       can squeeze into the boot block.
       
       I believe a mirror of the framebuffer memory block is available
       for double-buffering. It would be fun to see what animations could
       be created. Can we start a new Macintosh demoscene? ;)
       
        ### Boot sector and beyond
       
       In Part 2, I will go into the next stage of my bare-metal Macintosh
       efforts, where I begin to explore something more useful on my
       sans-MacOS Plus. To do anything meaningful I will need to break out of
       my 1024-byte boot block jail…
       
        ## Bare-metal Macintosh Programming - Part 2
            (Building blocks for an OS?)
       
        _This second installment of my “bare-metal” Macintosh Programming
        effort moves past the boot block limitations of my proof-of-concept
        demo and introduces some foundational building blocks necessary for
        doing something more useful than just painting pictures on the
        framebuffer._
       
       After getting a functional demo up and running, (see part 1) I began
       thinking of more useful solutions. If I could implement a basic
       terminal, I could use it to port a host of existing software and use
       it to build a whole new operating environment:
       
        - Port Frotz z-code interpreter for Zork/Infocom games
        - Port pforth to create a Forth operating environment (Open Firmware for 68k Macs?) ;)
        - eLua?
        - FreeRTOS demo/shell
        - uCLinux?
       
       At least initially, each of these would rely on the floppy, but
       eventually these could run directly out of ROM, if unused ROM routines
       were removed and especially if something like BMOW’s ROM-inator were
       used.
       
       First things first, though… I need to get some text on this Mac. At
       the end of this next round of hacking, I managed to get this:
       
 (IMG) Rendering Text On The Plus
       
       
       Now on to the how…
       
        ### _A note on “bare metal”_
       
       While my eventual goal might be to replace the ROM entirely, I’m
       certainly content to leverage the portions of the ROM that are
       particularly useful or expeditious. I’m not really interested in the
       GUI Toolbox functions, but the disk driver is kinda nice. My goal
       isn’t to boycott Apple’s fine code, but simply to explore the
       possibilities of an alternative operating environment for the Mac.
       So, for the record, I’ll be using at least one of those useful ROM
       routines in this post…
       
        ### >1024 Bytes
       
       We left off at the end of part one needing to overcome the confines
       of the 1024-byte boot block that the ROM startup code loads off of
       the floppy into memory automatically for us. In order to do anything
       useful, we need to use the ROM routines (floppy driver) to read the
       rest of our code from disk into memory. In this case, I only need to
       do this once, as I’m working with a Mac Plus, which can hold a full
       800K disk in RAM. This saves us the trouble of subsequent disk reads
       (at the expense of initial load time).
       
       For this I am again borrowing from the Emile project for its
       second-stage loader code. In the highlight below, we allocate enough
       memory to accommodate our floppy image and use the PBReadSync Toolbox
       routine to read the floppy data into the allocated RAM location and
       finally jump to it.
       
 (TXT) code_listing_2
       
        ### Building
       
       The other thing to note about executing our code from RAM is that
       it must be relocatable. (this was also true for our first
       demo, but less significant given its size) We want our
       compiler/linker to produce code that can be executed from any
       location in memory, now that we rely on NewPtr to set up our memory
       for us.
       
       First, we want to be using PC-relative addressing wherever we can.
       It appears that gcc for m68k has evolved a fair bit through the years,
       leaving some gaps in the documentation on some of these compiler +
       arch features. It took some trial and error and getting familiar with
       m68k-elf-objdump’s output to arrive at the right combination of flags
       to produce code that didn’t result in jumps to invalid instructions:
       
        > m68k-elf-gcc -g -o demo demo.s chars.c -nostdlib \
                 -fomit-frame-pointer -mno-rtd -m68000 -msoft-float -mpcrel
        > m68k-elf-objcopy -O binary demo floppy.img
       
       There are several things going on here, and I’m sure I can’t explain
       them all properly, but basically, I’m making sure that gcc is
       producing code that is safe for a pure 68000 CPU (without FPU or MMU)
       and uses PC-relative addressing.
       
       (It turns out this part was more complicated than it seemed at first,
       as gcc was producing absolute address jumps when linking to newlib.
       I ended up addressing this another way with a new linker script and
       memory allocation method. I will try to spend some time describing
       this in part 3.)
       
        ### Output
       
       Since I had already started on the output side with my framebuffer
       graphics demo, the next logical step was to begin thinking of ways to
       get text onto the Mac’s screen. One of the things I use my real Mac
       Plus for is a serial terminal using ZTerm, but even with the smallest
       available font, the maximum terminal size is smaller than I’d prefer.
       So now that I have control over the complete display, (no system menu
       or windowing elements) I want to choose a condensed font that will
       make the most of the Mac’s meager 512x342 resolution.
       
        ### A Condensed Font
       
       After some searching, I came across Christian Neukirchen’s 5x13 font.
       This font seemed like the right mix of efficient and readable. It
       should yield an effective terminal size of 102x26 (512 / 5 = 102 and
       342 / 13 = 26, both rounded down), more than adequate for my needs.
       
       I used bdfe to convert the .bdf font data into a C header file
       suitable for use with my gcc project. bdfe generated each printable
       ASCII character in order (through code 127), making it easy to map
       ASCII codes to font data. (bdfe exported the data as 5x16 arrays,
       and I simply ignore the last 3 lines of pixel data.)
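
       The ASCII-to-glyph mapping can be sketched in a few lines of C.
       This is only an illustration under stated assumptions: I am
       guessing that the table starts at the space character (code 32),
       that each glyph occupies 16 bytes with one byte per pixel row,
       and that the table is a flat byte array. The names (glyph_rows,
       GLYPH_STRIDE and friends) are hypothetical and not bdfe's.

       ```c
       #include <assert.h>
       #include <stddef.h>
       #include <stdint.h>

       enum {
           GLYPH_FIRST  = 32,   /* assumed first code in the table    */
           GLYPH_LAST   = 127,  /* last code in the table             */
           GLYPH_STRIDE = 16,   /* bytes stored per glyph             */
           GLYPH_ROWS   = 13    /* rows drawn; the last 3 are ignored */
       };

       /* Return a pointer to the first pixel row of the glyph for c.
        * Codes outside the table fall back to the space glyph. A
        * caller draws only the first GLYPH_ROWS bytes it points at. */
       const uint8_t *glyph_rows(const uint8_t *font, unsigned char c)
       {
           assert(font != NULL);
           if (c < GLYPH_FIRST || c > GLYPH_LAST)
               c = GLYPH_FIRST;
           return font + (size_t)(c - GLYPH_FIRST) * GLYPH_STRIDE;
       }
       ```

       Because the glyphs are stored in ASCII order, the lookup is a
       single subtract and multiply, with no search or index table.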
       
        ### Character Routine
       
       In order to render the font data on screen, I began by adapting my
       previous framebuffer memcopy loop for this font data and started
       working on the x/y arithmetic for the “terminal”. The first version
       of this routine wrote character data to the display at byte
       boundaries for simplicity, making for an effective terminal size of
       only 64x26. (512 / 8 = 64) This meant my nice 5x13 font had generous
       spacing.
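
       In C, that first byte-aligned version might look roughly like
       the sketch below. It is an illustration and not the assembly
       from the repository: the name draw_char_aligned is hypothetical,
       and I am assuming each glyph row is one byte with its 5 pixels
       in the high bits.

       ```c
       #include <assert.h>
       #include <stdint.h>

       enum {
           ROW_BYTES  = 64,   /* 512 pixels / 8 bits per byte */
           GLYPH_ROWS = 13,   /* pixel rows per character     */
           TERM_COLS  = 64,   /* one character per byte       */
           TERM_ROWS  = 26    /* 342 / 13, rounded down       */
       };

       /* Draw one glyph in text cell (col, row). Each character owns
        * a whole byte of every pixel row, so it is a plain store. */
       void draw_char_aligned(uint8_t *fb, const uint8_t *glyph,
                              int col, int row)
       {
           assert(col >= 0 && col < TERM_COLS);
           assert(row >= 0 && row < TERM_ROWS);

           uint8_t *dst = fb + row * GLYPH_ROWS * ROW_BYTES + col;
           for (int y = 0; y < GLYPH_ROWS; y++) {
               *dst = glyph[y];
               dst += ROW_BYTES;   /* same column, next pixel row */
           }
       }
       ```

       The store needs no shifting or masking, which is what makes
       this version simple, at the cost of 3 unused pixels per cell.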
       
       In order to condense this down, I needed some new arithmetic,
       bit-shifting and logical or’ing tricks to get these characters to
       “share” bytes nicely. Again, the result is a 102x26 character
       terminal. Here’s a shot of that condensed terminal filled up with a
       pangram for illustration:
       
       The resulting draw_char routine is a fairly efficient (I’m sure it
       could be more so) bit of assembly that draws an ASCII character at
       a given X/Y location:
       
 (TXT) draw_char code listing
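
       For readers who prefer C, the byte-sharing idea can be sketched
       as follows. This is not a translation of the assembly listing,
       just one way to do the shifting and masking. The name
       draw_char_packed is hypothetical, and I am assuming each glyph
       row is one byte with its 5 pixels in the high bits.

       ```c
       #include <assert.h>
       #include <stdint.h>

       enum {
           ROW_BYTES  = 64,    /* 512 pixels / 8           */
           GLYPH_W    = 5,     /* pixels per character     */
           GLYPH_ROWS = 13,    /* pixel rows per character */
           TERM_COLS  = 102,   /* 512 / 5, rounded down    */
           TERM_ROWS  = 26     /* 342 / 13, rounded down   */
       };

       /* Draw one glyph in text cell (col, row), where cells are only
        * 5 pixels wide and so may straddle two framebuffer bytes. */
       void draw_char_packed(uint8_t *fb, const uint8_t *glyph,
                             int col, int row)
       {
           assert(col >= 0 && col < TERM_COLS);
           assert(row >= 0 && row < TERM_ROWS);

           int x     = col * GLYPH_W;   /* leftmost pixel of the cell  */
           int shift = x & 7;           /* bit offset in the first byte */
           /* 5-pixel window positioned inside a 16-bit pair of bytes */
           uint16_t mask = (uint16_t)(0xF800u >> shift);
           uint8_t *dst  = fb + row * GLYPH_ROWS * ROW_BYTES + (x >> 3);

           for (int y = 0; y < GLYPH_ROWS; y++, dst += ROW_BYTES) {
               uint16_t bits =
                   (uint16_t)(((glyph[y] & 0xF8u) << 8) >> shift);
               /* Clear the cell's pixels, then OR the glyph in, so
                * neighbouring characters in the byte are untouched. */
               dst[0] = (uint8_t)((dst[0] & ~(mask >> 8)) | (bits >> 8));
               if (mask & 0xFFu)   /* cell spills into the next byte */
                   dst[1] = (uint8_t)((dst[1] & ~(mask & 0xFFu))
                                      | (bits & 0xFFu));
           }
       }
       ```

       The second byte is only touched when the cell actually crosses
       a byte boundary, so the last column (pixels 505 to 509) never
       writes past the end of its row.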
       
        ### The Strings Section
       
       What good is a draw_char without a draw_string? The next obvious step
       was to implement a routine that would print a string to a given X/Y
       location using a pointer to a null-terminated string:
       
 (TXT) draw_string code listing
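
       The same loop is easy to sketch in C. Again, this is only an
       illustration and not the assembly from the listing. The
       draw_char here is a stand-in that records characters in a
       102x26 grid so the sketch is self-contained, and the line
       wrapping is my own assumption about what a terminal would want.

       ```c
       #include <assert.h>
       #include <stddef.h>

       enum { TERM_COLS = 102, TERM_ROWS = 26 };

       /* Stand-in "screen": one char per text cell (hypothetical). */
       char cells[TERM_ROWS][TERM_COLS];

       /* Stand-in for the real draw_char routine. */
       void draw_char(char c, int col, int row)
       {
           assert(col >= 0 && col < TERM_COLS);
           assert(row >= 0 && row < TERM_ROWS);
           cells[row][col] = c;
       }

       /* Draw a null-terminated string starting at (col, row).
        * Wraps at the right edge and stops at the bottom of the
        * screen. Returns the number of characters drawn. */
       int draw_string(const char *s, int col, int row)
       {
           assert(s != NULL);
           int drawn = 0;
           while (*s != '\0') {
               if (col >= TERM_COLS) {   /* wrap to the next line */
                   col = 0;
                   row++;
               }
               if (row >= TERM_ROWS)     /* ran off the bottom */
                   break;
               draw_char(*s++, col++, row);
               drawn++;
           }
           return drawn;
       }
       ```

       Returning the count lets a caller notice when a string was cut
       off at the bottom of the screen.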
       
        ### Next Steps
       
       Ok, so this seems like as good a place as any to pause and summarize.
       This represents a fair amount of work (especially for someone who
       hadn’t written 68k assembly before this project) and a necessary step
       toward my goal of doing anything really useful. In the next
       installment, I’ll describe some of the challenges in getting GCC to
       generate relocatable code, along with my efforts to link these output
       routines to newlib’s stubs for terminal output, which allows us to use
       handy functions like puts().
       
       …so stay tuned for part 3:
       
        - GCC linker scripts and relocatable code
        - Mixing C and assembly (calling conventions!)
        - Newlib port/implementation
        - Keyboard input routine?
       
       And in the meantime, please feel free to check out the repository
       and join in the fun. ;)
       
       ---
       [1] https://gist.github.com/kmcallister/3236565ed7eb7b45cf99
       [2] http://getchip.com/
       [3] http://gnutoolchains.com/
       [4] https://github.com/jrsharp/HappyJon