[HN Gopher] BBC BASIC raytracer in 432 characters
___________________________________________________________________
BBC BASIC raytracer in 432 characters
Author : neffo
Score : 184 points
Date : 2024-01-17 03:53 UTC (19 hours ago)
(HTM) web link (mastodon.me.uk)
(TXT) w3m dump (mastodon.me.uk)
| LeoPanthera wrote:
| In the replies underneath, someone ran it on a real Acorn
| Electron computer (6502 at 1 MHz), where it completed in 8 hours
| and 40 minutes. Not too bad really.
| kragen wrote:
| that's astounding; i would have guessed months or years
| LeoPanthera wrote:
| BBC BASIC is surprisingly efficient for an interpreted
| language. It kind of had to be, running on processors that
| slow.
| KerrAvon wrote:
| Some good discussion of that on The BBC BASIC wiki entry:
| https://en.wikipedia.org/wiki/BBC_BASIC (paragraph
| beginning "Due to a number of optimizations[...]".
| crimbles wrote:
| Also you could swap out hot spots with assembly trivially
| as it had an inline assembler. And you had indirect
| pointers in it!
| mavhc wrote:
| The OP's code has REM based assembly it in to save space
| kragen wrote:
| is that assembly or is that the dithering table
| alphabetter wrote:
| Yes, the REM is the lookup table used in line 70.
|
| Because BBC Basic had a built-in assembler it was pretty
| uncommon for BBC programs to inline machine code as raw
| data (unlike some other computers from BITD).
| londons_explore wrote:
| just a table of constants.
| kragen wrote:
| all the basics i used on processors that slow were
| surprisingly inefficient, even for an interpreted languages
|
| i thought they had to be, running on processors with that
| little memory. though later on i learned about forth, which
| _is_ surprisingly efficient for an interpreted language
|
| a more likely explanation is that sophie wilson was just a
| better hacker than bill gates and paul allen
| iainmerrick wrote:
| Yeah, BBC BASIC is _really_ good. Both the language
| design and the implementation.
|
| I loved it at the time, and the more I think back on it
| the more impressed I am. Like it had a very decent suite
| of floating-point routines, which if I remember right
| were very performant. In a 32KB ROM!
| forinti wrote:
| Basic was actually a 16KB ROM. MOS occupied another 16KB
| ROM.
| crq-yml wrote:
| Some of the improvement in BBC BASIC over Microsoft's
| BASIC is attributable to the larger ROM(the first Altair
| BASIC was just 4k), but it's also that the BBC Micro's
| 6502 and DRAM was clocked higher than contemporaries.
| It's just a faster, more refined 8-bit machine all
| around.
| bunabhucan wrote:
| I recreated a mandlebrot zoom I saw in a library book in bbc
| basic and the last image took two weeks. I had 8 of them
| saved on a floppy and that would slideshow them for a 2fps
| "movie"
| kragen wrote:
| that is fucking awesome, you are a legend
| PlunderBunny wrote:
| I'm running this on a BBC micro emulator running at period-
| accurate speed, and it's taking just under a minute to draw one
| line. So I think it's going to finish in about 4 hours, which
| would make sense - approximately twice as fast as the Acorn
| Electron.
| PlunderBunny wrote:
| Update: Finished in only 2 hours 30 minutes.
| tom_ wrote:
| The Electron runs extra slowly in MODE 1 as the video
| system needs 100% of the RAM bandwidth to form the display.
| The CPU essentially ends up paused during the visible
| portion of each scan line, so the effective speed is
| probably more like 0.6 MHz.
| Sharlin wrote:
| I wonder how much of that it spent just computing those square
| roots.
| araes wrote:
| Does anybody happen to know what SQR algorithm BBC Basic
| uses? http://www.bbcbasic.co.uk/bbcwin/manual/bbcwin7.html#pr
| intha... does not seem to have the actual algorithm.
|
| Intel's optimization manual has suggestions for fast versions
| at 15.12.3 (recommended by @James
| https://stackoverflow.com/a/2637823/10981777) Don't look
| especially long to implement for 22-bit approximation.
|
| https://software.intel.com/content/www/us/en/develop/downloa.
| ..
|
| Also, what's the SQRD? Not finding that referred to anywhere.
| Sharlin wrote:
| SQRD is just SQR D :)
| araes wrote:
| Thanks. So much more of that code makes sense now. IFD,
| SGNU, FORN, FORM. FORM especially. So many codes now have
| the word FORM as a special word.
|
| What a nightmare to read though if you don't know what
| it's supposed to be written like. Is it a: "FORM=" or a
| "FOR M=" ? Is it: "S GNU:G", "SG NU:G", "SGN U:G"? SQRD
| totally looks like some special kind of square root
| implementation.
| Sharlin wrote:
| Yeah, early BASICs tended to let you skip the spaces, and
| in fact it sometimes did have a measurable effect on
| parsing/execution speed. And of course on file size
| too... Additionally, at least C64 Basic had two-character
| shorthands for all keywords.
| kragen wrote:
| this says `goto 50` but there are no line numbers
| LeoPanthera wrote:
| The robot automatically adds line numbers in increments of 10s.
| EspadaV9 wrote:
| If you go to the source link - https://bbcmic.ro/?t=9ctpk -
| then you can see the numbers with the lines. They increment in
| steps of 10 (which IIRC you could override manually with
| numbers in-between).
| PlunderBunny wrote:
| If you're playing with this on a BBC micro emulator, the best
| way to get this code into the emulator is to download the
| disk image from the above link. The actual basic program is
| named "PROGRAM" on the disc. List the code by typing LIST20,
| (the computer will lock up if you try to list line 10).
|
| If you just want to see the result, type MODE 1 and then
| press return, then type *LOAD SCREEN and press return again.
| kragen wrote:
| thanks! i didn't see that link. but i think the 432 bytes
| leaves the line numbers out, so it should actually be 448
| bytes
| flohofwoe wrote:
| I guess that's a philosophical question.
|
| In most home computer BASIC interpreters, the source code
| (including line numbers) never exists as text in memory or
| on tape/disk, only as binary tokens. The only place where
| the original input text exists is a small line buffer which
| gets translated into binary token form when pressing
| Enter/Return.
|
| The LIST command translates the token form back into human
| readable text.
|
| And if you use 'AUTO' you also never type any line
| numbers... so are they actually part of the source code? Is
| there even 'source code' when the original source is never
| stored as text anywhere? Is the 'source code size' then the
| size of the original text data (that actually never exists
| a whole), or the size of binary token representation in
| memory and on tape/disk? :)
| kragen wrote:
| sure, you might want to count the tokenized
| representation instead of the printed representation, so
| maybe the real number is something smaller than 432, but
| it contains the line numbers too. you definitely don't
| want to count the 432 bytes of printed representation and
| leave out the 16 bytes of line numbers because the
| program won't run without them
| ManuelKiessling wrote:
| Ok, this is probably a dumb question: I get that the code traces
| the rays for this scene, obviously -- but where is the scene,
| then? That is, where is the "data" (as opposed to the algorithm)
| that says "there's a sphere at coordinates x,y,z with radius r,
| and here's the other one, and here's the checkerboard plane"?
| sp332 wrote:
| W=1/SQR(U*U+V*V+1)
|
| This makes a sphere. It gets multiplied by two coordinates to
| make spheres at different places, U and V.
| kragen wrote:
| i think this is incorrect, i think that's computing the z
| component of the initial direction vector of the ray being
| traced
|
| if you change the initialization of i from sgn u to 1.2 * sgn
| u you will get an image with the spheres a little further
| apart
| Sharlin wrote:
| Yeah, that's something of a red herring. The actual ray-
| sphere intersection happens on line 50:
| // sphere_center = (E, F, Z) E = X - I F =
| Y - I // solving the ray-sphere quadratic eqn...
| P = U * E + V * F - W * Z // discriminant D
| = P * P - E * E - F * F - Z * Z + 1 // if solution
| (ie. intersection) exists IF D > 0 //
| intersect_point = ray_orig + T * ray_dir // this
| is the closer pt, -P + SQR D is the other T = -P
| - SQR D
| kragen wrote:
| yeah, that was what i guessed it was doing, but i
| couldn't remember or rederive the ray-sphere intersection
| formula to be sure
| Sharlin wrote:
| It's mostly implicit. See how the spheres are symmetric about
| the centerpoint? (Well, almost, the camera is at Y = -0.1.) The
| (X, Y) coordinates of their centers are simply (-1, -1) and (1,
| 1), given by the `I = SGN U` variable at the end of line 40.
| The Z is implicitly zero. Their radius is 1. The plane is
| basically defined by `P = Y - 2` at line 60.
| Sharlin wrote:
| ...to elucidate a bit, when tracing the rays on the left side
| of the image, the sign of u is -1, so we're trying to
| intersect a sphere at (x-1, y-1, z), and on the right side
| similarly one at (x+1, y+1, z). This works because none of
| the left-side rays can even in theory hit the right-side
| sphere and vice versa. It's a very simple version of space
| partitioning optimization. And when tracing the reflected
| rays, we flip the sign of i, because there's no self-
| reflection - the left-hand sphere can only reflect the right-
| hand one and vice versa.
| kragen wrote:
| these comments were extremely helpful, thank you
| Sharlin wrote:
| No problem, this was a fun puzzle to solve =D
| flohofwoe wrote:
| The 3D scene is described by formulas instead of data. It's a
| bit similar to how SDF demos describe a 3D scene by combing
| shape formulas into a single formula for the whole scene (SDF =
| signed distance fields).
|
| E.g. see:
|
| https://iquilezles.org/articles/distfunctions/
| kragen wrote:
| while i know you didn't say this was using an sdf, i want to
| point out that it doesn't seem to be using an sdf, just in
| case anyone is wondering; it seems to be a conventional
| whitted-style raytracer that computes sphere-ray
| intersections in closed form by solving the quadratic
| kragen wrote:
| sharlin has explained how the sphere coordinates are computed;
| the checkerboard plane is implemented by converting downward-
| pointing vectors (where the y-component v is negative) into
| upward-pointing vectors toward a point in the sky with the
| right color IF V < 0 then P = (Y + 2)/V : V =
| -V*((INT(X - U*P) + INT(Z - W*P) AND 1)/2 + .3) + .2
|
| then the (palette index of?) the color of the sky is computed
| from the square root of the y-coordinate of the direction
| vector as GCOL 0, 3 - (48*SQR V) DIV 16
|
| plus a fixed dithering table
| ChrisArchitect wrote:
| This one was from 2021.
|
| Do enjoy the insane things ppl managed to cram into tweets to the
| bbcmicrobot over the years. Anyone who's followed any part of the
| demoscene especially the classic smaller payload demos.. 4k
| etc.....this kinda thing is mindblowing and totally in the same
| spirit.
| po wrote:
| Also, don't miss the gallery site! https://www.bbcmicrobot.com
| neffo wrote:
| the gallery site is amazing in and of itself. the gallery isn't
| PNGs or GIFs - it's generated in the browser from the emulator
| state. if you hover over an item it resumes the emulator.
|
| https://mastodon.me.uk/@bbcmicrobot/111760484438439751
| jug wrote:
| This effort is truly specular!
| rwmj wrote:
| https://www.pouet.net/prod.php?which=78045
|
| Incredible raytraced demo in 256 bytes.
| briansm wrote:
| https://www.pouet.net/prod.php?which=53871
|
| https://www.youtube.com/watch?v=36BPql6Nl_U
|
| Incredible raytraced demo in 128 bytes. (An underwater journey
| through a Menger-sponge fractal)
| kragen wrote:
| last year i wrote a tetris in arm assembly
| https://asciinema.org/a/622461 and was pleased to get it down
| below 1024 bytes
|
| then i looked and rrrola's 4is256
| https://www.pouet.net/prod.php?which=29286 is a working tetris
| in under 256 bytes, and it's better than mine because it has
| colors and scoring. i could blame a little bit of inflation on
| arm being 32-bit rather than 16-bit, but not 4x
| bbojan wrote:
| You could try using Thumb instructions, they are 16 bit.
| kragen wrote:
| i was, but i wasn't talking about the size of the
| instruction word, but the size of the registers
| userbinator wrote:
| The upper limit of instruction density on x86 is much higher
| than any RISC.
| kragen wrote:
| that hasn't been my experience with instructions i've
| written myself, but then again, i'm not rrrola
| SillyUsername wrote:
| Can anybody provide de-obfuscated/minified source so those of us
| who understand the principle (tracing rays of light around a
| scene) can understand the maths?
| kragen wrote:
| here's my attempt 10 REMAECG$C*Ec!ae-e'd
| rem dithering table in comment above consulted by GCOL line?
| rem raytracer from https://bbcmic.ro/?t=9ctpk by
| https://mastodon.me.uk/@coprolite9000 rem
| deobfuscated and possibly broken by kragen javier sitaker
| (untested) 20 MODE 1 VDU 5 30 FOR N
| = 8 TO 247 : rem loop over rows (n) and columns (m) of
| pixels FOR M = 0 TO 319 40 X = 0
| : rem three-dimensional vector Y = -.1
| Z = 3 U = (M - 159.5)/160 : rem xform
| screen coords to direction vector V = (N -
| 127.5)/160 W = 1/SQR(U*U + V*V + 1) : rem
| normalize direction vector U = U * W
| V = V * W I = SGN U : rem derive
| coordinate of sphere center (i, i, 1)? G = 1
| 50 E = X - I : rem beginning of do-while loop
| F = Y - I P = U*E + V*F - W*Z
| D = P*P - E*E - F*F - Z*Z + 1 IF D <= 0
| then goto 60 T = -P - SQR D
| IF T <= 0 then goto 60 X = X + T*U
| Y = Y + T*V Z = Z - T*W
| E = X - I F = Y - I
| G = Z P = 2*(U*E + V*F - W*G)
| U = U - P*E V = V - P*F
| W = W + P*G I = -I : rem if we
| are reflecting then check other sphere now
| GOTO 50 : rem end of do-while loop, we have our escape from the
| scene geometry rem color according to
| y-component of direction vector to get sky gradient
| rem if instead y component is negative, make a checkerboard of
| pieces of sky 60 IF V < 0 then P = (Y + 2)/V : V
| = -V*((INT(X - U*P) + INT(Z - W*P) AND 1)/2 + .3) + .2
| 70 GCOL 0, 3 - (48*SQR V + ?(PAGE + 5 + M MOD 4 + N MOD
| 4*4)/3) DIV 16 80 PLOT 69, 4*M, 4*N
| NEXT,
|
| i'm interested to hear what i got wrong
|
| see also https://news.ycombinator.com/item?id=39026517 and
| https://news.ycombinator.com/item?id=39026495
| Sharlin wrote:
| > rem if instead y component is negative, make a checkerboard
| of pieces of sky
|
| I'd maybe phrase this a bit differently - in both cases it
| uses the v coordinate to calculate the shade but whereas the
| sky is a simple gradient, the plane has a "ground fog" type
| effect that can be adjusted with those +.3 and +.2 magic
| constants.
| kragen wrote:
| i was trying to figure out how the ground fog happens, and
| i still don't
| Sharlin wrote:
| The value of v corresponds to how far away the
| intersection point is. At the horizon v=0 so the color
| value is just constant 0.2. The closer to the camera, the
| more the checker value (0 or 1) contributes. The exact
| values are probably chosen experimentally, but note the
| important property that -v*(checker/2+0.3)+0.2 is always
| between 0 and 1. Note also that 1 means black and 0 white
| here, as ultimately the value is negated at line 70.
| kragen wrote:
| a thing that puzzles me is how https://bbcmic.ro/?t=9ctpk is only
| 4x faster in emulation (about 30 seconds per scan line and so on
| the order of 2 hours for the whole image)
|
| i'm running this on a ryzen 5 3500u at 2400 megahertz. the acorn
| electron which supposedly takes 8 hours and 40 minutes is a 1
| megahertz 6502 when running from ram, roughly, 262144
| instructions per second. at 2 ipc one core of the ryzen should be
| about 4800 mips. if the emulation slowdown is 10x (10 host
| instructions to emulate one guest instruction), which is typical
| for naive interpretation, it should be about 1000x faster on this
| laptop as on the original hardware
|
| (possibly the basic is in rom and so it's closer to 524288
| instructions per second)
|
| emulation through qemu-like dynamic code generation should cut
| the 10x slowdown to 3x, so it should be 3000x faster than the
| original hardware
|
| where did that factor of 1000 go? not the javascript engine,
| surely?
|
| incidentally there is a rocket-ship button underneath the output
| screen image labeled 'send to beebjit' which runs the program in
| about a second
| flohofwoe wrote:
| If the emulator is entirely cycle correct for the entire system
| (including the video system, which is usually the most
| expensive to emulate, not the CPU), then the speedup on modern
| computers will be much less than just comparing the clock
| frequencies.
|
| Consider that each chip of a home computer system (CPU, 1..2
| IO/timer chips, audio, video, ...) needs to do 'emulation work'
| for each 1 or 2 MHz clock cycle which can add up to quite a
| number of host computer instructions (dozens to hundreds).
|
| If each chip emulator just takes around 10..20 host system
| clock cycles to emulate one emulator clock cycle, then you are
| already looking at around 100 host system clock cycles per
| emulated clock cycle for the entire home computer (in reality
| it's probably worse).
|
| Such 'vertically sliced' emulation code is also full of
| conditional branches which put a heavy burden on the host CPU
| branch predictor.
|
| ...and that's how a theoretical 1000x speedup suddenly becomes
| a 10x speedup, it's not the CPU emulation (this is usually
| cheap) but the rest of the emulated system which can be
| expensive.
|
| Different emulators use all sorts of tricks and shortcuts, but
| usually with tradeoffs like less precise emulation, or less
| 'compartmentalized' code.
|
| PS: case in point this is just the top-level per-cycle function
| in my C64 emulator, which in turn calls per-cycle-functions in
| each of the chip emulators (which may each be just as much
| code):
|
| https://github.com/floooh/chips/blob/9a7f6d659b5d1bbf72bc8d0...
|
| I'm trying to strike a balance between 'purity' (e.g. an entire
| emulated system can be 'plugged together' by wiring together
| chip emulators, just like one would build a real 8-bit computer
| on a breadboard), while still being fast enough to comfortably
| run in realtime even in browsers
| (https://floooh.github.io/tiny8bit/c64.html).
|
| It's definitely possible to implement a faster cycle-correct
| C64 emulator with a less 'pure' architecture, but it's quite
| hard to do meaningful optimizations while keeping that same
| 'virtual breadboard' architecture.
|
| ...considering all the code that runs per cycle it's actually
| amazing how fast modern CPUs are :)
| kragen wrote:
| that's an excellent point, i hadn't considered that they
| might be running this program in basic on a cycle-correct
| emulator of the bbc micro. thank you very much for explaining
| londons_explore wrote:
| The rocket ship icon seems to have disappeared. Maybe I used it
| too much, or maybe the site owner removed it? (was it
| serverside and overloaded?)
| kragen wrote:
| i had that problem when i visited via a saved url in
| tinyurl.com
| tom_ wrote:
| The Acorn Electron is a very different design, and the BBC
| Micro runs measurably faster (particularly in this case due to
| the high res graphics). The bbcmic.ro emulation will almost
| certainly be pretty much exactly the same speed as real BBC
| Micro.
|
| The Firefox profiler suggests it's spending most of its time
| doing nothing, waiting for the next
| Window.requestAnimationFrame.
| londons_explore wrote:
| In 398 characters, I can give it error diffusion dithering
| capabilities, which makes the code smaller and the output quality
| a bit better:
|
| https://bbcmic.ro/#%7B%22v%22%3A1%2C%22program%22%3A%22MODE1...
|
| It would look even better with bidirectional error diffusion, but
| that requires reading back memory which I don't know how to do.
| logicallee wrote:
| Thanks for sharing. I asked ChatGPT to analyze your code, while
| telling it what it does. How well do you think it did? Did it
| make any mistakes?
|
| https://chat.openai.com/share/6e754fa1-531d-432e-a767-8c8132...
| bmsleight_ wrote:
| Wow - I never thought of using ChatGPT for analysing code. It
| reads well - but does the author agree ?
| CamperBob2 wrote:
| I just asked it to reformat the code for better readability,
| and it guessed it was a ray tracer. I for one welcome our new
| robot^H^H^H^H^Hparrot overlords.
|
| Also tried having it generate a JavaScript version, but the
| output doesn't look right: https://chat.openai.com/share/5d76
| e32b-81a1-4b4a-a808-7682a8... I'd be inclined to suspect the
| lack of line numbers, as a first guess towards the problem.
| aardvark179 wrote:
| It makes the output better in some ways, but since it's just 1
| dimensional and there isn't really any noise in the image it
| does produce a very stripy image on the portion I let it run
| for.
| qingcharles wrote:
| I wish there was an option to run at greater than realtime!
| anovikov wrote:
| Just curious how fast could the same thing be if someone who
| knows the hardware really well, wrote it in assembly.
| aardvark179 wrote:
| There's a bunch of things you can do. Replacing all the square
| roots with a lookup table might be doable, and you could make
| the final drawing faster by not using the generic plot
| routines.
|
| You don't have a lot of memory to play with though as you only
| have 32k to start with and the frame buffer will use 20k of
| that, so I'm not sure how good a look up table you could make
| for SQR.
| Sharlin wrote:
| In a subthread, I got nerdsniped to figure out how this works,
| and for fun transcribed it ~faithfully to Rust - this one just
| outputs ASCII art instead of plotting pixels. You'll want to zoom
| out your browser as far as you can:
|
| https://www.rustexplorer.com/b#Zm4gbWFpbigpIHsKICAgIC8vIHJvd...
| maxglute wrote:
| The dither effect looks really neat on ray trace, then I
| remembered thats just your average news paper photo. Still would
| be cool to see it in a game.
| mungoman2 wrote:
| Check out the game Return of the Obra Dinn for exactly this.
| mixedmath wrote:
| This is interesting and serves as a potential "hook" for getting
| people interested in coding. I understand the BASIC appeal!
|
| In something like python, an analogous thing is turtle graphics.
| Is there something more similar (but still very basic) in python
| for something like this? A mini graphical language?
| simonklitj wrote:
| Hmm, something like graphics.py maybe?
| https://mcsp.wartburg.edu/zelle/python/ppics2/code/graphics....
| mixedmath wrote:
| Yeah, something like that. It's hard to be both expressive
| and useful.
|
| This is a small set of primitives around a tkinter canvas. I
| suppose one could expose a small set of primitives around a
| pygame too.
|
| I have to think hard about what exactly I would want.
| miohtama wrote:
| Nice! Is there a commented source code?
___________________________________________________________________
(page generated 2024-01-17 23:02 UTC)