[HN Gopher] Abusing DuckDB-WASM by making SQL draw 3D graphics (...
___________________________________________________________________
Abusing DuckDB-WASM by making SQL draw 3D graphics (Sort Of)
Author : tanelpoder
Score : 160 points
Date : 2025-04-22 13:35 UTC (9 hours ago)
(HTM) web link (www.hey.earth)
(TXT) w3m dump (www.hey.earth)
| mritchie712 wrote:
| This is a DuckDB feature that's incredibly hard for Snowflake (or
| anyone else) to copy. Running the same database client-side
| (WASM) and server-side can make for a pretty magical experience.
|
| Queries that normally take 1s to 2s can run in 25ms, so you get
| under the "100ms rule" which is very uncommon in analytics
| applications.
|
| We DuckDB server side and have experimental support for DuckDB
| WASM on the client-side at https://www.definite.app/ and
| sometimes I don't trust that a query ran because of how fast it
| can happen (we need some UX work there).
| esafak wrote:
| How does that work? Does the client clone the database at the
| beginning of the session and work with a shapshot? If so, do
| you automatically and periodically sync it?
| randomtoast wrote:
| With HTTP Range Requests, which is typically used for pausing
| and resuming large file downloads, to request specific byte
| ranges from the file. This allows you to retrieve only the
| data you need. With SQL indexes, the data returned will be
| minimal because the lookup is optimized. However, if you
| select *, you will still end up downloading the entire
| database.
| jasonjmcghee wrote:
| Parent comment isn't asking how data is requested from the
| back-end.
|
| GP comment is (seemingly) describing keeping an entirely
| client side instance (data stored locally / in memory)
| snapshot of the back-end database.
|
| Parent comment is asking how the two are kept in sync.
|
| It's hard to believe it would be the method you're
| describing and take 25ms.
|
| If you're doing http range requests, that suggests you're
| reading from a file which means object storage or disk.
|
| I have to assume there is something getting triggered when
| back end is updating to tell the client to update their
| instance. (Which very well could just be telling it to
| execute some sql to get the new / updated information it
| needs)
|
| Or the data is entirely in memory on the back end in an in
| memory duckdb instance with the latest data and just needs
| to retrieve it / return it from memory.
| immibis wrote:
| Doesn't that mean you have way more round-trips than
| necessary? Instead of asking for the row, you ask for the
| file header, the list of tables and indices, an index page,
| another index page, another index page, and a table page?
| mritchie712 wrote:
| Yes, we're still fine-tuning exactly what we cache, but a
| simple example would be:
|
| 1. user writes a `select` statement that return 20k records.
| We cache the 20k.
|
| 2. user can now query the results of #1
|
| we're also working on more complex cases (e.g. caching
| frequently used tables).
| xnx wrote:
| Impressive project, the subhead might attract even more
| attention: "Building a SQL-Powered Doom Clone in the Browser"
| robertclaus wrote:
| This is great! I did a similar project a while back to do image
| processing in a SQL database with pixels being individual
| records. It's amazing what SQL can do with the right table
| structures.
| adornKey wrote:
| Finally somebody did it! Back in the day my attempts to write a
| game in SQL were thwarted by buggy query-optimizers. They cached
| my calls to rand() way too often although documentation promised
| not to do that.
| dspillett wrote:
| _> They cached my calls to rand() way too often although
| documentation promised not to do that._
|
| For some DBs (SQL Server definitely), RAND() and similar are
| handled as if they are deterministic and so are called once per
| use. For instance: SELECT TOP 10 RAND() FROM
| sys.objects SELECT TOP 10 RAND() FROM sys.objects
|
| just returned ten lots of 0.680862566387624 and ten lots of
| 0.157039657790194. SELECT TOP 10 RAND(),
| RAND(), RAND()-RAND() FROM sys.objects
|
| returns a different value for each column (0.451758385842036 &
| 0.0652620609942665, -0.536618123021777), so the optimisation is
| per use not per statement or even per column (if it were per
| column that last value would be 0, or as close to as floating
| point arithmetic oddities allow).
|
| This surprises a lot of people when they try "... ORDER BY
| RAND()" and get the same order on each run.
|
| One workaround for this is to use a non-deterministic function
| like NEWID(), though you need some extra jiggery-pokery to get
| a 0<=v<1 value to mimic rand: SELECT TOP 10
| CAST(CAST(CAST(NEWID() AS VARBINARY(4)) AS BIGINT) AS
| FLOAT)/(4.0*1024*1024*1024) FROM sys.objects
|
| For the example of sorting, the outer cast is not needed. You
| might think just using "ORDER BY NEWID()" would be sufficient,
| but that is an undefined behaviour so you shouldn't rely upon
| it. It might work now, a quick test has just worked as expected
| here, but at any point the optimiser could decide it is more
| efficient to consider all UUIDs as having the same weight for
| sorting purposes.
| nonethewiser wrote:
| Given the first post in the blog says "not made by a [ROBOT
| EMOJI]", should I assume this one which does not have this
| message, is made by a [ROBOT EMOJI]?
|
| https://www.hey.earth/posts
| NitpickLawyer wrote:
| I swear we're gonna start seeing disclaimers like "100%
| handcrafted code, our devs eat only grass-fed beef, free-range
| devops teams, specialty coffee sustained QA department, no
| IDEs, no intelisense, we code in notepad++" soon...
| andhuman wrote:
| 100% organic!
| bstsb wrote:
| it's the footer at the bottom of all pages. it's also present
| on the blog pages
| marcellus23 wrote:
| This one also has that same footer.
| pjot wrote:
| Ha! I made this. I'm not a robot either :)
| enescakir wrote:
| Like running Doom on a printer, but now it's in the same engine
| powering your BI dashboards. Peak 2025 energy.
| cess11 wrote:
| Nice project. Reminds me of one of my favourite demos, a MySQL
| raytracer:
|
| https://www.pouet.net/prod.php?which=83222
| datadrivenangel wrote:
| Abusing databases is the way. I'm building a SQLite agent using
| triggers.
| intalentive wrote:
| I've been using triggers as FTS boilerplate for so long I
| didn't pause to think that they're just event listeners. Say
| more about your project..?
| dkga wrote:
| Very interesting!
|
| You know it gets wild when you read "... Here's the core of the
| raycasting algorithm in SQL"!
| gitroom wrote:
| i think this is super wild honestly, cant believe sql is doing
| graphics now
| karmakaze wrote:
| I'd like to see something like this done in SpacetimeDB which was
| made specifically for game backends. I haven't looked into it
| yet, only seen the 1.0 announcement on HN and in my YT feed, and
| curious how its feature set makes this sort of thing easier or
| more natural.
| pjot wrote:
| Author here, wild to see this at the top of HN!
|
| You can play it here: https://patricktrainer.github.io/duckdb-
| doom/
|
| Pressing "L" enables (very) verbose logging in the dev console
| and prints much of the sql being executed.
| r3tr0 wrote:
| We use duck db wasm to make live system performance dashboards
| based on eBPF.
|
| It really is magic!
|
| You can check it out here.
|
| https://yeet.cx/play
| dndn1 wrote:
| Neat UI, are you using a library for that?
| r3tr0 wrote:
| nope. everything off shelf was too slow.
| robertsdionne wrote:
| https://x.com/geocucu_t/status/1909291486367166717
| kevingadd wrote:
| > But because tick() and render() involved async database calls,
| sometimes a new interval would fire before the previous one
| finished.
|
| This is a tricky one when writing games using async APIs. The
| game I've been working on is written in C# but I occasionally hit
| the same issue when game code ends up needing async, where I have
| to carefully ensure that I don't kick off two asynchronous
| operations at once if they're going to interact with the same
| game state. In the old days all the APIs you're using would have
| been synchronous, but these days lots of libraries use
| async/await and/or promises and it kind of infects all the code
| around it.
|
| It does depend on the sort of game you're building though. Some
| games end up naturally having a single 'main loop' you spend most
| of your time in, i.e. Doom where you spend all your time either
| navigating around the game world or looking at a pause/end-of-
| stage menu - in that case you can basically just have an
| is_menu_open bool in your update and draw routines, and if you
| load all your assets during your loading screen(s), nothing ever
| needs to be async.
|
| Other games are more modal, and might have a dozen different
| menus/scenes (if not hundreds), i.e. something like Skyrim. And
| sometimes you have modals that can appear in multiple scenarios,
| like a settings menu, so you need to be able to start a modal
| loop in different contexts. You might have the player in a
| conversation with an NPC, and then during the conversation you
| show a popup menu asking them to choose what to say to the NPC,
| and they decide while the conversation menu is open they want to
| consult the conversation log, so you're opening a modal on top of
| a modal, and any modal might need to load some assets
| asynchronously before it appears...
|
| In the old days you could solve a lot of this by starting a new
| main loop inside of the current one that would exit when the
| modal went away. Win32 modal dialogs work this way, for example
| (which can cause unpleasant re-entrant execution surprises if you
| trigger a modal in the wrong place). I'm still uncertain whether
| async/await is a good modern replacement for it.
| vd2287 wrote:
| I'm a bit new to this stuff, but SQL and 3D GRAPHICS???
___________________________________________________________________
(page generated 2025-04-22 23:00 UTC)