[HN Gopher] Advent of Code 2024 in pure SQL
___________________________________________________________________
Advent of Code 2024 in pure SQL
Author : greghn
Score : 203 points
Date : 2025-01-02 19:20 UTC (3 hours ago)
(HTM) web link (databasearchitects.blogspot.com)
(TXT) w3m dump (databasearchitects.blogspot.com)
| mbowcut2 wrote:
| Absolutely horrifying. Well done.
| teeray wrote:
| "Thanks, I hate it"
| sceadu wrote:
| for those who don't know, the author is one of the best DB
| researchers in the world
| beoberha wrote:
| Thomas Neumann doing Thomas Neumann things
| scop wrote:
| I reacted to this title the way I react a new menu item at Taco
| Bell: a strange mixture of desire, shame, and admiration for
| human ingenuity.
| qsort wrote:
| I work a lot with databases and I've seen... stuff. It's not as
| bad as you might think if you know what you are doing. Most
| RDBMSs support recursive CTEs, it feels like writing Prolog
| with a slightly sadistic syntax. For something like AoC the
| most difficult part is probably parsing the input.
| dunham wrote:
| Speaking of parsing, back around y2k we were building an app
| that used XML everywhere, which was the style at the time,
| and our DBA wanted to write an xml parser in SQL (the api
| would involve sending XML to the database). That got vetoed.
|
| IMO, this kind of thing is what AoC is good for - you get to
| play with weird/obscure stuff without affecting your day job
| code.
| jamie_ca wrote:
| I did something with JSON back before there was reasonable
| native support - it's certainly not robust, but it handled
| a few syntax variants for a use case where we had an extra
| attribute column that serialized JSON, and wanted to
| surface one of the fields as a proper column on the table.
|
| https://blog.tracefunc.com/2011/11/19/parsing-json-in-sql-
| ht...
| brettgriffin wrote:
| I'm as equally amazed by the solutions in this post's github
| repo as I am with Taco Bell's new chicken nuggets.
| larodi wrote:
| but why? what would make you react at human ingenuity with
| shame and desire? is this something about you or something
| about them in particular? isnt the whole of HN about human
| ingenuity...?
|
| are we to feel Taco Bell menu about it all, what am I missing?
| CommanderData wrote:
| I'm angry but in a good way.
| SoftTalker wrote:
| Over my career I've certainly written more SQL than any other
| type of code. Not so much in the last five years so I'm sure I've
| lost some of it, but I used to really enjoy it. Once you stop
| thinking iteratively and start thinking in set operations it
| becomes quite natural and powerful.
| fifilura wrote:
| I second all of that!
|
| I wish more people would see the beauty. After a session of
| SQL, when I take a step back and think.
|
| "Hold on. What I have been doing lately is just pure logic. No
| library dependency resolution, no concurrency problems (even
| though massive concurrency is certainly under the hood). No
| mutability issues. Just logic."
|
| SQL obviously has its warts, some of them serious, like
| testability. But at the end of the day, I wish all programming
| was like that. Let the computer decide how to do stuff under
| the hood. And let the human focus on the logic.
|
| I have somewhat half-assed tried to read up on Prolog for
| trying to take it to the next level, but failed sofar
| unfortunately. (It was also a goal to try to unlearn some SQL
| to avoid getting stuck in some local optimum). Maybe somewhere
| between SQL and Prolog is the future of programming.
| Rendello wrote:
| > Maybe somewhere between SQL and Prolog is the future of
| programming.
|
| Must be Datalog then ;)
| coliveira wrote:
| Prolog is very powerful, if you see what professionals can do
| with it it's eye opening. Unfortunately, it takes a complete
| relearning of programming to achieve that proficiency level.
| And after you reach it, you probably cannot use it in your
| day job...
| larodi wrote:
| > Maybe somewhere between SQL and Prolog is the future of
| programming.
|
| it was, it most probably is
| bob1029 wrote:
| Over the years I have been pushing more and more
| responsibilities into the RDBMS. I now see things mostly in
| terms of ETL, SQL and schema. Virtually every conversation I've
| ever had about the application of technology to the business
| could be expressed in these terms. Business logic defined as
| SQL queries can be quite intuitive when the schema is
| structured well and aligned to the business stakeholders'
| perspectives.
|
| Code, frameworks, ORMs, "best practices", patterns, et. al. are
| ultimately a distraction. There are a million ways to get the
| data in & out of the database. Moving the bits around is _not_
| valuable. There are many overblown software solutions out there
| that could have been a simple merge statement or CSV import
| job.
|
| I think that a lot of the misconceptions and bad moods about
| SQL come out of being forced to work with nasty schemas. The
| language itself is really domain specific. Consider that one
| wouldn't complain as much about a super fucked up nested query
| (and resulting SQL syntax woes) if it wasn't necessary to write
| such a query in the first place. Aligning tuples and relations
| to the way the business typically talks about them means you
| will be less likely to be fighting these things over time.
| Often, it isn't possible to refactor the schema from zero, but
| you can put replicas/views _around_ a "bad" schema and target
| it with your new development & refactors.
| fifilura wrote:
| > Consider that one wouldn't complain as much about a super
| fucked up nested query (and resulting SQL syntax woes) if it
| wasn't necessary to write such a query in the first place.
|
| And in "modern" SQL this is solved with CTEs. Use them to
| unwrap the schema the way you want it first, before
| implementing the business logic.
| crazygringo wrote:
| > _Once you stop thinking iteratively and start thinking in set
| operations it becomes quite natural and powerful._
|
| I dunno... I've written a tremendous amount of SQL, and I still
| have to think imperatively (iteratively) in order to write
| queries that are actually performant, and to know which indexes
| need to exist.
|
| It would be wonderful if I could just think in terms of set
| operations, but that tends to result in queries that take 5
| minutes to execute rather than 5 milliseconds.
|
| My entire thought process is basically -- what table do I start
| with, what rows in what order, joining to what, under what
| conditions, aggregating how, rinse and repeat... It's entirely
| a mental model of loops and aggregation, never of set
| operations.
| fifilura wrote:
| It may be true, until you do your ETL in an index-less
| database such as BigQuery or Trino. Postgres will always be
| faster for optimized, end user serving, queries.
|
| But BigQuery allows you to scale it to 100s of CPUs without
| having to worry about indexes.
| crazygringo wrote:
| Yes, I'm talking about end user queries. Not reports that
| take 2 hours to run.
|
| But even with BigQuery, you've still got to worry about
| partioning and clustering, and yes they've even added
| indexes now.
|
| The only time you really just get to think in sets, is when
| performance doesn't matter and you don't mind if your query
| takes hours. Which maybe is your case.
|
| But also -- the issue isn't generally CPU, but rather
| communications/bandwidth. If you're joining 10 million rows
| to 10 million rows, the only things that matter are whether
| those 10 million rows are on the same machine, and whether
| you're joining on an index. Adding more CPU's is generally
| going to be analogous to the famous quote of trying to get
| 9 women to have a single baby in one month...
| maCDzP wrote:
| SQL clicked once I read the original paper and it explained it
| terms of sets.
| griomnib wrote:
| Being able to master the theoretical, practical, and skill-
| based components of designing a good database schema is the
| absolute truest test of understanding any systems design.
|
| People skip ahead to all kinds of nonsense; but most of
| software engineering is putting the right data into the right
| format, and moving it around reliably.
|
| I just did a major refactor of a complex distributed code base.
| I pretty much only count the work I did on schema re-design as
| the actual "job", the rest was many hours of coding, but that's
| really just implementation.
|
| There are other ways to define schema than SQL of course, but
| it's really the perfect way to learn true systems engineering.
| datadrivenangel wrote:
| The data parsing must have been painful. High wizardry.
| fifilura wrote:
| The data parsing is just one recursive query to read one line
| at a time, and it is done pretty much the same way in each
| problem.
| taberiand wrote:
| I feel like it'd be entirely acceptable to load the inputs into
| tables first and still qualify as pure sql, because string
| parsing in sql is so blergh
| fifilura wrote:
| Input in AoC is always well-formed. And you can always use a
| regexp? Seems like the smallest problem to me. As soon as you
| get past that recursive one row per line trick in the posted
| solutions
| taberiand wrote:
| Obviously it's possible, particularly in this context, but
| generally speaking I'd rather avoid it.
| ikiris wrote:
| This is impressive. Well done. I hate it
| drewbitt wrote:
| Insanity. I wonder how much cleaner it could look in EdgeQL.
| RaftPeople wrote:
| I wrote a cubing/containerization system in SQL, but it did use
| the additional capabilities in a sproc for looping etc., so not
| just a single SQL stmt.
|
| But the problem actually mapped well to SQL's capabilities in
| some ways. The basic algorithm was:
|
| 1-Loop through items in descending size order
|
| 2-Test X different positions and orientations such that one
| corner is butted up against an existing corner (of container, or
| previously placed items, etc.)
|
| 3-Choose the winner based on some sort of point systems that
| encapsulates some set of rules (e.g. preferred orientation, or
| minimizes small gaps between sides of items, etc.)
|
| These were some aspects that lined up well with SQL:
|
| A-Testing of X number of positions and orientations all with a
| relative simple statement, using as input: existing corner points
| from previously placed items, some control table with the
| variations in X,Y,Z orientations.
|
| B-The ability to maintain a set of previous options for each
| iteration (e.g. item #2 was tested in these positions with these
| stats and all are temporarily being considered reasonable
| options), add new item to each one of those previous tests and
| accumulate stats. It was pretty easy to configure a "window" of
| how many different open options were being tracked to optimize
| the result without overwhelming memory. The SQL to include the
| options was almost the same as if there was only one previous
| chosen option.
|
| Some aspects were a bit painful to shift the model mentally to
| fit in with SQL.
| trjordan wrote:
| If you like this kind of degeneracy, I tried AoC in Google Sheets
| this year.
|
| I only made it to Day 6, and not even both stars every day. I'm
| pretty confident my Day 7 solution is correct, but I hit per-cell
| character limits on longer inputs.
|
| Enjoy :) oh but don't open it on mobile, some sheets crash the
| app
|
| https://docs.google.com/spreadsheets/d/10FY-89y19tnRM_EAAnCd...
| 1st1 wrote:
| FWIW I tried doing AoC for a few days with EdgeQL, and the
| experience proved to be quite interesting.
|
| Some of my tweets (I should write a blog post):
|
| - https://x.com/1st1/status/1864069589245858083
|
| - vs SQL https://x.com/1st1/status/1864412869108092997
| hobs wrote:
| When you realize recursive SQL allows iterative calculation then
| you can pretty much do whatever you want, no matter how bad.
| pearjuice wrote:
| Doing it in pure SQL is really impressive but I think the real
| tell-tale sign of peak "cracked engineer energy" is the
| maintained, decade-old blogspot site. Can't exactly put my finger
| on it, but really gives off "niche mastery". I don't even know
| the authors but I'm sure in the right circles a few dudes
| maintaining a blogspot site called "database architects" for a
| decade probably don't need an introduction.
| lelandfe wrote:
| Default styling, one sentence description that just repeats the
| site title. 0 SEO work, just post after post after post.
| Definitely.
| forgot_old_user wrote:
| haha I couldn't help but silently mouth "BUT WHHYY" before I
| clicked this link
| pavel_lishin wrote:
| I've been slowly working through AoC2024 in Google Sheets. I
| should really write up what I've been up to.
___________________________________________________________________
(page generated 2025-01-02 23:00 UTC)