hngopher.com

       [HN Gopher] Advent of Code 2024 in pure SQL
       ___________________________________________________________________
        
       Advent of Code 2024 in pure SQL
        
       Author : greghn
       Score  : 203 points
       Date   : 2025-01-02 19:20 UTC (3 hours ago)
        
 (HTM) web link (databasearchitects.blogspot.com)
 (TXT) w3m dump (databasearchitects.blogspot.com)
        
       | mbowcut2 wrote:
       | Absolutely horrifying. Well done.
        
         | teeray wrote:
         | "Thanks, I hate it"
        
       | sceadu wrote:
       | for those who don't know, the author is one of the best DB
       | researchers in the world
        
         | beoberha wrote:
         | Thomas Neumann doing Thomas Neumann things
        
       | scop wrote:
       | I reacted to this title the way I react a new menu item at Taco
       | Bell: a strange mixture of desire, shame, and admiration for
       | human ingenuity.
        
         | qsort wrote:
         | I work a lot with databases and I've seen... stuff. It's not as
         | bad as you might think if you know what you are doing. Most
         | RDBMSs support recursive CTEs, it feels like writing Prolog
         | with a slightly sadistic syntax. For something like AoC the
         | most difficult part is probably parsing the input.
        
           | dunham wrote:
           | Speaking of parsing, back around y2k we were building an app
           | that used XML everywhere, which was the style at the time,
           | and our DBA wanted to write an xml parser in SQL (the api
           | would involve sending XML to the database). That got vetoed.
           | 
           | IMO, this kind of thing is what AoC is good for - you get to
           | play with weird/obscure stuff without affecting your day job
           | code.
        
             | jamie_ca wrote:
             | I did something with JSON back before there was reasonable
             | native support - it's certainly not robust, but it handled
             | a few syntax variants for a use case where we had an extra
             | attribute column that serialized JSON, and wanted to
             | surface one of the fields as a proper column on the table.
             | 
             | https://blog.tracefunc.com/2011/11/19/parsing-json-in-sql-
             | ht...
        
         | brettgriffin wrote:
         | I'm as equally amazed by the solutions in this post's github
         | repo as I am with Taco Bell's new chicken nuggets.
        
         | larodi wrote:
         | but why? what would make you react at human ingenuity with
         | shame and desire? is this something about you or something
         | about them in particular? isnt the whole of HN about human
         | ingenuity...?
         | 
         | are we to feel Taco Bell menu about it all, what am I missing?
        
       | CommanderData wrote:
       | I'm angry but in a good way.
        
       | SoftTalker wrote:
       | Over my career I've certainly written more SQL than any other
       | type of code. Not so much in the last five years so I'm sure I've
       | lost some of it, but I used to really enjoy it. Once you stop
       | thinking iteratively and start thinking in set operations it
       | becomes quite natural and powerful.
        
         | fifilura wrote:
         | I second all of that!
         | 
         | I wish more people would see the beauty. After a session of
         | SQL, when I take a step back and think.
         | 
         | "Hold on. What I have been doing lately is just pure logic. No
         | library dependency resolution, no concurrency problems (even
         | though massive concurrency is certainly under the hood). No
         | mutability issues. Just logic."
         | 
         | SQL obviously has its warts, some of them serious, like
         | testability. But at the end of the day, I wish all programming
         | was like that. Let the computer decide how to do stuff under
         | the hood. And let the human focus on the logic.
         | 
         | I have somewhat half-assed tried to read up on Prolog for
         | trying to take it to the next level, but failed sofar
         | unfortunately. (It was also a goal to try to unlearn some SQL
         | to avoid getting stuck in some local optimum). Maybe somewhere
         | between SQL and Prolog is the future of programming.
        
           | Rendello wrote:
           | > Maybe somewhere between SQL and Prolog is the future of
           | programming.
           | 
           | Must be Datalog then ;)
        
           | coliveira wrote:
           | Prolog is very powerful, if you see what professionals can do
           | with it it's eye opening. Unfortunately, it takes a complete
           | relearning of programming to achieve that proficiency level.
           | And after you reach it, you probably cannot use it in your
           | day job...
        
           | larodi wrote:
           | > Maybe somewhere between SQL and Prolog is the future of
           | programming.
           | 
           | it was, it most probably is
        
         | bob1029 wrote:
         | Over the years I have been pushing more and more
         | responsibilities into the RDBMS. I now see things mostly in
         | terms of ETL, SQL and schema. Virtually every conversation I've
         | ever had about the application of technology to the business
         | could be expressed in these terms. Business logic defined as
         | SQL queries can be quite intuitive when the schema is
         | structured well and aligned to the business stakeholders'
         | perspectives.
         | 
         | Code, frameworks, ORMs, "best practices", patterns, et. al. are
         | ultimately a distraction. There are a million ways to get the
         | data in & out of the database. Moving the bits around is _not_
         | valuable. There are many overblown software solutions out there
         | that could have been a simple merge statement or CSV import
         | job.
         | 
         | I think that a lot of the misconceptions and bad moods about
         | SQL come out of being forced to work with nasty schemas. The
         | language itself is really domain specific. Consider that one
         | wouldn't complain as much about a super fucked up nested query
         | (and resulting SQL syntax woes) if it wasn't necessary to write
         | such a query in the first place. Aligning tuples and relations
         | to the way the business typically talks about them means you
         | will be less likely to be fighting these things over time.
         | Often, it isn't possible to refactor the schema from zero, but
         | you can put replicas/views _around_ a  "bad" schema and target
         | it with your new development & refactors.
        
           | fifilura wrote:
           | > Consider that one wouldn't complain as much about a super
           | fucked up nested query (and resulting SQL syntax woes) if it
           | wasn't necessary to write such a query in the first place.
           | 
           | And in "modern" SQL this is solved with CTEs. Use them to
           | unwrap the schema the way you want it first, before
           | implementing the business logic.
        
         | crazygringo wrote:
         | > _Once you stop thinking iteratively and start thinking in set
         | operations it becomes quite natural and powerful._
         | 
         | I dunno... I've written a tremendous amount of SQL, and I still
         | have to think imperatively (iteratively) in order to write
         | queries that are actually performant, and to know which indexes
         | need to exist.
         | 
         | It would be wonderful if I could just think in terms of set
         | operations, but that tends to result in queries that take 5
         | minutes to execute rather than 5 milliseconds.
         | 
         | My entire thought process is basically -- what table do I start
         | with, what rows in what order, joining to what, under what
         | conditions, aggregating how, rinse and repeat... It's entirely
         | a mental model of loops and aggregation, never of set
         | operations.
        
           | fifilura wrote:
           | It may be true, until you do your ETL in an index-less
           | database such as BigQuery or Trino. Postgres will always be
           | faster for optimized, end user serving, queries.
           | 
           | But BigQuery allows you to scale it to 100s of CPUs without
           | having to worry about indexes.
        
             | crazygringo wrote:
             | Yes, I'm talking about end user queries. Not reports that
             | take 2 hours to run.
             | 
             | But even with BigQuery, you've still got to worry about
             | partioning and clustering, and yes they've even added
             | indexes now.
             | 
             | The only time you really just get to think in sets, is when
             | performance doesn't matter and you don't mind if your query
             | takes hours. Which maybe is your case.
             | 
             | But also -- the issue isn't generally CPU, but rather
             | communications/bandwidth. If you're joining 10 million rows
             | to 10 million rows, the only things that matter are whether
             | those 10 million rows are on the same machine, and whether
             | you're joining on an index. Adding more CPU's is generally
             | going to be analogous to the famous quote of trying to get
             | 9 women to have a single baby in one month...
        
         | maCDzP wrote:
         | SQL clicked once I read the original paper and it explained it
         | terms of sets.
        
         | griomnib wrote:
         | Being able to master the theoretical, practical, and skill-
         | based components of designing a good database schema is the
         | absolute truest test of understanding any systems design.
         | 
         | People skip ahead to all kinds of nonsense; but most of
         | software engineering is putting the right data into the right
         | format, and moving it around reliably.
         | 
         | I just did a major refactor of a complex distributed code base.
         | I pretty much only count the work I did on schema re-design as
         | the actual "job", the rest was many hours of coding, but that's
         | really just implementation.
         | 
         | There are other ways to define schema than SQL of course, but
         | it's really the perfect way to learn true systems engineering.
        
       | datadrivenangel wrote:
       | The data parsing must have been painful. High wizardry.
        
         | fifilura wrote:
         | The data parsing is just one recursive query to read one line
         | at a time, and it is done pretty much the same way in each
         | problem.
        
         | taberiand wrote:
         | I feel like it'd be entirely acceptable to load the inputs into
         | tables first and still qualify as pure sql, because string
         | parsing in sql is so blergh
        
           | fifilura wrote:
           | Input in AoC is always well-formed. And you can always use a
           | regexp? Seems like the smallest problem to me. As soon as you
           | get past that recursive one row per line trick in the posted
           | solutions
        
             | taberiand wrote:
             | Obviously it's possible, particularly in this context, but
             | generally speaking I'd rather avoid it.
        
       | ikiris wrote:
       | This is impressive. Well done. I hate it
        
       | drewbitt wrote:
       | Insanity. I wonder how much cleaner it could look in EdgeQL.
        
       | RaftPeople wrote:
       | I wrote a cubing/containerization system in SQL, but it did use
       | the additional capabilities in a sproc for looping etc., so not
       | just a single SQL stmt.
       | 
       | But the problem actually mapped well to SQL's capabilities in
       | some ways. The basic algorithm was:
       | 
       | 1-Loop through items in descending size order
       | 
       | 2-Test X different positions and orientations such that one
       | corner is butted up against an existing corner (of container, or
       | previously placed items, etc.)
       | 
       | 3-Choose the winner based on some sort of point systems that
       | encapsulates some set of rules (e.g. preferred orientation, or
       | minimizes small gaps between sides of items, etc.)
       | 
       | These were some aspects that lined up well with SQL:
       | 
       | A-Testing of X number of positions and orientations all with a
       | relative simple statement, using as input: existing corner points
       | from previously placed items, some control table with the
       | variations in X,Y,Z orientations.
       | 
       | B-The ability to maintain a set of previous options for each
       | iteration (e.g. item #2 was tested in these positions with these
       | stats and all are temporarily being considered reasonable
       | options), add new item to each one of those previous tests and
       | accumulate stats. It was pretty easy to configure a "window" of
       | how many different open options were being tracked to optimize
       | the result without overwhelming memory. The SQL to include the
       | options was almost the same as if there was only one previous
       | chosen option.
       | 
       | Some aspects were a bit painful to shift the model mentally to
       | fit in with SQL.
        
       | trjordan wrote:
       | If you like this kind of degeneracy, I tried AoC in Google Sheets
       | this year.
       | 
       | I only made it to Day 6, and not even both stars every day. I'm
       | pretty confident my Day 7 solution is correct, but I hit per-cell
       | character limits on longer inputs.
       | 
       | Enjoy :) oh but don't open it on mobile, some sheets crash the
       | app
       | 
       | https://docs.google.com/spreadsheets/d/10FY-89y19tnRM_EAAnCd...
        
       | 1st1 wrote:
       | FWIW I tried doing AoC for a few days with EdgeQL, and the
       | experience proved to be quite interesting.
       | 
       | Some of my tweets (I should write a blog post):
       | 
       | - https://x.com/1st1/status/1864069589245858083
       | 
       | - vs SQL https://x.com/1st1/status/1864412869108092997
        
       | hobs wrote:
       | When you realize recursive SQL allows iterative calculation then
       | you can pretty much do whatever you want, no matter how bad.
        
       | pearjuice wrote:
       | Doing it in pure SQL is really impressive but I think the real
       | tell-tale sign of peak "cracked engineer energy" is the
       | maintained, decade-old blogspot site. Can't exactly put my finger
       | on it, but really gives off "niche mastery". I don't even know
       | the authors but I'm sure in the right circles a few dudes
       | maintaining a blogspot site called "database architects" for a
       | decade probably don't need an introduction.
        
         | lelandfe wrote:
         | Default styling, one sentence description that just repeats the
         | site title. 0 SEO work, just post after post after post.
         | Definitely.
        
       | forgot_old_user wrote:
       | haha I couldn't help but silently mouth "BUT WHHYY" before I
       | clicked this link
        
       | pavel_lishin wrote:
       | I've been slowly working through AoC2024 in Google Sheets. I
       | should really write up what I've been up to.
        
       ___________________________________________________________________
       (page generated 2025-01-02 23:00 UTC)