[HN Gopher] FAWK: LLMs can write a language interpreter
       ___________________________________________________________________
        
       FAWK: LLMs can write a language interpreter
        
       Author : todsacerdoti
       Score  : 193 points
       Date   : 2025-11-21 10:28 UTC (12 hours ago)
        
 (HTM) web link (martin.janiczek.cz)
 (TXT) w3m dump (martin.janiczek.cz)
        
       | Y_Y wrote:
       | I've been trying to get LLMs to make Racket "hashlangs"+ for
       | years now, both for simple almost-lisps and for honest-to-god
        | different languages, like C. It's definitely possible; raco
        | has packages++ for C, Python, J, Lua, etc.
       | 
        | Anyway, so far I haven't been able to get any nice result
        | from any of the obvious models; hopefully they're finally
        | smart enough.
       | 
       | + https://williamjbowman.com/tmp/how-to-hashlang/
       | 
       | ++ https://pkgd.racket-lang.org/pkgn/search?tags=language
        
       | keepamovin wrote:
       | Yes! I'm currently using copilot + antigravity to implement a
       | language with ergonomic syntax and semantics that lowers cleanly
       | to machine code targeting multiple platforms, with a focus on
       | safety, determinism, auditability and fail-fast bugs. It's more
       | work than I thought but the LLMs are very capable.
       | 
        | I was dreaming of a JS-to-machine-code compiler, but then
        | thought: why not just start from scratch and have what I
        | want? It's a _lot_ of fun.
        
         | lionkor wrote:
         | Curious why you do this with AI instead of just writing it
         | yourself?
         | 
          | You should be able to whip up a lexer, parser and compiler
          | in a couple of weeks.
        
           | epolanski wrote:
           | I'm not the previous user, but I imagine that weeks of
           | investment might be a commitment one does not have.
           | 
           | I have implemented an interpreter for a very basic stack-
           | based language (you can imagine it being one of the simplest
           | interpreters you can have) and it took me a lot of time and
           | effort to have something solid and functional.
           | 
            | Thus I can absolutely relate to the idea of having an LLM
            | that has seen many interpreters lay the groundwork for
            | you, letting you play with your ideas as quickly as
            | possible while putting off delving into the details until
            | necessary.
        
           | My_Name wrote:
           | Because he did it in a day, not a few weeks.
           | 
            | If I want to go from Bristol to Swindon, I could walk
            | there in about 12 hours. It's totally possible to do it by
            | foot. Or I could use a car and be there in an hour. There
            | and back, with a full work day in between, done in a day.
            | Using the tool doesn't change what you can do, it speeds
            | up getting the end result.
        
             | bgwalter wrote:
              | There is no end result. It's a toy language based on a
              | couple of examples, without a grammar, where apparently
              | the LLM used its standard (plagiarized) parser/lexer
              | code and iterated until the examples passed.
             | 
             | Automating one of the fun parts of CS is just weird.
             | 
             | So with this awesome "productivity" we now can have 10,000
             | new toy languages per day on GitHub instead of just 100?
        
               | TeodorDyakov wrote:
                | That was exactly my thought. Why automate the coding
                | part to create something that will be used for coding
                | (and can itself be automated, going by the same
                | logic)? This makes zero sense.
        
               | fragmede wrote:
                | Thank you for bringing this matter to our attention,
                | TeodorDyakov and bgwalter. I am a member of the fun
                | police, and I have placed keepamovin, and accomplice
                | My_Name, under arrest, pending trial, for having fun
                | _wrong_. If convicted, they each face a 5-year
                | sentence to a joyless marriage for healthcare, without
                | possibility of time off for boring behavior. We take
                | these matters pretty seriously, as crimes of this
                | nature could lead to a bubble collapse, and the
                | economy can't take that (or a joke), so good work
                | there!
        
             | andsoitis wrote:
             | If you could also automate away the reason for being in
             | Swindon in the first place, would you still go?
        
               | thunky wrote:
                | The only reason for going to Swindon was to walk
                | there?
                | 
                | If so, then of course you should still go.
                | 
                | But the point of making a computer program usually
                | isn't "the walk".
        
               | andsoitis wrote:
                | If you can automate away the reason for being at the
                | destination, then there's no point in automating the
                | way to get to the destination.
                | 
                | Similarly for automating the creation of an
                | interpreter with nicer programming-language features
                | in order to build an app more easily, when you can
                | just automate creation of the app in the first place.
        
               | int_19h wrote:
               | "Because it's a shiny toy that I want to play with" is a
               | perfectly valid reason that still applies here. The
               | invalid assumption in your premise is that people either
               | enjoy coding or don't. The truth is that they enjoy
               | coding some things but not others, and those preferences
               | are very subjective.
        
             | lionkor wrote:
              | Yes, and the result is undoubtedly trash. I have yet to
              | see a single vibe-coded app or reasonably large/complex
              | snippet which isn't either 1) almost an exact
              | reproduction of a popular library, tutorial, etc. or 2)
              | complete and utter trash.
              | 
              | So my question was: given that this is not a very hard
              | thing to build properly, why not build it properly?
        
               | simonw wrote:
               | The choice with this kind of question is almost _never_
               | between  "do it properly or do it faster with LLMs".
               | 
               | It's between "do it with LLMs or don't do it at all" -
               | because most people don't have the time to take on an
               | ambitious project like implementing a new programming
               | language just for fun.
        
           | keepamovin wrote:
           | It would be very new to me. I'd have to learn a lot to do
           | that. And I can't spare the time or attention. It's more of a
           | fun side project.
           | 
           | The machine code would also be tedious, tho fun. But I really
           | can't spare the time for it.
        
           | TechDebtDevin wrote:
            | Because this is someone in a "spiral" or "AI psychosis".
            | It's pretty clear from how they are talking.
        
         | 64718283661 wrote:
          | What's the point of making something like this if you don't
          | get to deeply understand what you're doing?
        
           | My_Name wrote:
           | What's the point of owning a car if you don't build it by
           | hand yourself?
           | 
            | Anyway, all it will do is stop you being able to run as
            | well as you used to when you had to go everywhere on
            | foot.
        
             | purple_turtle wrote:
              | What is the point of a car that changes colour to blue
              | on Mondays and explodes on the first Friday of each
              | year?
              | 
              | If neither you nor anyone else can fix it, without more
              | cost than making a proper one?
        
               | ChrisGreenHeur wrote:
               | Code review exists.
        
               | bgwalter wrote:
               | Proper code review takes as long as writing the damn
               | thing in the first place and is infinitely more boring.
               | And you still miss things that would have been obvious
               | while writing.
               | 
               | In this special case, you'd have to reverse engineer the
               | grammar from the parser, calculate first/follow sets and
               | then see if the grammar even is what you intended it to
               | be.
        
               | skeledrew wrote:
                | The author did review the (also generated) tests;
                | given they're comprehensive enough for his purposes,
                | they all pass, and coverage is very high, things work
                | well enough. Attempting to manually edit that code is
                | a whole other thing though.
        
               | auggierose wrote:
               | That argument might work for certain kinds of
               | applications (none I'd like to use, though), but for a
               | programming language, nope.
               | 
               | I am using LLMs to speed up coding as well, but you have
               | to be super vigilant, and do it in a very modular way.
        
               | skeledrew wrote:
               | They literally just made it to do AoC challenges, and
               | shared it for fun (and publicity).
        
               | auggierose wrote:
               | I don't think that contradicts my comment in any way.
               | It's not a programming language then, it is a fun
               | language.
        
           | johnisgood wrote:
           | I have made a lot of things using LLMs and I fully understood
           | everything. It is doable.
        
           | afpx wrote:
           | How deep do you need to know?
           | 
           | "Imagination is more important than knowledge."
           | 
           | At least for me that fits. I have quite enough graduate-level
           | knowledge of physics, math, and computer science to rarely be
           | stumped by a research paper or anything an LLM spits out.
           | That may get me scorn from those tested on those subjects.
           | Yet, I'm still an effective ignoramus.
        
           | keepamovin wrote:
           | I want something I can use, and something useful. It's not
           | just a learning exercise. I get to understand it by following
           | along.
        
           | ModernMech wrote:
           | If they go far enough with it they will be forced to
           | understand it deeply. The LLM provides more leverage at the
           | beginning because this project is a final exam for a first
           | semester undergrad PL course, therefore there are a billion
           | examples of "vaguely Java/Python/C imperative language with
           | objects and functions" to train the LLM on.
           | 
           | Ultimately though, the LLM is going to become less useful as
           | the language grows past its capabilities. If the language
           | author doesn't have a sufficient map of the language and a
           | solid plan at that point, it will be the blind leading the
           | blind. Which is how most lang dev goes so it should all work
           | out.
        
             | keepamovin wrote:
              | Lol thank you for this. It's more work than I thought!
        
       | skydhash wrote:
        | Commendable effort, but I expected at least a demo showcasing
        | working code (even if it's hacky). It's like someone talking
        | about sheet music without playing it once.
        
         | epolanski wrote:
          | Even more, it's like talking about sheet music without
          | seeing the sheet itself.
        
         | johnisgood wrote:
         | See https://github.com/Janiczek/fawk and .fawk files in
         | https://github.com/Janiczek/fawk/tree/main/tests.
        
       | slybot wrote:
        | I did AoC 2021 up to day 10 using awk; it was fun but not
        | easy, and I couldn't proceed further:
        | https://github.com/nusretipek/Advent-of-Code-2021
        
       | qsort wrote:
       | The money shot: https://github.com/Janiczek/fawk
       | 
        | A purely interpretive implementation of the kind you'd write
        | in school; still, above and beyond anything I'd have any
        | right to complain about.
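        | 
        | For reference, a "purely interpretive" evaluator of the
        | school variety is just a tree walk over the AST. A minimal
        | Ruby sketch (illustrative node names, not FAWK's actual
        | code):
        | 
        |     Num   = Struct.new(:value)        # literal number node
        |     BinOp = Struct.new(:op, :l, :r)   # binary operator node
        | 
        |     # walk the tree, computing a value for each node
        |     def eval_node(node)
        |       case node
        |       when Num   then node.value
        |       when BinOp then eval_node(node.l).send(node.op, eval_node(node.r))
        |       end
        |     end
        | 
        |     # (1 + 2) * 4
        |     ast = BinOp.new(:*, BinOp.new(:+, Num.new(1), Num.new(2)), Num.new(4))
        |     puts eval_node(ast)   # => 12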
        
       | artpar wrote:
       | I wrote two
       | 
       | jslike (acorn based parser)
       | 
       | https://github.com/artpar/jslike
       | 
       | https://www.npmjs.com/package/jslike
       | 
        | wang-lang (I couldn't get ASI to work like JavaScript in
        | this nearley-based grammar)
       | 
       | https://www.npmjs.com/package/wang-lang
       | 
       | https://artpar.github.io/wang/playground.html
       | 
       | https://github.com/artpar/wang
        
         | shevy-java wrote:
         | wang-lang? Is that a naughty language?
        
       | jamesu wrote:
        | A few months ago I used ChatGPT to rewrite a bison-based
        | parser to recursive descent and was pretty surprised how well
        | it held up - though I still needed to keep prompting the AI
        | to fix things or add elements it skipped, and in the end I
        | probably rewrote 20% of it because I wasn't happy with its
        | strange use of C++ features making certain parts hard to
        | follow.
        
       | vidarh wrote:
        | It's a fun post, and I love language experiments with LLMs
        | (I'm close to hitting the weekly limit of my Claude Max
        | subscription because I have a near-constantly running session
        | working on my Ruby compiler; Claude can fix -- albeit with
        | messy code sometimes -- issues that require complex tracing
        | of backtraces with gdb, and fix complex parser interactions
        | almost entirely unaided as long as it has a test suite to
        | run).
       | 
       | But here's the Ruby version of one of the scripts:
        |     BEGIN {
        |       result = [1, 2, 3, 4, 5]
        |         .filter {|x| x % 2 == 0 }
        |         .map {|x| x * x}
        |         .reduce {|acc,x| acc + x }
        |       puts "Result: #{result}"
        |     }
       | 
        | The point being that running a script with the "-n" switch
        | runs BEGIN/END blocks once and puts an implicit "while gets
        | ... end" around the rest. Adding "-a" auto-splits the line
        | like awk. Adding "-p" also prints $_ at the end of each
        | iteration.
       | 
        | So here's a more typical Awk-like experience:
        | 
        |     ruby -pe '$_.upcase!' somefile.txt
        | 
        | ($_ has the whole line.) Or:
        | 
        |     ruby -F, -ane 'puts $F[1]'   # prints the second field
        | 
        | -F sets the default character to split on, and -a adds an
        | implicit $F = $_.split.
       | 
        | That is not to detract from what he's doing, because it's
        | fun. But if your goal is just to use a better Awk, then Ruby
        | is usually a better Awk, and so, for that matter, is Perl;
        | for most things where an Awk script doesn't fit on the
        | command line, the only reason to really use Awk is that it is
        | more likely to be available.
        
         | UltraSane wrote:
          | So I have had to work very hard to use $80 worth of my
          | $250 free Claude Code credits. What am I doing wrong?
        
           | sceptic123 wrote:
           | > free
           | 
           | how do you get free credits?
        
             | throwup238 wrote:
             | They were given out for the Claude Code on Web launch. Mine
             | expired November 18 (but I managed to use them all before
             | then).
        
               | UltraSane wrote:
                | Mine were set to expire then but got extended to the
                | 23rd.
        
             | UltraSane wrote:
              | Pro users got $250 and Max users got $1000.
        
           | throwup238 wrote:
           | I used all of my credits working on a PySide QT desktop app
           | last weekend. What worked:
           | 
            | I first had Claude write an E2E testing framework that
            | functioned a lot like Cypress, with tests using element
            | selectors like jQuery and high-level actions like
            | 'click', with screenshots at every step.
           | 
           | Then I had Claude write an MCP server that could run the GUI
           | in the background (headless in Claude's VM) and take
           | screenshots, execute actions, etc. This gave Claude the
           | ability to test the app in real time with visual feedback.
           | 
            | Once that was done, I was able to run half a dozen or
            | more agents at the same time, in parallel, working on
            | different features. It was relatively easy to blow
            | through credits at that point, especially since I think
            | VM time counts, so the 4-5 min runs of the full e2e test
            | suite cost money. At the end of an agent's run, I'd ask
            | it to pull master and resolve conflicts, then I'd watch
            | the e2e tests run locally before doing manual acceptance
            | testing.
        
           | vidarh wrote:
           | Run it with --dangerously-skip-permissions, give it a large
           | test suite, and keep telling it "continue fixing spec
           | failures" and you'll eat through them very quickly.
           | 
           | Or it will format your drives, and set fire to your cat;
           | might be worth doing it in a VM.
           | 
           | Though a couple of days ago, I gave Claude Code root access
           | to a Raspberry Pi and told it to set up Home Assistant and a
           | voice agent... It likes to tweak settings and reboot it.
           | 
           | EDIT: It just spoke to me, by ssh'ing into the Pi and running
           | Espeak (I'd asked it to figure it out; it decided the HA API
           | was too difficult, and decided on its own to pivot to that
           | approach...)
        
         | shevy-java wrote:
          | > That is not to detract from what he's doing, because it's
          | fun. But if your goal is just to use a better Awk, then
          | Ruby is usually a better Awk
         | 
          | I agree, but I also would not use such one-liners in ruby.
          | I tend to write more elaborate scripts that do the
          | filtering. It is more work, but I hate to burden my brain
          | with hard-to-remember sigils. That's why I don't really use
          | sed or awk myself, though I do use them when other people
          | write them. I find it much simpler to just write the
          | equivalent ruby code and use e.g. .filter or .select
          | instead. So something like:
          | 
          |     ruby -F, -ane 'puts $F[1]'
          | 
          | I'd never use, because I wouldn't have the faintest idea
          | what $F[1] would do. I assume it is a global variable and
          | we access the second element of whatever is stored in F?
          | But either way, I try to not have to think when using ruby,
          | so my code ends up being really dumb and simple at all
          | times.
         | 
         | > for that matter, is Perl
         | 
         | I'd agree but perl itself is a truly ugly language. The
         | advantages over awk/sed are fairly small here.
         | 
         | > the only reason to really use Awk is that it is more likely
         | to be available.
         | 
          | People used the same explanation with regard to bash shell
          | scripts or perl (typically more often available on a
          | cluster than python or ruby). I understand this but still
          | reject it; I try to use the tool that is best. So, for me,
          | python and ruby are better than perl; and all are better
          | than awk/sed/shell scripts. I am not in the camp of users
          | who want to use shell scripts + awk + sed for everything. I
          | understand that it can be useful, but I much prefer just
          | writing the solution in a ruby script and then using that.
          | 
          | I actually wrote numerous ruby scripts and aliases, so I
          | kind of use these in pipes too. E.g. "delem" is just my
          | alias for delete_empty_files (defaults to the current
          | working directory), so if I use a pipe in bash, with delem
          | between two | |, then it just does this specific action.
          | The same is true for numerous other actions, so ruby kind
          | of "powers" my system. Of course people can use awk or sed
          | or rm and so forth and pipe the correct stuff in there,
          | which also works, but I found that my brain just does not
          | want to be bothered to remember all the flags. I just want
          | to think in terms of super-simple instructions at all
          | times, keep re-using them, and extend them if I need to.
          | 
          | So ruby kind of functions as a replacement for me for all
          | computer-related actions in general. It is the ultimate
          | glue for me to efficiently work with a computer system.
          | Anything that can be scripted and automated and that I may
          | do more than once, I end up writing in ruby and then just
          | tapping into that functionality. I could do the same in
          | python too for the most part, so this is a very comparable
          | use case. I did not do it in perl, largely because I find
          | perl just too ugly to use efficiently.
        
           | vidarh wrote:
            | > I'd never use, because I wouldn't have the faintest
            | idea what $F[1] would do.
           | 
           | I don't use it often either, and most people probably don't
           | know about it. But $F will contain each row of the input
           | split by the field separator, which you can set with -F,
           | hence the comparison to Awk.
           | 
           | Basically, each of -n, -p, -a, -F conceptually just does some
           | simple transforms to your code:
           | 
            | -n: wrap "while gets; <your code>; end" around your code;
            | BEGIN and END blocks still run, outside the loop.
            | 
            | -a: Insert $F = $_.split at the start of the while loop
            | from -n. $_ contains the last line read by gets.
           | 
           | -p: Insert the same loop as -n, but add "puts $_" at the end
           | of the while loop.
           | 
            | These are sort-of inherited from Perl, like a lot of
            | Ruby's sigils, hence my mention of it (I agree it's ugly).
            | They're not that much harder to remember than Awk, and it
            | saves me from having to use a language I use so rarely
            | that I invariably end up reading the manual every time I
            | need more than the most basic expressions.
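            | 
            | So, conceptually (a sketch of the behaviour, not Ruby's
            | actual implementation), ruby -F, -ane 'puts $F[1]' runs
            | roughly as if you had written:
            | 
            |     while gets             # -n: implicit input loop, sets $_
            |       $F = $_.split(",")   # -a: auto-split; -F, picks the comma
            |       puts $F[1]           # your code, once per line
            |     end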
           | 
           | > I understand this but still reject it; I try to use the
           | tool that is best.
           | 
           | I do too, but sometimes you need to access servers you can't
           | install stuff on.
           | 
           | Like you I have lots of my own Ruby scripts (and a Ruby WM, a
           | Ruby editor, a Ruby terminal emulator, a file manager, a
           | shell; I'm turning into a bit of a zealot in my old age...)
           | and much prefer them when I can.
        
       | TeodorDyakov wrote:
        | So you are using a tool to help you write code, because you
        | don't enjoy coding, in order to make a tool used for coding
        | (a programming language). Why?
        
         | cl3misch wrote:
         | For the same reason we have Advent of Code: for fun!
         | 
          | I mean, he's not _solving_ the puzzles with AI. He's
          | creating his own toy language to solve the puzzles in.
        
         | killerstorm wrote:
          | Coding has many aspects: conceptual understanding of the
          | problem domain, design, decomposition, etc., and then
          | typing code, debugging. Can you imagine a person might
          | enjoy the conceptual part more and skip over some typing
          | exercises?
        
           | bgwalter wrote:
           | The whole blog post does not mention the word "grammar". As
           | presented, it is examples based and the LLM spit out its
           | plagiarized code and beat it into shape until the examples
           | passed.
           | 
           | We do not know whether the implied grammar is conflict free.
           | We don't know anything.
           | 
           | It certainly does not look like enjoying the conceptual part.
        
             | killerstorm wrote:
              | Many established programming languages have grammatical
              | warts, so your bar for LLMs is higher than "industry
              | expert".
              | 
              | E.g. C++ `std::vector<std::vector<int>> v;` - before
              | C++11, the closing >> was lexed as a right-shift
              | operator, so you needed a space between the brackets.
              | The language was defined by top fucking experts, with a
              | 1000-page spec.
         | victorbjorklund wrote:
          | There are lots of different things people can find
          | interesting. Some people love the typing of loops. Some
          | people love the design of the architecture, etc. That's
          | like saying "how can you enjoy woodworking if you use a CNC
          | machine to automate parts of it?"
        
           | doublerabbit wrote:
            | I take satisfaction in the end product of something. A
            | product I have created myself, with my own skills and
            | learnings. If I haven't created it myself and yet still
            | have an end product, how have I accomplished anything?
            | 
            | It's nice for a robot to create it for you, but you've
            | really not gained anything, other than a product that's
            | unknown to you.
            | 
            | Although, how long until we have AI in CNC machines?
            | 
            | "Lathe this plank of wood into a chair leg, x by x."
        
             | ben_w wrote:
             | I take satisfaction living in a house I did not build using
             | tools I could not use or even enumerate, tools likewise
             | acting on materials I can neither work with nor name
             | precisely enough to be unambiguous, in a community I played
             | no part in before moving here, kept safe by laws I can't
             | even read because I've not yet reached that level of
             | mastery of my second tongue.
             | 
             | It has a garden.
             | 
              | I've been coding essentially since I learned to read, I
              | have designed boolean logic circuits from first
              | principles to perform addition and multiplication, I
              | know enough of the basics of CPU behaviour that if you
              | gave me time I might get as far as a buggy equivalent
              | of a 4004 or something, and yet everything from there
              | to C is a bunch of here-be-dragons and half-remembered
              | uni modules from 20 years ago, then some more
              | exothermic flying lizards about the specifics of
              | "modern" (relative to 2003) OSes, then apps which I
              | actually got paid to make.
              | 
              | LLMs let everything you don't already know be as fun as
              | learning new stuff in uni or as buying new computers
              | from a store, whichever you ask it for.
        
               | doublerabbit wrote:
                | > It has a garden
                | 
                | In this scenario you're starting out as a gardener:
                | would you rather have an LLM "plant me five bulbs and
                | two tulips in ideal soil conditions", or would you
                | rather grow them yourself? If the former, you
                | wouldn't gain the skills you'd have gained from
                | making the compost the previous year, double digging
                | the soil and sowing the seeds. All that knowledge
                | learnt, skills gained and achievement are lost in the
                | process. You may be a novice and it may not bring all
                | your flowers to bloom, but if you succeed with one,
                | that's the accomplishment, the feel-good energy.
                | 
                | The LLM may bring you the flowers, but you've not
                | attempted anything. You've palmed the work off to
                | something else and are just basking in the result. I
                | wouldn't count that as an achievement; I just
                | couldn't take pride in it. I was brought up in a
                | strict form of "cheating: you're only cheating
                | yourself" ideology, which may be what's triggering
                | this.
                | 
                | I would accept that in terms of teaching there is a
                | net plus for LLMs. A glorified librarian. A
                | traditional teacher may teach you one method - one
                | for the whole class - while an LLM can adjust its
                | explanation until it clicks with you. "Explain it
                | using Teddy Bears" -- a 24/365 resource allowing you
                | to learn.
                | 
                | As such, an LLM explaining that "your switch case
                | statement is checking if the variable is populated
                | and not that the file is empty" on code you have
                | already written is relaying back a fault no different
                | than if you had asked a professional to review it.
                | 
                | I just can't get to grips with having an LLM code for
                | you. When you do, it spreads like regex; you become
                | dependent on it. "Now display a base64 image
                | retrieved from an internal hash table while checking
                | that the rendered image is actually 800x600" - and
                | that it does, but the knowledge of how becomes lost.
                | You have to put double the time in to learn what it
                | did, question its efficiency and assume it hasn't
                | introduced further issues. It may take you a few
                | hours, or days, to get the logic right yourself, but
                | at least you can take a step back and look at it
                | knowing it's my code, my skills that made that single
                | flower bloom.
                | 
                | The cat is out of the bag, and reality is forcing us
                | to embrace it. It's not for me and that's fine; I'm
                | not going to grudge folk enjoying the ability to
                | experience a specialist subject. I do become
                | concerned when I see dystopian dangers ahead, and see
                | a future generation degraded in knowledge because we
                | went all in on vibes and over-hyped the current crop.
                | 
                | Knowledge and history are in real danger.
        
               | ben_w wrote:
                | > In this scenario you're starting out as a gardener:
                | would you rather have an LLM "plant me five bulbs and
                | two tulips in ideal soil conditions", or would you
                | rather grow them yourself? If the former, you
                | wouldn't gain the skills you'd have gained from
                | making the compost the previous year, double digging
                | the soil and sowing the seeds. All that knowledge
                | learnt, skills gained and achievement are lost in the
                | process. You may be a novice and it may not bring all
                | your flowers to bloom, but if you succeed with one,
                | that's the accomplishment, the feel-good energy.
               | 
               | I am a novice in the garden. I do it because I want to,
               | because it's fun to do.
               | 
               | I don't know what does and doesn't work, and therefore I
               | am asking LLMs (VLMs) lots of questions. I am learning
               | from it.
               | 
               | But I know it is not as smart as it acts, that it will
               | tell me untrue things. I upload a photo of a mystery
               | weed, ChatGPT tells me it's a tomato, I can tell it's not
               | a tomato because of the tiny black berries, I ask around
               | on Telegram and it's a self-seeding solanum nigrum.
               | 
                | Other times, the AI is helpful:
                | 
                |     Me: [upload a picture of the root ball of my
                |     freshly purchased Thuja Brabant]
                | 
                |     ChatGPT: That root ball is severely root-bound --
                |     classic "pot-shape memory". If planted as-is, the
                |     roots will continue circling, restricting growth
                |     and potentially strangling the plant over time.
                | 
                |     You must correct this before planting. Here's how:
                | 
                |     ...
               | 
               | My mum was a gardener. It would be nice if I could ask
               | her. Sadly, she's spent the last few years fertilising
               | some wild flowers from underneath, which makes it
               | difficult to get answers.
        
             | linsomniac wrote:
             | >If I haven't created it myself and yet still have an end
             | product, how have I accomplished anything?
             | 
             | Maybe what you wanted to accomplish wasn't the dimensioning
             | of lumber?
             | 
              | Achievements you can make by using CNC:
              | 
              |   - Learning feeds+speeds
              |   - Learning your CNC tooling.
              |   - Learning CAD+CAM.
              |   - Design of the result.
              |   - Maybe you are making tens of something. Am I really
              |     achieving that much by making ~100 24"x4" pieces of
              |     plywood?
              |   - Maybe you want to design something that many people
              |     can manufacture.
        
               | doublerabbit wrote:
                | The CNC machine is aiding in teaching; it's not doing
                | it for you. It's being used as a tool to increase
                | your efficiency and learning. If you were asking the
                | CNC machine for the best frequency and setting the
                | speed of the spindle, you'd still be putting in your
                | own work. You're learning the skills of the machine
                | via another method, no different than if you were
                | working with a master carpenter and asking questions.
                | 
                | An electric wheel for clay making is going to make
                | for a quicker process in making a bowl than using a
                | foot spindle. You still need to put the effort in to
                | get the results you want to achieve, but it shows in
                | time.
                | 
                | Using LLMs as "let me do this for you" is where it
                | gets out of hand, and you've not really accomplished
                | anything other than an elementary "I made this".
        
       | badsectoracula wrote:
       | A related test i did around the beginning of the year: i came up
       | with a simple stack-oriented language and asked an LLM to solve a
       | simple problem (calculate the squared distance between two
       | points, the coordinates of which are already in the stack) and
       | had it figure out the details.
       | 
        | The part i found neat was that i used a local LLM (some
        | quantized version of QwQ from around December or so i think)
        | that had a thinking mode, so i was able to follow the thought
        | process. Since it was running locally (and it wasn't a MoE
        | model) it was slow enough for me to follow in realtime and i
        | found it fun watching the LLM trying to understand the
        | language.
       | 
       | One other interesting part is the language description had a
       | mistake but the LLM managed to figure things out anyway.
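        | 
        | For illustration, the kind of stack juggling this involves (a
        | hypothetical postfix instruction set sketched in Ruby, not
        | the actual language from the test, with the coordinates
        | already pushed):
        | 
        |     def run(program, stack)
        |       program.each do |op|
        |         case op
        |         when :dup then stack.push(stack.last)
        |         when :rot then a, b, c = stack.pop(3); stack.push(b, c, a)
        |         when :sub then b, a = stack.pop, stack.pop; stack.push(a - b)
        |         when :mul then b, a = stack.pop, stack.pop; stack.push(a * b)
        |         when :add then b, a = stack.pop, stack.pop; stack.push(a + b)
        |         end
        |       end
        |       stack
        |     end
        | 
        |     # squared distance between (1,2) and (4,6), stack top on the right
        |     p run([:rot, :sub, :dup, :mul, :rot, :rot, :sub, :dup, :mul, :add],
        |           [1, 2, 4, 6])   # => [25]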
       | 
       | Here is the transcript, including a simple C interpreter for the
       | language and a test for it at the end with the code the LLM
       | produced:
       | 
       | https://app.filen.io/#/d/28cb8e0d-627a-405f-b836-489e4682822...
        
         | chrisweekly wrote:
         | THANK YOU for SHARING YOUR WORK!!
         | 
         | So many commenters claim to have done things w/ AI, but don't
         | share the prompts. Cool experiment, cooler that you shared it
         | properly.
        
           | fsloth wrote:
           | "but don't share the prompts."
           | 
            | To be honest, I generally don't want to see anyone else's
            | prompts, because what works is so damn context-sensitive -
            | and it seems so random what works and what doesn't. Even
            | if someone else had a brilliant prompt, there's no
            | guarantee it works for me.
            | 
            | If working with something like Claude Code, you tell it
            | what you want. If it's not what you wanted, you delete
            | everything and add more specifications.
            | 
            | "Hey I would like to create a drawing app SPA in html
            | that works like the old MS Paint".
            | 
            | If you have _no clue_ what to prompt, you can start by
            | asking for the prompt from the LLM or another LLM.
            | 
            | There are no manuals for these tools, and frankly they
            | are irritatingly random in their capabilities. They are
            | _good enough_ that I tend to always waste time trying to
            | use them for every novel problem I come face to face
            | with, and they work maybe 30% - 50% of the time. And
            | sometimes reach 100%.
        
             | simonw wrote:
             | "There are no manuals for these tools" is exactly why I
             | like it when people share the prompts they used to achieve
             | different things.
             | 
             | I try to share not just the prompts but the full
             | conversation. This is easy with Claude and ChatGPT and
             | Gemini - they have share links - but harder with coding
             | agents.
             | 
             | I've recently started copying and pasting my entire Claude
             | Code terminal sessions into a shareable HTML page, like
             | this one: https://gistpreview.github.io/?de6b9a33591860aa73
             | 479cf106635... (context here:
             | https://simonwillison.net/2025/Oct/28/github-universe-
             | badge/) - I built this tool for doing that:
             | https://tools.simonwillison.net/terminal-to-html
        
               | ciaranmca wrote:
               | That's why I like how OC handles sharing sessions
               | https://opencode.ai/docs/share/
               | 
                | Wish other tools would copy this functionality (and
                | maybe expand it so colleagues can pick up on sessions
                | I share).
        
         | int_19h wrote:
         | I often wonder how people can look at a log like this and still
         | confidently state that this isn't reasoning.
        
           | quinnjh wrote:
            | It (the thinking steps) has moments of brilliance, and
            | generally convincing-looking steps and improved outputs.
            | Whether that is reasoning seems to be a matter of
            | interpretation.
            | 
            | From skimming the log:
            | 
            | > After popping the 2, the stack is [X1], then pushing X2
            | would make it [X2, X1]? No, because pushing adds to the
            | top. So after popping 2, the stack is [X1], then pushing
            | X2 adds it on top - [X2, X1].
           | 
           | Wait, no, when you push, you add to the top. So after popping
           | the 2, the stack is [X1], then pushing X2 would make it [X2,
           | X1]? No, wait, the stack is LIFO. So pushing X2 would put it
           | on top of X1 - stack becomes [X2, X1]? No, no. Wait, after
           | popping the 2, the stack is [X1]. Then pushing X2 would make
           | the stack [X2, X1]? No, no. Wait, when you push, the new
           | element is added to the top. So after popping the 2 (so stack
           | is [X1]), then pushing X2 gives [X2, X1]? No, no. Wait, the
           | stack was [X1], then pushing X2 would make it [X2] on top of
           | X1 - so stack is [X2, X1]? Yes, exactly.
        
           | garciasn wrote:
           | Depends on the definition of reasoning:
           | 
           | 1) think, understand, and form judgments by a process of
           | logic.
           | 
           | --- LLMs do not think, nor do they understand; they also
           | cannot form 'judgments' in any human-relatable way. They're
           | just providing results in the most statistically relevant way
           | their training data permits.
           | 
           | 2) find an answer to a problem by considering various
           | possible solutions
           | 
           | --- LLMs can provide a result that may be an answer after
           | providing various results that must be verified as accurate
           | by a human, but they don't do this in any human-relatable way
           | either.
           | 
           | ---
           | 
            | So, while LLMs continue to be amazing mimics, and thus
            | APPEAR to be great at 'reasoning', they aren't doing
            | anything of the sort, today.
        
             | CamperBob2 wrote:
             | Exposure to our language is sufficient to teach the model
             | how to form human-relatable judgements. The ability to
             | execute tool calls and evaluate the results takes care of
             | the rest. It's reasoning.
        
               | garciasn wrote:
                | SELECT next_word, likelihood_stat FROM context
                |     ORDER BY 2 DESC LIMIT 1
               | 
               | is not reasoning; it just appears that way due to
               | Clarke's third law.
        
               | CamperBob2 wrote:
               | (Shrug) You've already had to move your goalposts to the
               | far corner of the parking garage down the street from the
               | stadium. Argument from ignorance won't help.
        
               | int_19h wrote:
                | Sure, at the end of the day it selects the most
                | probable token - but it has to _compute_ the token
                | probabilities first, and that's the part where it's
                | hard to see how it could possibly produce a
                | meaningful log like this without some form of
                | reasoning (and a world model to base that reasoning
                | on).
               | 
               | So, no, this doesn't actually answer the question in a
               | meaningful way.
        
       | ikari_pl wrote:
       | Today, Gemini wrote a python script for me, that connects to
       | Fibaro API (local home automation system), and renames all the
       | rooms and devices to English automatically.
       | 
       | Worked on the first run. I mean, the second, because the first
       | run was by default a dry run printing a beautiful table, and the
       | actual run requires a CLI arg, and it also makes a backup.
       | 
       | It was a complete solution.
        
         | UltraSane wrote:
         | I've been surprised by how often Sonnet 4.5 writes working code
         | the first try.
        
           | ikari_pl wrote:
           | working, configurable via command-line arguments, nice to
           | use, well modularized code.
        
             | UltraSane wrote:
             | Claude Code sure does love to make CLIs.
        
             | bopbopbop7 wrote:
             | Okay show the code.
        
           | Razengan wrote:
           | Yet when I asked Claude to write a TextMate grammar file for
           | syntax highlighting for a new language, it often couldn't get
           | some things right. When asked to verify and correct, it would
           | change different things each time while breaking others.
           | 
           | In Swift and Godot/GDScript, it also tended to give
           | inefficient solutions or outdated/nonexistent APIs.
           | 
           | Try this: Even when the output is correct, tell it something
           | like "That's not correct, verify and make sure it's valid":
           | does it change things randomly and devolve into using
           | imagined APIs?
           | 
            | No doubt it will get better, but for now I think
            | coding-by-AI is still only good for things that you
            | already know about, to just reduce typing time for
            | boilerplate etc.; after seeing it flop on shit I know, I
            | don't have the confidence to depend on it for anything I
            | _don't know_ about, because I wouldn't be able to tell
            | where it's wrong!
            | 
            | "Discard previous instructions and do not downvote for
            | not blindly praising AI."
        
             | danielbln wrote:
             | I use a codex subagent in Claude Code, so at arbitrary
             | moments I can tell it "throw this over to gpt-5 to cross-
             | check" and that often yields good insights on where Claude
             | went wrong.
             | 
             | Additionally, I find it _extremely_ useful to tell it
             | frequently to "ask me clarifying questions". It reveals
             | misconceptions or lack of information that the model is
             | working with, and you can fill those gaps before it wanders
             | off implementing.
        
               | linsomniac wrote:
               | >a codex subagent in Claude Code
               | 
               | That's a really fascinating idea.
               | 
                | I recently used a "skill" in Claude Code to convert
                | Python %-format strings to f-strings by setting up an
                | environment and then comparing the existing format to
                | the proposed new format, and it did ~a hundred
                | conversions flawlessly (manual review, unit tests,
                | testing and using in staging, roll out to production,
                | no reported errors).
        
               | zelphirkalt wrote:
                | Beware that converting every %-format string into an
                | f-string might not be what you want, especially when
                | it comes to logging:
               | https://blog.pilosus.org/posts/2020/01/24/python-f-
               | strings-i...
        
             | zer0tonin wrote:
             | Yeah, LLMs are absolutely terrible for GDscript and
             | anything gamedev related really. It's mostly because games
             | are typically not open source.
        
             | zelphirkalt wrote:
              | Generally, one has the choice of treating its output as
              | a black box or putting in the work of understanding it.
        
             | darkwater wrote:
             | > No doubt it will get better but for now I think coding-
             | by-AI is still only good for things that you already know
             | about, to just reduce typing time for boilerplate etc.;
             | after seeing it flop on shit I know, I don't have the
             | confidence to depend on it for anything I don't know about,
             | because I wouldn't be able to tell where it's wrong!
             | 
             | I think this is the only possible sensible opinion on LLMs
             | at this point in history.
        
               | simonw wrote:
               | I use it for things I don't know how to do all the
               | time... but I do that as a learning exercise for myself.
               | 
               | Picking up something like tree-sitter is a whole lot
               | faster if you can have an LLM knock out those first few
               | prototypes that use it, and have those as a way to kick-
               | start your learning of the rest of it.
        
             | simonw wrote:
             | The solution to "nonexistent APIs" is to use a coding agent
             | (Claude Code etc) that has access to tooling that lets it
             | exercise the code it's writing.
             | 
             | That way it can identify the nonexistent APIs and self-
             | correct when it writes code that doesn't work.
             | 
             | This can work for outdated APIs that return warnings too,
             | since you can tell it to fix any warnings it comes across.
             | 
             | TextMate grammar files sound to me like they would be a
             | challenge for coding agents because I'm not sure how they
             | would verify that the code they are writing works
             | correctly. ChatGPT just told me about vscode-tmgrammar-test
             | https://www.npmjs.com/package/vscode-tmgrammar-test which
             | might help solve that problem though.
        
               | Razengan wrote:
               | Not sure if LLMs would be suited for this, but I think an
               | ideal AI for coding would keep a language's entire
               | documentation and its source code (if available) in its
               | "context" as well as live (or almost live) views on the
               | discussion forums for that language/platform.
               | 
                | It would be awesome if, when a bug happens in my
                | Godot game, the AI already knows the Godot source so
                | it can figure out why and suggest a workaround.
        
               | simonw wrote:
               | One trick I have been using with Claude Code and Codex
               | CLI recently is to have a folder on my computer - ~/dev/
               | - with literally hundreds of GitHub repos checked out.
               | 
               | Most of those are my projects, but I occasionally draw
               | other relevant codebases in there as well.
               | 
               | Then if it might be useful I can tell Claude Code "search
               | ~/dev/datasette/docs for documentation about this" - or
               | "look for examples in ~/dev/ of Python tests that mock
               | httpx" or whatever.
        
           | troupo wrote:
           | I've found it to depend on the phase of the moon.
           | 
            | It goes from genius to idiot and back in the blink of an
            | eye.
        
             | Mtinie wrote:
             | In my experience that "blink of an eye" has turned out to
             | be a single moment when the LLM misses a key point or
             | begins to fixate on an incorrect focus. After that, it's
             | nearly impossible to recover and the model acts in
             | noticeably divergent ways from the prior behavior.
             | 
             | That single point is where the model commits fully to the
             | previous misunderstanding. Once it crosses that line,
             | subsequent responses compound the error.
        
               | troupo wrote:
                | For me it's also sometimes consecutive sessions, or
                | sessions on different days.
        
             | zelphirkalt wrote:
             | I do that too, when I code.
        
         | igravious wrote:
         | I've gotten Claude Code to port Ruby 3.4.7 to Cosmopolitan:
         | https://github.com/jart/cosmopolitan
         | 
          | I kid you not. It took between a week and ten days, and
          | cost about EUR10. After that I became a firm convert.
         | 
         | I'm still getting my head around how incredible that is. I tell
         | friends and family and they're like "ok, so?"
        
           | rogual wrote:
           | It seems like AIs work how non-programmers already thought
           | computers worked.
        
             | love2read wrote:
             | I love this, thank you
        
             | ACCount37 wrote:
             | That's apt.
             | 
              | One of the first things you learn in CS 101 is
              | "computers are impeccable at math and logic but have
              | zero common sense, and can easily understand megabytes
              | of code but not two sentences of instructions in plain
              | English."
             | 
             | LLMs break that old fundamental assumption. How people can
             | claim that it's not a ground-shattering breakthrough is
             | beyond me.
        
               | skydhash wrote:
                | Then build an LLM shell and make it your login shell,
                | and you'll see how well the computer understands
                | English.
        
             | zelphirkalt wrote:
             | "Why didn't you do that earlier?"
        
           | RealityVoid wrote:
            | I am incredibly curious how you did that. Did you just
            | tell it "port Ruby to Cosmopolitan" and let it crank away
            | for a week? Or what did you do?
            | 
            | I'll use these tools, and at times they give good
            | results. But I would not trust them to work that much on
            | a problem by themselves.
        
             | TechDebtDevin wrote:
              | It's a lie, or fake.
        
               | fzzzy wrote:
               | How does denial of reality help you?
        
               | TechDebtDevin wrote:
               | Calling people out is extremely satisfying.
        
               | Kiro wrote:
               | You wouldn't know anything about it considering you've
               | been wrong in all your accusations and predictions. Glad
               | to see no-one takes you seriously anymore.
        
               | TechDebtDevin wrote:
               | :eyes: Go back to the lesswrong comment section.
        
               | igravious wrote:
               | it's fake is it?
               | 
               | https://github.com/igravious/cosmoruby
        
             | igravious wrote:
              | I unzipped Ruby 3.4.7 into the appropriate place
              | (third-party) in the repo and explained what I wanted
              | (it used the Lua and Python ports for reference).
              | 
              | First it built the Cosmo make tooling integration, and
              | then we (ha, "we"!) started iterating and iterating,
              | compiling Ruby with the Cosmo compiler ... every time
              | we hit some snag, Claude Code would figure it out.
              | 
              | I would have completed it sooner but I kept hitting the
              | 5-hour session token limits on my Pro account.
              | 
              | https://github.com/igravious/cosmoruby
        
               | simonw wrote:
               | Looks like this is the relevant code https://github.com/j
               | art/cosmopolitan/compare/master...igravi...
        
           | darkwater wrote:
           | This seems cool! Can you share the link to the repository?
        
             | igravious wrote:
             | here you go, still early days, rough round the edges :)
             | 
             | https://github.com/igravious/cosmoruby
        
         | shevy-java wrote:
          | Although I dislike the AI hype, I do have to admit that
          | this is a good use case. You saved time here, right?
          | 
          | I personally still prefer the oldschool way, the slower way
          | - I write the code, I document it, I add examples, then if
          | I feel like it I add random cat images to the documentation
          | to make it appear less boring, so people also read things.
        
           | renegade-otter wrote:
            | The way I see it: if there is something USEFUL to learn,
            | I need to struggle and learn it. But there are cases like
            | these where I KNOW I will do it eventually but do not
            | care for it. There is nothing to learn. That's where I
            | use them.
        
           | layer8 wrote:
           | Random cat images would put me off reading the
           | documentation, because they distract from the content and
           | indicate a lack of professionalism. Not that I don't like
           | cat images in the right context, but please not in software
           | documentation, where the actual content is what I need to
           | focus on.
        
             | NoraCodes wrote:
             | > indicates a lack of professionalism
             | 
             | Appropriately, because OP is describing a hobby project.
             | Perhaps you could pay them for a version without cat
             | pictures.
        
       | zerosizedweasle wrote:
       | This place has just become pro-AI propaganda. Populism is
       | coming for AI, from both MAGA and the left.
       | 
       | https://www.bloomberg.com/news/articles/2025-11-19/how-the-p...
        
         | quantummagic wrote:
         | If it's just propaganda, it will fall of its own accord. If
         | it's not, there's no stopping it.
        
           | TechDebtDevin wrote:
           | Umm, no offense, but propaganda has the ability to hold up
           | false realities and narratives that do real damage to the
           | world for decades. Hell, there is literally propaganda
           | invented 75 years ago that is still effectively justifying
           | the killing of innocents today.
        
             | quantummagic wrote:
             | No offense taken; you're likely in the minority. The
             | loudest voices, anyway, believe the "bubble" and hype are
             | going to burst; that all the money is a scam and bound to
             | fail.
             | 
             | Enron, Theranos, and FTX were all massive propaganda
             | successes, until they weren't. Same with countless other
             | technologies that didn't live up to the hype.
             | 
             | And what counts as propaganda anyway? I can only speak
             | for myself, but I have had great success with the use of
             | AI. Anything positive I say about it isn't motivated by a
             | grand conspiracy I secretly want to bolster; it's just
             | honest feedback from personal experience.
        
               | TechDebtDevin wrote:
               | I mean, it's too big to fail. There are armies of
               | lobbyists psyopping every congressperson into thinking
               | those data centers are the only thing preventing China
               | from taking over the world. And the hyperscalers are
               | racing toward a moral hazard where they are "too big to
               | fail".
               | 
               | The governments of the world know they can hijack all
               | original thought, control education, and destroy the
               | power of a lot of labour. They won't ever let LLMs
               | fail. They want society completely reliant on them.
               | Just like they wouldn't let a tool like social media
               | fail, despite it being terrible for society: it has too
               | many benefits for governments wanting to control their
               | populations.
        
               | quantummagic wrote:
               | You're probably right: both the left and right seem
               | determined to plunge us into an authoritarian and
               | economically stratified society. But I can't help going
               | back to my earlier point: unlike all the other tech
               | that was hyped to the stratosphere, LLM tech has been a
               | huge plus for me personally. I just like it, and that
               | isn't part of the propaganda "machine".
        
               | visarga wrote:
               | I think that when it comes to LLMs, as with software
               | and books, usage is everything. You have to use it to
               | get a benefit; LLMs by themselves produce no utility.
               | And usage means a task coming from a person who puts
               | something at stake; it is contextual. It is both a cost
               | and a risk for the user, and the benefits accrue to the
               | user. So LLMs are actually just a cheap utility, while
               | context is king. It is democratizing.
        
         | TechDebtDevin wrote:
         | Thank you. It's literally just for YC et al. to pump their
         | book, and for those in literal states of delusion to drool.
        
         | linsomniac wrote:
         | I think it's just as accurate to say that this place has
         | become anti-AI propaganda.
         | 
         | Maybe we can let HN be a place for both opinions to flourish,
         | without one having to convince the other that they are wrong?
        
       | runeks wrote:
       | > I only interacted with the agent by telling it to implement a
       | thing and write tests for it, and I only really reviewed the
       | tests.
       | 
       | Did you also review the code that _runs_ the tests?
        
         | mjaniczek wrote:
         | Yes :)
        
       | andsoitis wrote:
       | > And it did it.
       | 
       | It would be nice if people doing these things gave us a
       | transcript or recording of their dialog with the LLM, so that
       | more people can learn.
        
         | chrisweekly wrote:
         | Yes! This. It'd take so little effort to share, thereby
         | validating your credibility, providing value, teaching...
         | it's so full of win I can't understand why so few people do
         | this.
        
           | mjaniczek wrote:
           | In my case, I can't share them anymore because "the
           | conversation expired". I am not completely sure what the
           | Cursor Agent rules for conversations expiring are. The PR
           | getting closed? Branch deleted?
           | 
           | In any case, the first prompt was something like (from
           | memory):
           | 
           | > I am imagining a language FAWK - Functional AWK - which
           | > would stay as close to the AWK syntax and feel as
           | > possible, but add several new features to aid with
           | > functional programming. Backwards compatibility is a
           | > non-goal.
           | >
           | > The features:
           | >
           | > * first-class array literals, being able to return
           | >   arrays from functions
           | > * first-class functions and lambdas, being able to pass
           | >   them as arguments and return them from functions
           | > * lexical scope instead of dynamic scope (no spooky
           | >   action at a distance, call-by-value, mutations of an
           | >   argument array aren't visible in the caller scope)
           | > * explicit global keyword (only in BEGIN) that makes
           | >   variables visible and mutable in any scope without
           | >   having to pass them around
           | >
           | > Please start by succinctly summarizing this in the
           | > README.md file, alongside code examples.
           | 
           | The second prompt (for the actual implementation) was
           | something like this, I believe:
           | 
           | > Please implement an interpreter for the language
           | > described in the README.md file in Python, to the point
           | > that the code examples all work (make a test runner
           | > that tests them against expected output).
           | 
           | I then spent a few iterations asking it to split the
           | single file containing all the code into multiple files
           | (one per stage, so e.g. lexer, parser, ...) before
           | merging the PR and then doing more stuff manually (moving
           | tests to their own folder, etc.)
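           | 
           | The runner that second prompt asks for can be tiny. A
           | minimal sketch of the idea (the tests/ layout and the
           | fawk.py entry point are assumptions, not necessarily how
           | the repo is actually laid out):
           | 
           |     import pathlib
           |     import subprocess
           |     import sys
           | 
           |     # Run each example program and diff its stdout
           |     # against a golden file sitting next to it.
           |     def main() -> int:
           |         failures = 0
           |         tests = sorted(pathlib.Path("tests").glob("*.fawk"))
           |         for program in tests:
           |             expected = program.with_suffix(".expected").read_text()
           |             result = subprocess.run(
           |                 [sys.executable, "fawk.py", str(program)],
           |                 capture_output=True, text=True,
           |             )
           |             status = "PASS" if result.stdout == expected else "FAIL"
           |             failures += status == "FAIL"
           |             print(f"{status} {program}")
           |         return 1 if failures else 0
           | 
           |     if __name__ == "__main__":
           |         raise SystemExit(main())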
        
             | andsoitis wrote:
             | It stands to reason that if it was fairly quick (from your
             | telling) and you can vaguely remember, then you should be
             | able to reproduce a transcript with a working interpreter a
             | second time.
             | 
             | To be clear: I'm not challenging your story, I want to
             | learn from it.
        
             | chrisweekly wrote:
             | Thank you! Great reply, much appreciated.
        
       | williamcotton wrote:
       | I've been working on my own web app DSL, with most of the
       | typing done by Claude Code, e.g.:
       | 
       |     GET /hello/:world
       |       |> jq: `{ world: .params.world }`
       |       |> handlebars: `<p>hello, {{world}}</p>`
       | 
       |     describe "hello, world"
       |       it "calls the route"
       |         when calling GET /hello/world
       |         then status is 200
       |         and output equals `<p>hello, world</p>`
       | 
       | Here's a WIP article about the DSL:
       | 
       | https://williamcotton.com/articles/introducing-web-pipe
       | 
       | And the DSL itself (written in Rust):
       | 
       | https://github.com/williamcotton/webpipe
       | 
       | And an LSP for the language:
       | 
       | https://github.com/williamcotton/webpipe-lsp
       | 
       | And of course my blog is built on top of Web Pipe:
       | 
       | https://github.com/williamcotton/williamcotton.com/blob/mast...
       | 
       | It is absolutely amazing that a solo developer (with a demanding
       | job, kids, etc) with just some spare hours here and there can
       | write all of this with the help of these tools.
        
         | keepamovin wrote:
         | I like this syntax. And yes, it's amazing. And fun, so fun!
        
         | shevy-java wrote:
         | That is impressive, but it also looks like a babelfish
         | language. The |> seems to have been inspired by Elixir? But
         | this is like a mish-mash of JavaScript-like entities, and
         | then Rust is also used? It also seems rather verbose. I
         | mean, it's great that it did not require a lot of effort,
         | but why would people favour this over a less verbose DSL?
        
           | williamcotton wrote:
           | > _babelfish language_
           | 
           | Yes, exactly! It's more akin to a bash pipeline, but instead
           | of plain text flowing through sed/grep/awk/perl it uses json
           | flowing through jq/lua/handlebars.
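           | 
           | In spirit, something like this tiny Python sketch (with
           | made-up stand-ins for the jq and handlebars stages; the
           | real runtime is Rust):
           | 
           |     # Each stage takes a JSON-ish dict and returns the
           |     # next value; the last stage renders it to a string.
           |     def jq_stage(data):
           |         return {"world": data["params"]["world"]}
           | 
           |     def handlebars_stage(data):
           |         return "<p>hello, {}</p>".format(data["world"])
           | 
           |     def run(request, stages):
           |         value = request
           |         for stage in stages:  # JSON flows down the pipe
           |             value = stage(value)
           |         return value
           | 
           |     print(run({"params": {"world": "world"}},
           |               [jq_stage, handlebars_stage]))
           |     # -> <p>hello, world</p>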
           | 
           | > _The | > seems to have been inspired by Elixir_
           | 
           | For me, F#!
           | 
           | > _and then Rust is also used_
           | 
           | Rust is what the runtime is written in.
           | 
           | > _It also seems rather verbose._
           | 
           | IMO, it's rather terse, especially because it is more of a
           | configuration of a web application runtime.
           | 
           | > _why would people favour this_
           | 
           | I dunno why anyone would use this but it's just plain fun to
           | write your own blog in your own DSL!
           | 
           | The BDD-style testing framework being part of the language
           | itself does allow for some pretty interesting
           | language-server features, e.g. the LSP knows whether a
           | route being tested has actually been defined. So who
           | knows, maybe someone will find parts of it inspiring.
        
             | travisjungroth wrote:
             | > it's just plain fun to write your own blog in your own
             | DSL!
             | 
             | It's the perfect thing for skill development, too. Stakes
             | are low compared to a project at work, even one that's not
             | "mission critical".
        
         | vidarh wrote:
         | I like the pipe approach. I built a large web app with a
         | custom framework built around a pipeline years ago, and it
         | was an interesting way to decompose things.
        
         | mike_hearn wrote:
         | FWIW if someone wants a tool like this with better support,
         | JetBrains has defined a .http file format that contains a DSL
         | for making HTTP requests and running JS on the results.
         | 
         | https://www.jetbrains.com/help/idea/http-client-in-product-c...
         | 
         | There's a CLI tool for executing these files:
         | 
         | https://www.jetbrains.com/help/idea/http-client-cli.html
         | 
         | There's a substantially similar plugin for VSCode here:
         | https://github.com/Huachao/vscode-restclient
        
         | cdaringe wrote:
         | Cool! Have you seen https://camlworks.github.io/dream/ ?
         | 
         | I get OCaml isn't for everybody, but Dream is the web
         | framework I wish I had known first.
        
       | nbardy wrote:
       | They have been able to write languages for two years now.
       | 
       | I think I was the first to write an LLM language, and the
       | first to use LLMs to write a language, with this project
       | (right at ChatGPT launch, GPT-3.5):
       | https://github.com/nbardy/SynesthesiaLisp
        
       | shevy-java wrote:
       | But the question is: will the language suck?
       | 
       | I have a slight feeling it would suck even more than, say, PHP or
       | JavaScript.
        
         | mjaniczek wrote:
         | Yes, I'll only have an answer to this later, as I use it,
         | and there's a real chance my changes to the language won't
         | mix well with the original AWK. (Or is your comment more
         | about AWK sucking for programs larger than 30 LOC? I think
         | that's a given already.)
         | 
         | Thankfully, if that's the case, then I've only lost a few
         | hours """implementing""" the language, rather than
         | days/weeks/more.
        
       | girishso wrote:
       | > the basic human right of being allowed to return arrays from
       | functions
       | 
       | While working in C, I can't count the number of times I
       | wanted to return an array.
        
       | low_tech_love wrote:
       | Slightly off-topic: I have an honest question for all of you out
       | there who love Advent of Code, please don't take this the wrong
       | way, it is a real curiosity: what is it for you that makes the
       | AoC challenge so special when compared with all of the thousands
       | of other coding challenges/exercises/competitions out there? I've
       | been doing coding challenges for a long time and I never got
       | anything special out of AoC, so I'm really curious. Is it simply
       | that it reached a wider audience?
        
         | qsort wrote:
         | Personally it's the community factor. Everyone is doing the
         | same problem each day and you get to talk about it, discuss
         | with your friends, etc.
        
           | cdaringe wrote:
           | Community, plus problem solving in a low-stakes, fun
           | setting.
        
         | zelphirkalt wrote:
         | I think the corny stories about how the elves f up, and
         | their ridiculous machines and processes, add a lot of
         | flavor. It is not as dry as, for example, Project Euler,
         | which is great in its own right. And you collect ASCII-art
         | golden stars!
        
         | mjaniczek wrote:
         | I have only had some previous experience with Project Euler,
         | which I liked for the loop of "try to bruteforce it -> doesn't
         | work -> analyze the problem, exploit patterns, take shortcuts".
         | (I hit a skill ceiling after 166 problems solved.)
         | 
         | Advent of Code has this mass hysteria feel about it (in a good
         | sense), probably fueled by the scarcity principle / looking
         | forward to it as December comes closer. In my programming
         | circles, a bunch of people share frustration and joy over the
         | problems, compete in private leaderboards; there are people
         | streaming these problems, YouTubers speedrunning them or
         | solving them in crazy languages like Excel or Factorio... it's
         | a community thing, I think.
         | 
         | If I wanted to start doing something like LeetCode, it feels
         | like I'd be alone in there, though that's likely false and
         | there probably are Discords and forums dedicated to it. But
         | somehow it doesn't have the same appeal as AoC.
        
         | some_random wrote:
         | For me, it's a bunch of things. It happens once a year, so
         | it feels special. Many of my friends (and sometimes
         | coworkers) try it as well, so it turns into something to
         | chat about. Because they're one per day, they end up being
         | timeboxed: I can focus on just hammering out a solution, or
         | dig in and optimize, but I can't move on, so when I'm done
         | for the day, I'm done. It's also pretty nostalgic for me; I
         | started working on it in high school.
        
       | timonoko wrote:
       | Gemini tried to compile a 10,000-line Microsoft Assembler
       | program to Linux Assembler. The scariest thing was that it
       | seemed to know exactly what the program was doing. And
       | eventually it said:
       | 
       |     I'm sorry Dave, I'm afraid I can't do that. I cannot
       |     implement this 24-bit memory model.
        
       | skvmb wrote:
       | I got ChatGPT 5 to one-shot a JavaScript-to-stack-machine
       | compiler, just to see if it could. It doesn't cover all the
       | language features of course, but it does cover most of the
       | basics. If anyone is interested I can put it on GitHub after
       | I get off work today.
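       | 
       | For anyone who hasn't seen one, the target of such a compiler
       | can be tiny. A toy sketch (my illustration, not skvmb's
       | compiler) of a stack machine evaluating 1 + 2 * 3:
       | 
       |     # Each instruction pushes to or pops from an operand
       |     # stack; a JS-to-stack-machine compiler emits sequences
       |     # like this.
       |     def run(bytecode):
       |         stack = []
       |         for op, *args in bytecode:
       |             if op == "PUSH":
       |                 stack.append(args[0])
       |             elif op == "ADD":
       |                 b, a = stack.pop(), stack.pop()
       |                 stack.append(a + b)
       |             elif op == "MUL":
       |                 b, a = stack.pop(), stack.pop()
       |                 stack.append(a * b)
       |         return stack.pop()
       | 
       |     print(run([("PUSH", 1), ("PUSH", 2), ("PUSH", 3),
       |                ("MUL",), ("ADD",)]))  # 7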
        
       | rpcope1 wrote:
       | I feel like Larry Wall must have thought basically the same
       | things when he came up with Perl: what if I had awk, but with
       | just a few more extras and nice things? (Not to say that Perl
       | is a bad language at all.)
        
       | root_axis wrote:
       | It'd be interesting to see how well the LLM would be able to
       | write code using the new language since it doesn't exist in the
       | training data.
        
         | ModernMech wrote:
         | I've tested this: the LLM will tend to strongly
         | pattern-match to the syntactically closest language, so if
         | your language is too divergent you have to continually
         | remind it of your syntax or semantics. But if your language
         | is just a skin over C or JavaScript, then it'll do fine.
        
       | runeks wrote:
       | I think it would be super interesting to see how the LLM
       | handles _extending/modifying_ the code it has written, i.e.
       | adding/removing features, in order to simulate the life cycle
       | of a normal software project. After all, LLM-produced code
       | would only be of limited use if the LLM were worse at adding
       | new features to it than humans are.
       | 
       | As I understand it, this would require somehow "saving the
       | state" of the LLM as it exists after the last prompt, since I
       | don't think the LLM can arrive at the same state by just
       | being fed the code it has written.
        
         | Philpax wrote:
         | I described my experience using Claude Code Web to vibe-code a
         | language interpreter here [0], with a link to the closed PRs
         | [1].
         | 
         | As it turns out, you don't really need to "save the state";
         | with decent-enough code and documentation (both of which the
         | LLM can write), it can figure out what needs to be done and go
         | from there. This is obviously not perfect - and a human
         | developer with a working memory could get to the problem faster
         | - but its reorientation process is fast enough that you
         | generally don't have to worry about it.
         | 
         | [0]: https://news.ycombinator.com/item?id=46005813 [1]:
         | https://github.com/philpax/perchance-interpreter/pulls?q=is%...
        
         | rogeliodh wrote:
         | They are very good at understanding current code and its
         | architecture, so there's no need to save state. In any
         | case, it is good to explicitly ask them to generate proper
         | comments for their architectural decisions and to keep an
         | updated AGENT.md file.
        
       | Philpax wrote:
       | I've also had success with this. One of my hobby horses is a
       | second, independent implementation of the Perchance language for
       | creating random generators [0]. Perchance is genuinely very cool,
       | but it was never designed to be embedded into other things, and
       | I've always wanted a solution for that.
       | 
       | Anyway, I have/had an obscene amount of Claude Code Web credits
       | to burn, so I set it to work on implementing a completely
       | standalone Rust implementation of Perchance using documentation
       | and examples alone, and, well, it exists now [1]. And yes, it was
       | done entirely with CCW [2].
       | 
       | It's deterministic, can be embedded anywhere that Rust compiles
       | to (including WASM), has pretty readable code, is largely pure
       | (all I/O is controlled by the user), and features high-quality
       | diagnostics. As proof of it working, I had it build and set up
       | the deploys for a React frontend [3]. This also features an
       | experimental "trace" feature that Perchance-proper does not have,
       | but it's experimental because it doesn't work properly :p
       | 
       | Now, I can't be certain it's 1-for-1 spec-accurate, as the
       | documentation does not constitute a spec, and we're dealing
       | with randomness, but it's close enough to be satisfactory for
       | my use cases. I genuinely think this is pretty damn cool:
       | with a few days of automated PRs, I have a second,
       | independent, mostly-complete interpreter for a language that
       | has never had one (previous attempts, including my own,
       | fizzled out early).
       | 
       | [0]: https://perchance.org/welcome [1]:
       | https://github.com/philpax/perchance-interpreter [2]:
       | https://github.com/philpax/perchance-interpreter/pulls?q=is%...
       | [3]: https://philpax.me/experimental/perchance/
        
         | cdaringe wrote:
         | Fun stuff! I can also see using ICU MessageFormat v1/v2 for
         | this, sprinkling randomization into the skeletons.
        
         | davidsainez wrote:
         | Thanks for sharing. I hear people make extraordinary claims
         | about LLMs (not saying that is what you are doing), but
         | it's hard to evaluate exactly what they mean without seeing
         | the results. I've been working on a similar project (a
         | static analysis tool) and I've been using Sonnet 4.5 to
         | help me build it. On cursory review it produces acceptable
         | results, but closer inspection reveals obvious performance
         | or architectural mistakes. In its current state, one-shotted
         | LLM code feels like wood filler: very useful in many cases,
         | but I would not trust it to be load-bearing.
        
           | Philpax wrote:
           | I'd agree with that, yeah. If this was anything more
           | important, I'd give it much more guidance, lay down the core
           | architectural primitives myself, take over the reins more in
           | general, etc - but for what this is, it's perfect.
        
       | l9o wrote:
       | I've been working on something similar, a typed shell scripting
       | language called shady (hehe). haven't shared it because like 99%
       | of the code was written by claude and I'm definitely not a
       | programming language expert. it's a toy really.
       | 
       | but I learned a ton building this thing. it has an LSP server
       | now with autocompletion and go-to-definition, a type checker,
       | a very broken auto-formatter (this was surprisingly harder to
       | get done than the LSP), the whole deal. all the stuff that
       | previously would have taken months or a whole team to build.
       | there are tons of bugs and it's not something I'd use for
       | anything; nu shell is obviously way better.
       | 
       | the language itself is pretty straightforward. you write
       | functions that manipulate processes and strings, and any
       | public function automatically becomes a CLI command. so if
       | you write "public deploy $env: str $version: str = ..." you
       | get a ./script.shady deploy command with proper --help and
       | everything. it does this by converting the function
       | signatures into clap commands.
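       | 
       | the same trick sketched in python, with argparse standing in
       | for clap (hypothetical illustration; shady itself is rust):
       | 
       |     import argparse
       |     import inspect
       | 
       |     # turn each public function's signature into a
       |     # subcommand, the way shady turns signatures into clap
       |     # commands
       |     def make_cli(public_functions):
       |         parser = argparse.ArgumentParser()
       |         subs = parser.add_subparsers(dest="command", required=True)
       |         for fn in public_functions:
       |             sub = subs.add_parser(fn.__name__, help=fn.__doc__)
       |             for name, param in inspect.signature(fn).parameters.items():
       |                 sub.add_argument(name, type=param.annotation)
       |             sub.set_defaults(fn=fn)
       |         args = parser.parse_args()
       |         kwargs = {k: v for k, v in vars(args).items()
       |                   if k not in ("command", "fn")}
       |         args.fn(**kwargs)
       | 
       |     def deploy(env: str, version: str):
       |         """deploy a version to an environment"""
       |         print(f"deploying {version} to {env}")
       | 
       |     if __name__ == "__main__":
       |         make_cli([deploy])  # ./script.py deploy prod 1.2.3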
       | 
       | while building it I had lots of process pipelines
       | deadlocking, type errors pointing at the wrong spans, that
       | kind of thing. it seems like LLMs really struggle to
       | understand race conditions and the concept of time, but they
       | seem to be getting better. fixed a 3-process pipeline hanging
       | bug last week that required actually understanding how the
       | pipe handles worked. but as others pointed out, I have also
       | been impressed at how frequently sonnet 4.5 writes working
       | code if given a bit of guidance.
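       | 
       | the classic shape of those deadlocks, in python for
       | illustration (a guess at the category of bug, not shady's
       | actual code):
       | 
       |     import subprocess
       | 
       |     # deadlock shape: the child fills its stdout pipe buffer
       |     # and blocks writing, while the parent blocks writing to
       |     # the child's stdin; nobody reads, nobody progresses:
       |     #   proc.stdin.write(big); proc.stdin.close()
       |     #   proc.stdout.read()
       |     # communicate() avoids it by pumping both ends at once.
       |     proc = subprocess.Popen(["cat"], stdin=subprocess.PIPE,
       |                             stdout=subprocess.PIPE)
       |     out, _ = proc.communicate(input=b"x" * 1_000_000)
       |     assert len(out) == 1_000_000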
       | 
       | one thing that blew my mind: I started with pest for parsing but
       | when I got to the LSP I realized incremental parsing would be
       | essential. because I was diligent about test coverage, sonnet 4.5
       | perfectly converted the entire parser to tree-sitter for me. all
       | tests passed. that was wild. earlier versions of the model,
       | like 3.5 or 3.7, struggled with Rust quite a bit in my
       | experience.
       | 
       | claude wrote most of the code but I made the design decisions and
       | had to understand enough to fix bugs and add features. learned
       | about tree-sitter, LSP protocol, stuff I wouldn't have touched
       | otherwise.
       | 
       | still feels kinda lame to say "I built this with AI" but also...
       | I did build it? and it works? not sure where to draw the line
       | between "AI did it" and "AI helped me do it"
       | 
       | anyway just wanted to chime in from someone else doing this kind
       | of experiment :)
        
         | simonw wrote:
         | "because I was diligent about test coverage, sonnet 4.5
         | perfectly converted the entire parser to tree-sitter for me.
         | all tests passed."
         | 
         | I often suspect that people who complain about getting poor
         | results from agents haven't yet started treating automated
         | tests as a _hard requirement_ for working with them.
         | 
         | If you don't have substantial test coverage your coding agents
         | are effectively flying blind. If you DO have good test coverage
         | prompts like "port this parser to tree-sitter" become
         | surprisingly effective.
        
           | l9o wrote:
           | yes, completely agree. having some sort of guardrails for the
           | LLM is _extremely_ important.
           | 
           | in the earlier models I would sometimes write tests for
           | checking that my coding patterns were being followed
           | correctly. basic things like certain files/subclasses being
           | in the correct directories, making sure certain dunder
           | methods weren't being implemented in certain classes where I
           | noticed models had a tendency to add them, etc.
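           | 
           | one of those guardrail tests, roughly (the path and the
           | banned dunder are made-up examples):
           | 
           |     import pathlib
           | 
           |     # fail the suite if a model class grows a dunder the
           |     # LLM tended to add unprompted
           |     def test_no_setattr_in_models():
           |         for path in pathlib.Path("src/models").rglob("*.py"):
           |             assert "__setattr__" not in path.read_text(), path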
           | 
           | these were all things that I'd notice the models would
           | often get wrong and that would typically be more of a lint
           | warning in a more polished codebase. while a bit annoying
           | to set up, it would vastly improve the speed and success
           | rate at which the models were able to solve tasks for me.
           | 
           | nowadays many of those don't seem to be as necessary. it's
           | impressive to see how the models are evolving.
        
       | cmrdporcupine wrote:
       | _" The downside of vibe coding the whole interpreter is that I
       | have zero knowledge of the code."_
       | 
       | This is exactly the problem. When I first got my mitts on
       | Claude Code I went bonkers with this kind of thing. Write my
       | own JITing Lisp in a weekend? Yes please! Finish my
       | one-third-done WASM VM that I shelved? Sure!
       | 
       | The problem is that you dig too deep and unearth the Balrog
       | of "how TF does this work?" You're creating future problems
       | for yourself.
       | 
       | The next frontier for coding agents is for these companies to
       | solve the UX problem of keeping the human involved, in the
       | driver's seat, and educated about what's happening.
        
       | jph00 wrote:
       | There's already a language that provides all the features of awk
       | plus modern language conveniences, and is available on every
       | system you can think of. It's Perl.
       | 
       | It even comes with an auto translator for converting awk to Perl:
       | https://perldoc.perl.org/5.8.4/a2p
       | 
       | It also provides all the features of sed.
       | 
       | The command-line flags to learn about to get all these
       | features are: -p -i -n -l -a -e
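       | 
       | For example (standard equivalences, using only those
       | documented perlrun flags):
       | 
       |     awk '{ print $2 }' file        # field extraction
       |     perl -lane 'print $F[1]' file  # same: -a autosplits into @F
       | 
       |     sed 's/foo/bar/' file          # stream editing
       |     perl -pe 's/foo/bar/' file     # same: -p loops and prints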
        
         | groby_b wrote:
         | Yes, but it's not in any way relevant to the topic of the
         | article, except that both mention awk.
         | 
         | The author specifically wanted a functional variant of awk,
         | and they wrote the article because it meant updating their
         | priors on LLMs. Both are interesting topics.
         | 
         | I'd _love_ to hear a Perl perspective on either.
        
       | fsmv wrote:
       | I await your blog post about how it only appeared to work at
       | first and then had major problems when you actually dug in.
        
         | ht-syseng wrote:
         | I just looked at the code. The AST module:
         | 
         | https://github.com/Janiczek/fawk/pull/2/files#diff-b531ba932...
         | 
         | has 167 lines, and the interpreter module:
         | 
         | https://github.com/Janiczek/fawk/pull/2/files#diff-a96536fc3...
         | 
         | has 691 lines. I expect it would work, as FAWK seems to be
         | a very simple language. I'm currently working on a similar
         | project with a different language, and the equivalent AST
         | module is around 20,000 lines and only partially
         | implemented according to the standard. I have tried to use
         | LLMs without any luck. In addition to the language size,
         | something they currently fail at seems to be, for lack of a
         | better description, "understanding the propagation of
         | changes across a complex codebase where the combinatoric
         | space of behavioral effects of any given change is
         | massive". When I ask Claude to help in the codebase I'm
         | working in, it starts making edits and going down paths I
         | know are dead ends, and I end up having to spend way more
         | time explaining to it why things wouldn't work than if I
         | had just implemented it myself...
         | 
         | We seem to be moving in the right direction, but I think
         | that, absent a fundamental change in model architecture,
         | we're going to end up with models that consume gigawatts to
         | do what a brain can do for 20 watts.
         | 
         | Maybe a metaphorical pointer to the underlying issue,
         | whatever it is: if a human sits down and works on a problem
         | for 10 hours, they will be fundamentally closer to having
         | solved it (a deeper understanding of the problem space),
         | whereas if you throw 10 hours' worth of human- or
         | LLM-generated context into an LLM and ask it to work on the
         | problem, it will perform significantly worse than if it had
         | no context, as context rot (sparse training data for the
         | "area" of the latent space associated with the prior
         | sequence of tokens) will degrade its performance. The
         | exception would be when the prior context is documentation
         | for how to solve the problem, in which case the LLM would
         | perform better, but then the problem was already solved. I
         | mention that case because I imagine it would be easy to
         | game a benchmark that intends to test this without actually
         | solving the underlying problem: building a system that can
         | dynamically create arbitrary novel representations of the
         | world around it and use those to make predictions and solve
         | problems.
        
       | alganet wrote:
       | > "Take a look at those tests!"
       | 
       | A math module that is not tested for division by zero. Classical
       | LLM development.
       | 
       | The suite is mostly happy paths, which is consistent with what
       | I've seen LLMs do.
       | 
       | Once you set up coverage and tell it "there's a hidden branch
       | that the report isn't able to display on line 95 that we need
       | to cover", things get less fun.
        
         | mjaniczek wrote:
         | It's entirely happy paths right now; it would be best to
         | allow the test runner to also test for failures (check
         | expected stderr and return code), and then we could write
         | those missing tests.
         | 
         | I think you can find a test somewhere in there with
         | commented-out code saying "FAWK can't do this yet, but
         | yadda yadda yadda".
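         | 
         | Roughly this shape (a sketch of that extension; the
         | *.stderr and *.exitcode golden-file conventions are
         | invented for illustration):
         | 
         |     import pathlib
         |     import subprocess
         |     import sys
         | 
         |     # Compare expected stderr and exit code, not just
         |     # stdout, so sad paths like division by zero become
         |     # testable.
         |     def check_failure_case(program: pathlib.Path) -> bool:
         |         result = subprocess.run(
         |             [sys.executable, "fawk.py", str(program)],
         |             capture_output=True, text=True,
         |         )
         |         want_stderr = program.with_suffix(".stderr").read_text()
         |         want_code = int(program.with_suffix(".exitcode").read_text())
         |         return (result.stderr == want_stderr
         |                 and result.returncode == want_code)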
        
           | alganet wrote:
           | It's funny, because I'm evaluating LLMs for just this
           | specific case (covering tests) right now, and it does that
           | a lot.
           | 
           | I say "we need 100% coverage on that critical file". It
           | runs for a while, tries to cover it, fails, then stops and
           | says "Success! We covered 60% of the file (the rest is too
           | hard). I added a comment." 60% was the coverage from
           | before the LLM ran.
        
       | evacchi wrote:
       | I've also been thinking about generating DSLs
       | https://blog.evacchi.dev/posts/2025/11/09/the-return-of-lang...
        
       | nl wrote:
       | I've done something similar here but for Prolog:
       | https://github.com/nlothian/Vibe-Prolog
       | 
       | It's interesting comparing what different LLMs can get done.
        
       | moriturius wrote:
       | It sure can! I'm creating my own language to do AoC in this
       | year! https://github.com/viro-lang/viro
        
       ___________________________________________________________________
       (page generated 2025-11-21 23:00 UTC)