[HN Gopher] Smartfunc: Turn Docstrings into LLM-Functions
       ___________________________________________________________________
        
       Smartfunc: Turn Docstrings into LLM-Functions
        
       Author : alexmolas
       Score  : 55 points
       Date   : 2025-04-08 09:43 UTC (2 days ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | shaism wrote:
       | Very cool. I implemented something similar for personal use
       | before.
       | 
       | At that time, LLMs weren't as proficient in coding as they are
       | today. Nowadays, the decorator approach might even go further and
       | not just wrap LLM calls but also write Python code based on the
       | description in the Docstring.
       | 
        | This would incentivize writing unambiguous docstrings and
       | guarantee (if the LLMs don't hallucinate) consistency between
       | code and documentation.
       | 
       | It would bring us closer to the world that Jensen Huang
       | described, i.e., natural language becoming a programming
       | language.
        
         | psunavy03 wrote:
         | People have been talking about natural language becoming a
         | programming language for way longer than even Jensen Huang has
         | been talking about it. Once upon a time, they tried to adapt
         | natural language into a programming language, and they came up
         | with this thing called COBOL. Same idea: "then the managers can
         | code, and we won't need to hire so many expensive devs!"
         | 
         | And now the COBOL devs are retiring after a whole career . . .
        
           | pizza wrote:
           | But isn't it actually more like, COBOL lets you talk in
           | COBOL-ese (which is kinda stilted), whereas LLMs let you talk
            | in LLM-ese (which gets a lot closer to actual language)? And
            | then, since the skill cap on language is basically infinite,
            | this becomes a question of how good you are at saying what
            | you want - to the extent it intersects with what the LLM can
            | do.
        
             | psunavy03 wrote:
             | COBOL was the best attempt that they could get to in the
             | 1960s. It's the entire reason COBOL has things like
             | paragraphs, things end with periods, etc. They wanted as
             | much of an "English-like syntax" as possible.
             | 
             | The reason it looks so odd today is that so much of modern
             | software is instead the intellectual heir of C.
             | 
             | And yeah, the "skill cap" of describing things is
              | theoretically infinite. My point was that this has been
              | tried before, and we don't yet know how close an actual LLM
              | comes to that ideal. People have been trying
             | for decades to describe things in English that still
             | ultimately need to be described in code for them to work;
             | that's why the software industry exists in the first place.
        
       | lukev wrote:
       | This is the way LLM-enhanced coding should (and I believe will)
       | go.
       | 
       | Treating the LLM like a compiler is a much more scalable,
       | extensible and composable mental model than treating it like a
       | junior dev.
        
         | simonw wrote:
          | smartfunc doesn't really treat the LLM as a compiler - it's not
          | generating Python code to fill out the function; it's
          | converting that function into one that calls the LLM every time
          | you call it, passing in its docstring as the prompt.
         | 
         | A version that DID work like a compiler would be super
         | interesting - it could replace the function body with generated
         | Python code on your first call and then reuse that in the
         | future, maybe even caching state on disk rather than in-memory.
        
           | hedgehog wrote:
           | I use something similar to this decorator (more or less a
           | thin wrapper around instructor) and have looked a little bit
           | at the codegen + cache route. It gets more interesting with
           | the addition of tool calls, but I've found JSON outputs
           | create quality degradation and reliability issues. My next
           | experiment on that thread is to either use guidance
           | (https://github.com/guidance-ai/guidance) or reimplement some
           | of their heuristics to try to get tool calling without 100%
           | reliance on JSON.
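            | For flavor, the kind of thin instructor wrapper I mean looks
            | roughly like this (decorator name and details made up):
            | 
            |     import inspect
            |     import instructor
            |     from openai import OpenAI
            |     
            |     client = instructor.from_openai(OpenAI())
            |     
            |     def llm_func(model="gpt-4o-mini"):
            |         # docstring becomes the prompt, the return annotation
            |         # becomes instructor's response_model
            |         def wrap(fn):
            |             returns = inspect.signature(fn).return_annotation
            |             def inner(**kwargs):
            |                 return client.chat.completions.create(
            |                     model=model,
            |                     response_model=returns,
            |                     messages=[{"role": "user",
            |                                "content": fn.__doc__.format(**kwargs)}],
            |                 )
            |             return inner
            |         return wrap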
        
           | toxik wrote:
           | Isn't that basically just Copilot but way more cumbersome to
           | use?
        
             | nate_nowack wrote:
              | no https://bsky.app/profile/alternatebuild.dev/post/3lg5a5fq4dc...
        
           | photonthug wrote:
            | Treating it as a compiler is obviously the way, right?
            | Setting aside overhead if you're using local models: either
            | the codegen is not deterministic, in which case you risk
            | random breakage, or it is deterministic and you've decided to
            | delete it anyway and punt on ever changing / optimizing it
            | except in natural language. Why would anyone prefer either
            | case? Code folding works fine if you just don't want to look
            | at it ever.
           | 
           | I can see this eventually going in the direction of
           | "bidirectional synchronization" of NL representation and code
            | representation (similar to how jupytext allows you to work
            | with notebooks in the browser or markdown in an editor). But
            | a single representation that's completely NL, deliberately
            | throwing away the code representation, sounds like it would
            | be the opposite of productivity...
        
           | huevosabio wrote:
            | Yes, that would indeed be very interesting.
           | 
            | I would like to try something like this in Rust:
            | 
            | - you use a macro to stub out the body of functions (so you
            |   just write the signature)
            | - the build step fills in the code and caches it
            | - on failures, the build step is allowed to change the
            |   function bodies generated by LLMs until it satisfies the
            |   test / compile steps
            | - you can then convert the satisfying LLM-generated function
            |   bodies into hard code (or leave them within the domain of
            |   "changeable by the llm")
           | 
            | It sandboxes what the LLM can actually alter, and makes the
            | generation happen in an environment where you can check right
            | away whether it was done correctly. Being Rust, you get a lot
            | more verification. And, crucially, it keeps you in the
            | driver's seat.
        
           | lukev wrote:
            | Ah, cool, didn't read closely enough.
           | 
           | Yeah, I do think that LLMs acting as compilers for super
           | high-level specs (the new "code") is a much better approach
           | than chatting with a bot to try to get the right code
           | written. LLM-derived code should not be "peer" to human-
           | written code IMO; it should exist at some subordinate level.
           | 
           | The fact that they're non-deterministic makes it a bit
           | different from a traditional compiler but as you say, caching
           | a "known good" artifact could work.
        
       | simonw wrote:
       | I really like how this integrates with the schema feature I added
       | to the underlying LLM Python library a few weeks ago:
       | https://simonwillison.net/2025/Feb/28/llm-schemas/#using-sch...
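        | Roughly, from memory (exact names may differ slightly): you
        | annotate the return type with a Pydantic model and the decorator
        | turns it into a schema for the model:
        | 
        |     from pydantic import BaseModel
        |     from smartfunc import backend
        |     
        |     class Summary(BaseModel):
        |         summary: str
        |         pros: list[str]
        |         cons: list[str]
        |     
        |     @backend("gpt-4o-mini")
        |     def summarize(text: str) -> Summary:
        |         """Summarize the following text: {{ text }}"""
        |         ...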
        
       | noddybear wrote:
       | Cool! Looks a lot like Tanuki:
       | https://github.com/Tanuki/tanuki.py
        
         | nate_nowack wrote:
          | yea, it's a popular DX at this point:
         | https://blog.alternatebuild.dev/marvin-3x/
        
       | miki123211 wrote:
        | There's also promptic, which wraps litellm, which supports many,
        | many, many more model providers, and it doesn't even need
        | plugins.
        | 
        | LLM is a cool CLI tool, but IMO litellm is a better Python
        | library.
        
         | simonw wrote:
         | I think LLM's plugin architecture is a better bet for
         | supporting model providers than the way LiteLLM does it.
         | 
         | The problem with LiteLLM's approach is that every model
         | provider needs to be added to the core library - in
         | https://github.com/BerriAI/litellm/tree/main/litellm/llms - and
         | then shipped as a new release.
         | 
          | LLM uses plugins because then there's no need to sync new
          | providers with the core tool. When a new Gemini feature comes
          | out I ship a new release of
          | https://github.com/simonw/llm-gemini - no need for a release of
          | core.
         | 
         | I can wake up one morning and LLM grew support for a bunch of
         | new models overnight because someone else released a plugin.
         | 
         | I'm not saying "LLM is better than LiteLLM" here - LiteLLM is a
         | great library with a whole lot more contributors than LLM, and
         | it's also been fully focused on being a great Python library -
         | LLM has also had more effort invested in the CLI aspect than
         | the Python library aspect so far.
         | 
         | I am confident that a plugin system is a better way to solve
         | this problem generally though.
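          | Concretely, picking up a new provider looks like:
          | 
          |     llm install llm-gemini
          |     llm models                    # the new models show up
          |     llm -m gemini-1.5-pro-latest "your prompt here"
          | 
          | (model ID just for illustration - llm models shows what the
          | plugin actually registered)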
        
       | asadm wrote:
       | I was working on a similar thing but for JS.
       | 
        | Imagine this: it would be cool if these functions essentially
        | boiled down to a distilled tiny model just for that
        | functionality, instead of an API call to a foundation one.
        
       | dheera wrote:
       | I often do the reverse -- have LLMs insert docstrings into large,
       | poorly commented codebases that are hard to understand.
       | 
       | Pasting a piece of code into an LLM with the prompt "comment the
       | shit out of this" works quite well.
        
         | simonw wrote:
         | Matheus Pedroni released a really clever plugin for doing that
         | with LLM the other day: https://mathpn.com/posts/llm-docsmith/
         | 
          | You run it like this:
          | 
          |     llm install llm-docsmith
          |     llm docsmith ./scripts/main.py
         | 
         | And it uses a Python concrete syntax tree (with
         | https://pypi.org/project/libcst/) to apply changes to just the
         | docstrings without risk of editing any other code.
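          | The libcst approach, very roughly (a toy sketch, not the
          | plugin's actual code - it adds a placeholder docstring to
          | undocumented functions and leaves everything else untouched):
          | 
          |     import libcst as cst
          |     
          |     class DocstringOnly(cst.CSTTransformer):
          |         def leave_FunctionDef(self, original_node, updated_node):
          |             if updated_node.get_docstring() is not None:
          |                 return updated_node
          |             doc = cst.SimpleStatementLine(
          |                 body=[cst.Expr(cst.SimpleString('"""TODO: document me."""'))]
          |             )
          |             # assumes an indented function body (not a one-liner)
          |             new_body = updated_node.body.with_changes(
          |                 body=[doc, *updated_node.body.body]
          |             )
          |             return updated_node.with_changes(body=new_body)
          |     
          |     module = cst.parse_module(open("scripts/main.py").read())
          |     print(module.visit(DocstringOnly()).code)  # formatting preserved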
        
       | nonethewiser wrote:
       | Funny. I frequently give the LLM the function and ask it to make
       | the doc string.
       | 
        | TBH I find docstrings very tedious to write. I can see how this
        | would be a great specification for an LLM, but I don't know that
        | it's actually better than a plain-text description of the
        | function, since LLMs can handle those just fine and they are
        | easier to write.
        
       | senko wrote:
        | Many libraries with the same approach suffer from the same flaw:
        | you can't easily use the same function with different LLMs at
        | runtime (i.e. after importing the module where it is defined).
       | 
        | I initially used the same approach in my library, but changed it
        | to explicitly pass the llm object around; in actual production
        | code that's easier and more flexible to use.
       | 
        | Examples (2nd one also with docstring-based llm query and
        | structured answer):
        | https://github.com/senko/think?tab=readme-ov-file#examples
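        | A sketch of the difference, using the llm library for
        | illustration (think's actual API differs):
        | 
        |     import llm
        |     
        |     # explicit style: the caller passes the model object in,
        |     # instead of the decorator fixing it at definition time
        |     def summarize(model, text):
        |         return model.prompt(f"Summarize this:\n{text}").text()
        |     
        |     summarize(llm.get_model("gpt-4o-mini"), "some text")
        |     summarize(llm.get_model("gpt-4o"), "some text")  # swap at runtime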
        
       ___________________________________________________________________
       (page generated 2025-04-10 23:01 UTC)