[HN Gopher] Betting on DSPy for Systems of LLMs
___________________________________________________________________
Betting on DSPy for Systems of LLMs
Author : wavelander
Score : 73 points
Date : 2024-08-11 02:11 UTC (20 hours ago)
(HTM) web link (blog.isaacmiller.dev)
(TXT) w3m dump (blog.isaacmiller.dev)
| okigan wrote:
| Could we have a concise and specific explanation of how DSPy
| works?
|
| All I've seen are vague definitions of new terms (e.g.,
| "signatures") and "trust me, this is very powerful and will
| optimize it all for you".
|
| Also, what would be a good way to decide between DSPy and
| TextGrad?
| curious_cat_163 wrote:
| My understanding is that it tries many variations of the set of
| few-shot examples and prompts and picks the ones that work best
| as the optimized program.
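|
| In code, a minimal sketch of that loop might look like this
| (model name, metric, and toy trainset are illustrative, not
| taken from DSPy's docs):
|
|     import dspy
|     from dspy.teleprompt import BootstrapFewShot
|
|     # Illustrative model choice; any DSPy-supported LM works.
|     dspy.settings.configure(lm=dspy.OpenAI(model="gpt-3.5-turbo"))
|
|     class QA(dspy.Signature):
|         """Answer the question concisely."""
|         question = dspy.InputField()
|         answer = dspy.OutputField()
|
|     program = dspy.ChainOfThought(QA)
|
|     # The metric is an ordinary per-example Python function.
|     def exact_match(example, pred, trace=None):
|         return example.answer.lower() == pred.answer.lower()
|
|     # Hypothetical toy trainset.
|     trainset = [
|         dspy.Example(question="What is 2+2?", answer="4")
|         .with_inputs("question"),
|     ]
|
|     # The "optimization": bootstrap few-shot demos that pass the
|     # metric and attach them to the program's prompts.
|     optimizer = BootstrapFewShot(metric=exact_match)
|     optimized = optimizer.compile(program, trainset=trainset)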
| ktrnka wrote:
| TextGrad mainly optimizes the prompt text but does not inject
| few-shot examples. DSPy mainly optimizes the few-shot examples.
|
| At least that's my understanding from reading the TextGrad
| paper recently.
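|
| For contrast, a rough sketch along the lines of TextGrad's
| quickstart (engine name illustrative; check the TextGrad README
| for the exact API): it treats a piece of text as a variable and
| "backpropagates" natural-language feedback into it.
|
|     import textgrad as tg
|
|     tg.set_backward_engine("gpt-4o", override=True)
|     model = tg.BlackboxLLM("gpt-4o")
|
|     question = tg.Variable(
|         "If 3x + 1 = 10, what is x?",
|         role_description="question to the LLM",
|         requires_grad=False,
|     )
|     answer = model(question)
|     answer.set_role_description("concise answer to the question")
|
|     # Natural-language "loss": an evaluator critiques the answer.
|     loss_fn = tg.TextLoss("Evaluate this answer very critically.")
|     loss = loss_fn(answer)
|
|     loss.backward()                       # textual gradients
|     tg.TGD(parameters=[answer]).step()    # rewrite the variable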
| bart_spoon wrote:
| The more I've looked at DSPy, the less impressed I am. The design
| of the project is very confusing, with nonsensical, convoluted
| abstractions. And for all the discussion surrounding it, I've yet
| to see someone actually _using_ it for something other than a toy
| example. I'm not sure I've even seen someone prove it can do what
| it claims to in terms of prompt optimization.
|
| It reminds me very much of LangChain in that it feels like a
| rushed, unnecessary set of abstractions that add more friction
| than actual benefit, and ultimately it boils down to an attempt
| to stake a claim as a major framework in the still very young
| stages of LLMs, as opposed to solving an actual problem.
| Der_Einzige wrote:
| Agreed 100%. DSPy, along with the libraries it inspired (e.g.,
| https://github.com/zou-group/textgrad), is nothing more than
| fancy prompt chains under the hood.
|
| These libraries mostly exist as "cope" for the fact that we
| don't have good fine-tuning (e.g., LoRA) capabilities for
| ChatGPT et al., so we try to optimize the prompt instead.
| qeternity wrote:
| Glad to see others saying this. I haven't looked at it in
| some months, but I previously realized it's mostly a very
| complicated way to optimize few-shot learning prompts. It's
| hardly the magical black-box optimizer they try to market
| it as.
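|
| You can see this for yourself: after compiling, the "optimized
| program" is mostly bootstrapped demos attached to each
| predictor. A hedged sketch, reusing the `optimized` program
| from the example above:
|
|     # Each predictor carries the few-shot demos the optimizer
|     # selected; the "optimization" is plainly inspectable.
|     for name, predictor in optimized.named_predictors():
|         print(name, "->", len(predictor.demos), "demos")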
| dmarchand90 wrote:
| My guess is it will be like Pascal or Smalltalk: an important
| development for illustrating a concept, but ultimately
| replaced by something more rigorous.
| isaacbmiller wrote:
| > _These libraries mostly exist as "cope"_
|
| > _nothing more than fancy prompt chains under the hood_
|
| Some approaches using steering vectors, clever ways of fine-
| tuning, transfer decoding, some tree search sampling-esque
| approaches, and others all seem very promising.
|
| DSPy is, yes, ultimately a fancy prompt chain. But even once we
| integrate some of the other approaches, I don't think it
| becomes a single-lever problem where we can change only one
| thing (e.g., fine-tune a model) and that solves all of our
| problems.
|
| It will likely always be a combination of the few most
| powerful levers to pull.
| Der_Einzige wrote:
| Correct. When I say "ChatGPT et al.", I mean closed-source,
| paywalled LLMs; open-access LLM personalization is an
| extreme game changer. All of what you mentioned is
| important, and I'm particularly excited about PyReft:
|
| https://github.com/stanfordnlp/pyreft
|
| Anything Christopher Manning touches turns to gold.
| curious_cat_163 wrote:
| The abstractions could be cleaner. I think some of the
| convolution is due to the evolution the project has undergone;
| the core contributors have not yet fully gone "out with the
| old".
|
| I think there might be practical benefits to it. The XMC
| example illustrates them for me:
|
| https://github.com/KarelDO/xmc.dspy
| isaacbmiller wrote:
| Disclaimer: original blog author
|
| > _as opposed to solving an actual problem_
|
| This was literally the point of the post. No one really knows
| what the future of LLMs will look like, so DSPy just
| iteratively changes in the best way it can for your metric
| (your problem).
|
| > _someone actually using it for something other than a toy
| example_
|
| DSPy does have some scalability problems, among the other issues
| I listed in the post, and I won't pretend otherwise. But there
| are at least early signs of enterprise adoption, e.g. this
| Databricks blog post:
| https://www.databricks.com/blog/optimizing-databricks-llm-pi...
| isoprophlex wrote:
| The magic sauce seems to be, at every turn, "... if you have
| some well-defined metric to optimize on."
|
| And that's not really a given in reality. Having such a metric
| enables all sorts of tricks toward what DSPy is aiming for,
| tricks you won't be able to pull off in real life.
|
| Unless I'm sorely mistaken, that's my take on the whole
| thing.
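|
| To make "metric" concrete: in DSPy it is just a per-example
| Python function, which is exactly the limitation. A hedged
| sketch (the fuzzy word-overlap heuristic is my own
| illustration, not something DSPy ships):
|
|     # A metric sees one gold example and one prediction at a
|     # time; anything fuzzier than this gets hard to write down.
|     def fuzzy_metric(example, pred, trace=None):
|         gold = set(example.answer.lower().split())
|         got = set(pred.answer.lower().split())
|         return len(gold & got) / max(len(gold), 1) > 0.5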
| revskill wrote:
| Whenever I see "ChainOfThought" for AI, it strikes me as an
| annoying and misleading term. The machine never thinks at all.
| fsndz wrote:
| I tried it recently and it is kinda fun:
| https://www.lycee.ai/courses/a5b7d115-c794-410d-92f2-15d8f29...
| gunalx wrote:
| Not to say anything about DSPy, but I really liked the take on
| what we should use LLMs for.
|
| We need to stop doing useless reasoning stuff and find actual,
| fitting problems for the LLMs to solve.
|
| Current LLMs are not your DB manager (if they could be, your DB
| isn't real-world sized). They are not a developer. We have
| people for that.
|
| LLMs prove to be decent creative tools, classifiers, and Q&A
| answer generators.
| thatsadude wrote:
| I had a few problems with DSPy:
|
| * Multi-hop reasoning rarely works with real data in my case.
|
| * Impossible to define advanced metrics over the whole dataset.
|
| * No async support.
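|
| On the async point, one possible workaround (my own sketch,
| not a DSPy feature) is to push the synchronous calls onto
| worker threads:
|
|     import asyncio
|
|     # Run a compiled DSPy program over many inputs concurrently
|     # by wrapping each synchronous call in a thread.
|     async def run_batch(program, questions):
|         return await asyncio.gather(
|             *(asyncio.to_thread(program, question=q)
|               for q in questions)
|         )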
___________________________________________________________________
(page generated 2024-08-11 23:01 UTC)