[HN Gopher] Throw more AI at your problems
___________________________________________________________________
Throw more AI at your problems
Author : vsreekanti
Score : 59 points
Date : 2024-10-24 16:51 UTC (6 hours ago)
(HTM) web link (frontierai.substack.com)
(TXT) w3m dump (frontierai.substack.com)
| headcanon wrote:
| I'll stay out of the inevitable "You're just adding a band-aid!
| What are you really trying to do?" discussion since I kind of see
| the author's point and I'm generally excited about applying LLMs
| and ML to more tasks. One thing I've been thinking about is
| whether an agent (or collection of agents) can solve a problem
| initially in a non-scalable way through raw inference, but then
| develop code to make parts of the solution cheaper to run.
|
| For example, say I want to scrape a collection of sites. The
| agent would at first pass the whole HTML into the context to
| extract the data (expensive, but it works), but then there is
| another agent that sees this pipeline and says "hey, we can
| write a parser for this site so each scrape is cheaper", and
| iteratively replaces that segment in a way that does not
| disrupt the overall task.
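|
| Rough sketch of that handoff (the llm() helper and the prompts
| are made up, not any real framework's API):
|
|     # Stage 1 is expensive but general; stage 2 only replaces
|     # it once a generated parser reproduces stage 1's output.
|     def llm(prompt: str) -> str:
|         raise NotImplementedError  # plug in any chat client
|
|     def extract_via_inference(html: str) -> str:
|         return llm("Extract the name and price from:\n" + html)
|
|     def maybe_compile_parser(sample_html: str, expected: str):
|         code = llm("Write a Python function parse(html) that "
|                    "returns " + repr(expected) +
|                    " for this HTML:\n" + sample_html)
|         ns = {}
|         exec(code, ns)  # sandbox this in practice!
|         parse = ns["parse"]
|         return parse if parse(sample_html) == expected else None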
| malfist wrote:
| What do you mean the patient is bleeding out? We just need to
| use more bandaids!
| cyrillite wrote:
| Well, the standard advice for getting off the ground with most
| endeavours is "Do things that don't scale". Obviously scaling
| is nice, but sometimes it's cheaper and faster to brute force
| it and worry about the rest later.
|
| The unscalable thing is often like "buy it cheap, buy it twice",
| but it's also often like "buy it cheap, only fix it if you use
| it enough that it becomes unsuitable". Makers endorse both
| attitudes. Knowing which applies when is the challenging bit.
| westoncb wrote:
| Cool idea. It's a bit like what happens in human brains when we
| develop expertise at something too: start with general-purpose
| behaviors/thinking applied to a new specialized task, but if
| that task is repeated/important enough you end up developing
| "specialized circuitry" for it: you can perform the task more
| efficiently, often without requiring conscious thought.
| crooked-v wrote:
| With the current state of "AI", this strikes me as a "I had a
| problem and used AI, now I have two problems" kind of situation
| in most cases.
| ToucanLoucan wrote:
| All the snark contained within aside, I'm reminded of that
| ranting blog post from the person sick of AI that made the
| rounds a little ways back, which had one huge, cogent point:
| that the same companies that can barely manage to ship and
| maintain their current software are not magically going to
| overcome that organizational problem set by virtue of using
| LLMs. Once they add that in, then they're just going to have
| late-released, poorly made software that happens to have an LLM
| in it somewhere.
| WesleyLivesay wrote:
| I believe you mean this one:
| https://ludic.mataroa.blog/blog/i-will-fucking-piledrive-
| you...
| Mistletoe wrote:
| Some things should be bronzed so we don't ever lose them
| and this is one of them.
| throwaway19972 wrote:
| > Once they add that in, then they're just going to have
| late-released, poorly made software that happens to have an
| LLM in it somewhere.
|
| Hell, you can even predict where by looking at the lowest-
| paid people in the organization. Work that isn't valued today
| won't start being valued tomorrow.
| zonethundery wrote:
| It is too bad Erik Naggum did not live to see the AI era.
| paxys wrote:
| But see the company selling the AI "solution" made money, which
| is the entire point.
| satisfice wrote:
| Making money is the Silicon Valley Bro equivalent of
| mathematical proof.
|
| Sorry, not making money-- I meant attracting investment.
|
| AI tech is generally undertested and experimental, yet
| marketed as solid and reliable.
| rachofsunshine wrote:
| Speaking as someone who is very critical of the general
| "raise infinite money and grow grow grow" approach to
| startups (it's a big part of why my company is bootstrapped
| [1]), I think there's a really important difference between
| _making_ money and _raising_ money.
|
| _Raising_ money is about convincing investors that you can
| solve someone else's problem. Or even more abstractly,
| about convincing investors that you can convince other
| investors that you can convince other investors that you
| can solve someone else's problem. That's enough layers of
| indirection that signaling games start to overtake concrete
| value a lot of the time, and that's what gets you your
| Theranoses and your FTXes and the like. You get into "the
| market can remain irrational longer than you can remain
| solvent", or its corollary, "the market can remain
| irrational long enough for you to exit before it wakes up".
|
| But _making_ money means you are solving a problem _for
| your customer_, or at least, that your customer thinks
| you're solving a problem with no additional layers of
| indirection. And if we take "making money" to mean "making
| a profit", it also means you're solving their problem at
| less cost than the amount they're willing to pay to solve
| it. You are actually creating net value, at least within a
| sphere limited to you and your customer (externalities, of
| course, are a whole other thing, but those are just as
| operative in non-profitable companies).
|
| I think this is one of the worst things about the way
| business is done today. Doing business, sustainably and
| profitably, is an excellent way to keep yourself honest and
| force your theories to actually hold up in a competitive
| market. It's a good thing for you and for your users. But
| business has become so much about gathering sufficient
| capital to do wildly anticompetitive things and/or buy
| yourself preferential treatment that we're losing that
| regulating force of honesty.
|
| [1] see my HN profile for more on that if you care
| o11c wrote:
| Incidentally, this just led me to the first case I've found of
| a wide variety of AIs actually generating the same answer
| consistently. They almost always said a close synonym for "now
| I have a solution".
|
| For some AIs, if you ask them to complete it for "Java" and/or
| "Regexes" first, then they give realistic answers for "AI". But
| others (mostly, online commercial ones) are just relentlessly
| positive even then.
|
| Prompting to complete it for "Python" usually remains positive
| though.
| Stoids wrote:
| We aren't good at creating software systems from reliable and
| knowable components. A bit skeptical that the future of software
| is making a Rube Goldberg machine of black box inter-LLM
| communication.
| TZubiri wrote:
| I'm pretty sure this is a satire post
| pizza wrote:
| Here's a practical example in this vein, but much simpler - if
| you're trying to answer a question with an LLM and have it
| answer in json format within the same prompt, for many models
| the accuracy is worse than just having it answer in plaintext.
| The reason is that you're now having to place a bet that the
| distribution of json strings it's seen before meshes nicely
| with the distribution of answers to that question.
|
| So one remedy is to have it just answer in plaintext, and
| then use a second, more specialized model that's specifically
| trained to turn plaintext into json. Whether this chain of
| models works better than just having one model all depends on
| the distribution match penalties accrued along the chain in
| between.
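|
| A minimal sketch of the two-stage version (model names and the
| chat() helper are placeholders, not a specific vendor's API):
|
|     import json
|
|     def chat(model: str, prompt: str) -> str:
|         raise NotImplementedError  # plug in your client
|
|     def answer_then_format(question: str) -> dict:
|         # Stage 1: unconstrained plaintext answer.
|         plaintext = chat("big-general-model", question)
|         # Stage 2: a format-specialized model only does JSON.
|         raw = chat("json-formatter-model",
|                    "Rewrite as JSON with key 'answer':\n"
|                    + plaintext)
|         return json.loads(raw)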
| from-nibly wrote:
| So developing solutions with AI is like trying to build
| stuff with Family Feud.
| paulddraper wrote:
| Not only is it the future of software, it's the past and
| present as well.
| dsv3099i wrote:
| As someone who makes HW for a living, please do make more Rube
| Goldberg machines of black box LLMs. At least for a few more
| years until my kids are out of college. :)
| mmcdermott wrote:
| I could see software having a future as a Rube Goldberg machine
| of black box AIs, if hardware is cheap enough and the AIs are
| good enough. There was a scifi novel (maybe "A Fire Upon the
| Deep"?) where there was no need to write software because AI
| could cobble any needed solution together by using existing
| software and gluing it together. Throwing cycles at deepening
| layers was also something that Paul Graham talked about in the
| hundred year language (https://paulgraham.com/hundred.html).
|
| Now, whether hardware is cheap enough or AI is smart enough is
| an entirely different question...
| l5870uoo9y wrote:
| RAG doesn't necessarily give the best results. Essentially it
| is a technically elegant way to add semantic context to the
| prompt (for many use cases it is over-engineered). I used to
| offer RAG-based SQL query generation on SQLAI.ai, and while I
| might introduce it again, for most use cases it was overkill
| and even made working with the SQL generator unpredictable.
|
| Instead I implemented a low-tech "RAG", or "data source rules".
| It's a list of general rules you can attach to a particular
| data source (i.e. database). Rules are included in the
| generations and work great. Examples are "Wrap tables and
| columns in quotes" or "Limit results to 100". It's simple and
| effective - I can execute the generated SQL against my DB for
| insights.
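|
| In code it's about this simple (a sketch, not SQLAI.ai's
| actual implementation; the names are illustrative):
|
|     DATA_SOURCE_RULES = {
|         "analytics_db": [
|             "Wrap tables and columns in quotes.",
|             "Limit results to 100.",
|         ],
|     }
|
|     def build_prompt(source: str, question: str) -> str:
|         # Prepend the source's rules to every generation.
|         rules = "\n".join(
|             "- " + r for r in DATA_SOURCE_RULES.get(source, []))
|         return ("Rules for this database:\n" + rules +
|                 "\n\nWrite a SQL query to answer: " + question)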
| trhway wrote:
| Reminds me how, a few years ago, Tesla (if I remember right,
| Karpathy) described that in Autopilot they started to extract
| a third model and use it to explicitly apply some static
| rules.
| simonw wrote:
| What do you mean by "RAG SQL query generations"? Were you
| searching for example queries similar to the questions the
| user's asked and injecting those examples into the prompt?
| com2kid wrote:
| RAG without preprocessing is almost useless, because unless you
| are careful, RAG gives you all the weaknesses of vector DBs,
| with all the weaknesses of LLMs!
|
| The easiest example I can come up with is: imagine you just
| dump a restaurant menu into a vector DB. The menu is from a
| hipster restaurant, and instead of having "open hours" or
| "business hours" the menu says "Serving deliciousness between
| 10:00am and 5:00pm".
|
| Naive RAG queries are going to fail miserably on that menu.
| "When is the restaurant open?" "What are the business hours?"
|
| Longer context lengths are actually the solution to this
| problem: when the content is small enough and the potential
| for ambiguity is high enough, LLMs with everything in context
| are the better tool.
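|
| If you do want RAG to work on that menu, the preprocessing
| step looks something like this (llm() is a stand-in for any
| chat client; the prompt is made up):
|
|     def llm(prompt: str) -> str:
|         raise NotImplementedError
|
|     def preprocess_chunk(chunk: str) -> str:
|         # Normalize cute phrasing into literal language before
|         # embedding, so "Serving deliciousness between 10:00am
|         # and 5:00pm" also indexes as "Business hours: ...".
|         canonical = llm("Rewrite in plain, literal language, "
|                         "expanding any cute phrasing:\n" + chunk)
|         # Index both forms so the original wording isn't lost.
|         return chunk + "\n" + canonical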
| fsndz wrote:
| Semantic search is a powerful tool that can greatly improve the
| relevance and quality of search results by understanding the
| intent and contextual meaning of search terms. However, it's
| not without its limitations. One of the key challenges with
| semantic search is the assumption that the answer to a query is
| semantically similar to the query itself. This is not always
| the case, and it can lead to less than optimal results in
| certain situations. https://fsndzomga.medium.com/the-problem-
| with-semantic-searc...
| mvdtnz wrote:
| Don't post AI slop on HN comments please.
| keeganpoppen wrote:
| YES (although i'm hesitant to even say anything because on some
| level this is tightly-guarded personal proprietary knowledge from
| the trenches that i hold quite dear). why aren't you spinning off
| like 100 prompts from one input? it works great in a LOT of
| situations. better than you think it does/would, no matter your
| estimation of its efficacy.
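|
| One concrete shape of this (just a sketch of fan-out plus
| majority vote; llm() and the prompt variants are made up):
|
|     from collections import Counter
|
|     def llm(prompt: str) -> str:
|         raise NotImplementedError
|
|     VARIANTS = [
|         "Classify the sentiment of: {x}",
|         "Is this positive, negative, or neutral? {x}",
|         # ...dozens more paraphrases...
|     ]
|
|     def fan_out(x: str) -> str:
|         # Same input, many prompts; keep the majority answer.
|         votes = Counter(llm(v.format(x=x)).strip().lower()
|                         for v in VARIANTS)
|         return votes.most_common(1)[0][0]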
| pizza wrote:
| 100 prompts doing what? Something like more selective, focused
| extraction of structured fields?
| hggigg wrote:
| I wish this was funny but it's not. We are doing this now. It has
| become like "because it's got electrolytes" in our org.
| glial wrote:
| Well, at least it's not blockchain or Kubernetes.
| pqdbr wrote:
| The blockchain hype train was ridiculous. Textbook "solution
| looking for a problem" that every consultant was trying to
| push to every org, which had to jump onboard simply because
| of FOMO.
| fluoridation wrote:
| I don't think that's quite right. It was businesses who
| were jumping at consultants to see how they could stuff a
| blockchain into their pipeline to do the same thing they
| were already doing, all so they could put "now with
| blockchain!" on the website.
| hggigg wrote:
| We did those already. So still not funny :)
| mvdtnz wrote:
| We are truly in the stupidest phase of software engineering yet.
| com2kid wrote:
| > This is where compound systems are a valuable framework because
| you can break down the problem into bite-sized chunks that
| smaller LLMs can solve.
|
| Just a reminder that smaller fine-tuned models are just as
| good as large models at solving the problems they are trained
| to solve.
|
| > Oftentimes, a call to Llama-3 8B might be enough if you need to
| a simple classification step or to analyze a small piece of text.
|
| Even 3B param models are powerful nowadays, especially if you
| are willing to put the time into prompt engineering. My
| current side project is simulating a small fantasy town using
| a tiny locally hosted model.
|
| > When you have a pipeline of LLM calls, you can enforce much
| stricter limits on the outputs of each stage
|
| Having an LLM output a number from 1 to 10, or "error", makes
| your schema really hard to break.
|
| All you need to do is parse the output, and if it isn't a
| number from 1 to 10... just assume it is garbage.
|
| A system built up like this is much more resilient, and also
| honestly more pleasant to deal with.
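|
| The guard is a few lines (a sketch; adjust the retry or
| fallback behavior to taste):
|
|     def parse_score(raw: str) -> int | None:
|         # Return 1..10, or None for anything else (garbage).
|         text = raw.strip()
|         if text.isdigit() and 1 <= int(text) <= 10:
|             return int(text)
|         return None  # retry, fall back, or log
|
|     assert parse_score(" 7 ") == 7
|     assert parse_score("I'd say 7/10, maybe 8") is None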
| fsndz wrote:
| More AI sauce won't hurt, right? Meanwhile, we still have to
| solve the ragallucination problem. https://www.lycee.ai/blog/rag-
| ragallucinations-and-how-to-fi...
| kylehotchkiss wrote:
| Use Moore's law to achieve unreal battery life and better
| experiences for users... or use Moore's law to throw more
| piles of abstractions on abstractions, where we end up with
| solutions like Electron or "I Duct Taped an AI on It".
|
| Reading through this, I could not tell if this was a parody or
| real. That robot image slopped in the middle certainly didn't
| help.
___________________________________________________________________
(page generated 2024-10-24 23:01 UTC)