[HN Gopher] Builder.ai did not "fake AI with 700 engineers"
       ___________________________________________________________________
        
       Builder.ai did not "fake AI with 700 engineers"
        
       Author : tanelpoder
       Score  : 50 points
       Date   : 2025-06-12 17:47 UTC (5 hours ago)
        
 (HTM) web link (newsletter.pragmaticengineer.com)
 (TXT) w3m dump (newsletter.pragmaticengineer.com)
        
       | cratermoon wrote:
       | Unnamed former employees of a dead company say company didn't
       | fake it. Film at 11.
        
         | alephnerd wrote:
         | I tend to trust Gergely Orosz (the writer of Pragmatic
          | Engineer). He validates sources and has a good track record on
         | reporting on the European tech scene and Engineering
         | Management.
         | 
         | His blog and newsletter are both fairly popular on HN.
        
         | senko wrote:
         | This was analyzed on HN a week or so ago:
         | https://news.ycombinator.com/item?id=44176241
         | 
         | The "700 engineers faking AI" claim seems to have been
         | sloppy[0] reasoning by an influencer, which spread like
         | wildfire.
         | 
         | [0] I won't attribute malice here, but this version was
         | certainly more interesting than the truth
        
         | mediaman wrote:
         | The original story doesn't make any sense. How would you fake
         | an "AI" agent coding by using people on the other side? Woudn't
         | it be...obvious? People cannot type code that fast.
         | 
         | What's your non-snarky theory about how this could possibly be
         | true?
        
           | ceejayoz wrote:
           | You claim you have a queue and it takes up to 24 hours for
           | your job to run?
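            | 
            | Something like this hypothetical shape, i.e. the API looks
            | like an async AI service, but the "queue" is a person (a
            | sketch with made-up names, not anything from Builder.ai's
            | actual stack):
            | 
            |   import uuid
            | 
            |   jobs = {}  # job_id -> {"prompt", "status", "result"}
            | 
            |   def submit(prompt: str) -> str:
            |       """Client-facing call: returns immediately with an id."""
            |       job_id = str(uuid.uuid4())
            |       jobs[job_id] = {"prompt": prompt, "status": "queued",
            |                       "result": None}
            |       return job_id
            | 
            |   def poll(job_id: str) -> dict:
            |       """Client polls; 'up to 24 hours' buys a human time."""
            |       return jobs[job_id]
            | 
            |   def human_completes(job_id: str, answer: str) -> None:
            |       """Filled in by a person, not a model."""
            |       jobs[job_id].update(status="done", result=answer)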
        
           | apwell23 wrote:
            | It was obviously not a prompt-and-get-response model like
            | ChatGPT.
        
       | wnevets wrote:
        | Are there people who actually believe that a user would enter a
        | text prompt and then a human programmer would generate the code?
        
         | tomasphan wrote:
         | Yes, 90% of people with no tech background reading the news
        
         | TiredOfLife wrote:
         | Majority of HN commenters
        
         | apwell23 wrote:
         | that was not the flow
        
         | dd_xplore wrote:
         | Unfortunately a lot of people!!
        
         | hluska wrote:
          | Builder.ai had a totally different flow, but yeah, when a
          | boring story and an exciting one compete to explain the same
          | event, a very large percentage of people will run with the
          | exciting one. It's like the "death tax" in US political
          | history - the US has never had a death tax, but it's way more
          | exciting to call it a death tax than an estate tax. Only now,
          | instead of media being the primary disseminator of spin, we
          | have people sharing exciting stories on social media instead
          | of boring stories about building an internal Zoom and
          | accounting issues.
         | 
         | Then social animals kick in, likes pour in and more people
         | share. Social media has created a world where an exciting lie
         | can drown out boring truth for a large percentage of people.
        
         | DebtDeflation wrote:
         | My assumption when the story broke was that the 700 engineers
         | were using various AI tools (Replit, Cursor, ChatGPT, etc.) to
         | create code and documentation and then stitching it all
         | together somewhat manually. Sort of like that original Devin
         | demo where AI was being used at each step but there was a ton
         | of manual intervention along the way and the final video was
         | edited to make it seem as if the whole thing ran end to end
         | fully automated all from the initial prompt.
        
         | TuringNYC wrote:
          | I worked with an "AI data vendor" where you'd put in a query
          | and "the AI gave you back a dataset", but it usually took
          | 24hrs, so it was obvious they had humans pulling the data. The
          | company still purchased a data plan. It happens; in this case,
          | they did have a unique dataset, though.
        
       | mellosouls wrote:
       | Kudos to the author for the update - and also to others including
       | @dang for calling it out at the time:
       | 
       | https://news.ycombinator.com/item?id=44169759
       | 
       |  _(Builder.ai Collapses: $1.5B 'AI' Startup Exposed as
       | 'Indians'?, 367 points, 267 comments)_
        
       | tomasphan wrote:
        | I don't believe that their business entirely depended on 700
        | actual humans, any more than I believe that to be true for the
        | Amazon store. However, both probably relied on humans in the
        | loop, which is not sustainable at scale.
        
         | fragmede wrote:
         | at what scale though? as long as money line go up faster than
         | cost line go up, it's fine?
        
         | Legend2440 wrote:
          | If you read the article, they had two separate products: one
          | was 700 actual humans, and the other was an LLM-powered coding
          | tool.
        
       | gamblor956 wrote:
       | LLMs are all fake AI. As the recently released Apple study
       | demonstrates, LLMs don't reason, they just pattern match. That's
       | not "intelligence" however you define it because they can only
       | solve things that are already within their training set.
       | 
       | In this case, it would have been better for the AI industry if it
       | had been 700 programmers, because then the rest of the industry
       | could have argued that the utter trash code Builder.ai generated
       | was the result of human coders spending a few minutes haphazardly
       | typing out random code, and not the result of a specialty-trained
       | LLM.
        
         | aeve890 wrote:
         | >As the recently released Apple study demonstrates, LLMs don't
         | reason, they just pattern match
         | 
          | Hold on a minute, I was under the impression that "reasoning"
          | was just a marketing buzzword, the same as "hallucinations",
          | because how tf would anyone expect GPUs to "reason" and
          | "hallucinate" when even neurology/psychology doesn't have a
          | strict definition of those processes.
        
           | jacobr1 wrote:
           | No, the definitions are very much up for debate, but there is
           | an actual process here. "Reasoning" in this case means having
           | the model not just produce whatever output is requested
           | directly, but also spend some time writing out its thoughts
            | about how to produce that output. Early versions of this were
           | just prompt engineering where you ask the model to produce
           | its "chain of thought" or "work step by step" on how to
           | approach the problem. Later this was trained into the model
            | directly with traces of this intermediate thinking,
           | especially for multistep problems, without the need for
           | explicit prompting. And then architecturally these models now
           | have different ways to determine when to stop "reasoning" to
           | skip to generating actual output.
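            | 
            | For concreteness, here is a rough sketch of the prompt
            | engineering flavour, using the OpenAI Python client purely
            | as an illustration (the model name and prompt wording are
            | placeholders, not anything Builder.ai shipped):
            | 
            |   from openai import OpenAI
            | 
            |   client = OpenAI()  # needs OPENAI_API_KEY set
            | 
            |   def ask(prompt: str) -> str:
            |       resp = client.chat.completions.create(
            |           model="gpt-4o-mini",  # placeholder model
            |           messages=[{"role": "user", "content": prompt}],
            |       )
            |       return resp.choices[0].message.content
            | 
            |   q = ("A bat and a ball cost $1.10 in total and the bat"
            |        " costs $1.00 more than the ball. What does the ball"
            |        " cost?")
            | 
            |   # Direct: the model answers immediately, right or wrong.
            |   print(ask(q))
            | 
            |   # Chain of thought: ask for intermediate steps first. Later
            |   # "reasoning" models bake this in via training on such
            |   # traces instead of relying on the prompt.
            |   print(ask(q + "\nThink step by step, then give the final"
            |                 " answer on its own line."))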
           | 
           | I don't have a strict enough definition to debate if this
           | reasoning is "real" - but from personal experience it
           | certainly appears to be performing something that at least
           | "looks" like inductive thought, and leads to better answers
           | than prior model generations without reasoning/thinking
           | enabled.
        
             | codr7 wrote:
             | Reasoning means what reasoning always meant.
             | 
             | Selling an algorithm that can write a list of steps as
             | reasoning is bordering on fraud.
             | 
             | It's not uncommon that they guess the right solution, and
             | then "reason" their way out of it.
        
             | klank wrote:
             | It's gradient descent. Why are we surprised when the
             | answers get better the more we do it? Sometimes you're
             | stuck in a local max/minima, and you hallucinate.
             | 
             | Am I oversimplifying it? Is everybody else over-mystifying
             | it?
        
         | throwaway314155 wrote:
         | > As the recently released Apple study demonstrates, LLMs don't
         | reason
         | 
         | Where is everyone getting this misconception? I have seen it
          | several times. First off, the study doesn't even try to assess
          | whether or not these models use "actual reasoning" - that's
          | outside of its scope. It merely examines how effective
          | thinking/reasoning _is_ at producing better results. It found
          | that - indeed - reasoning improves performance. But the crucial
          | result is that it only improves performance up to a certain
          | difficulty cliff, at which point thinking makes no discernible
          | difference due to a model collapse of sorts.
         | 
         | It's important to read the papers you're using to champion your
         | personal biases.
        
         | UebVar wrote:
         | > because they can only solve things that are already within
         | their training set.
         | 
          | That is just plain wrong, as anybody who has spent more than
          | 10 minutes with an LLM within the last 3 years can attest.
          | Give it a try, especially if you care to have an opinion on
          | them. Ask an absurd question (that can, in principle, be
          | answered) that nobody has asked before and see how well it
          | generalizes. The hype is real.
          | 
          | I'm interested in which study you're referring to, because I'm
          | interested in its methods and what it actually found.
        
           | spion wrote:
           | What you think is an absurd question may not be as absurd as
           | it seems, given the trillions of tokens of data on the
           | internet, including its darkest corners.
           | 
            | In my experience, it's better to simply try using LLMs in
            | areas where they don't have a lot of training data (e.g.
            | reasoning about the behaviour of Terraform plans). It's not a
            | hard cutoff of _only_ being able to reason about already-solved
            | things, but it's not too far off as a first approximation.
           | 
            | The researchers took existing known problems and
            | parameterised their difficulty [1]. While most of these are
            | not by any means easy for humans, the interesting observation
            | to me was that the N at which models started failing was not
            | proportional to the complexity of the problem, but correlated
            | more with how commonly solution "printouts" for that size of
            | the problem are encountered in the training data. For example,
            | "Towers of Hanoi", which has printouts of solutions for a
            | variety of sizes, went to a very large number of steps N,
            | while the river crossing, which is almost entirely absent from
            | the training data for N larger than 3, failed above pretty
            | much that exact number.
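            | 
            | As a toy illustration of that size parameter (not the paper's
            | actual evaluation harness): the optimal Towers of Hanoi
            | solution has 2^n - 1 moves, so the "printout" a model has to
            | reproduce grows exponentially with n.
            | 
            |   # Recursive Towers of Hanoi: returns the optimal move list.
            |   def hanoi(n, src="A", aux="B", dst="C"):
            |       if n == 0:
            |           return []
            |       return (hanoi(n - 1, src, dst, aux)
            |               + [(src, dst)]
            |               + hanoi(n - 1, aux, src, dst))
            | 
            |   for n in (3, 7, 10):
            |       print(n, len(hanoi(n)))  # 7, 127 and 1023 moves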
           | 
           | [1]: https://machinelearning.apple.com/research/illusion-of-
           | think...
        
             | CSSer wrote:
             | It doesn't help that thanks to RLHF, every time a good
             | example of this gains popularity, e.g. "How many Rs are in
             | 'strawberry'?", it's often snuffed out quickly. If I worked
             | at a company with an LLM product, I'd build tooling to look
             | for these kinds of examples in social media or directly in
             | usage data so they can be prioritized for fixes. I don't
             | know how to feel about this.
             | 
             | On the one hand, it's sort of like red teaming. On the
             | other hand, it clearly gives consumers a false sense of
             | ability.
        
           | jvanderbot wrote:
           | "The apple study" is being overblown too, but here it is:
           | https://machinelearning.apple.com/research/illusion-of-
           | think...
           | 
           | The crux is that beyond a bit of complexity the whole house
           | of cards comes tumbling down. This is trivially obvious to
           | any user of LLMs who has trained _themselves_ to use LLMs (or
           | LRMs in this case) to get better results ... the usual  "But
           | you're prompting it wrong" answer to any LLM skepticism.
           | Well, that's definitely true! But it's also true that these
           | aren't magical intelligent subservient omniscient creatures,
           | because that would imply that they would learn how to work
           | with _you_. And before you say  "moving goalpost" remember,
           | this is _essentially_ what the world thinks they are being
           | sold.
           | 
           | It can be both breathless hysteria _and_ an amazing piece of
           | revolutionary and useful technology at the same time.
           | 
           | The training set argument is just a fundamental
           | misunderstanding, yes, but you should think about the
           | contrapositive - can an LLM do well on things that are
           | _inside_ its training set? This paper does use examples that
           | are present all over the internet including solutions. Things
           | children can learn to do well. Figure 5 is a good figure to
           | show the collapse in the face of complexity. We've all seen
           | that when tearing through a codebase or trying to "remember"
           | old information.
        
             | tough wrote:
              | I think Apple published that study right before WWDC to
              | have an excuse not to ship foundation models bigger than 3B
              | locally, and to force you to go via their cloud for harder
              | "reasoning" tasks.
              | 
              | The APIs are still in beta so things are moving, but those
              | are my thoughts after playing with it; the paper makes much
              | more sense in that context.
        
         | ChrisMarshallNY wrote:
         | _> because they can only solve things that are already within
         | their training set_
         | 
         | I just gave up on using SwiftUI for a rewrite of a backend
         | dashboard tool.
         | 
          | The LLM didn't give up. It kept suggesting wilder and less
          | stable ideas, until I realized that this was a rabbit hole full
         | of misery, and went back to UIKit.
         | 
         | It wasn't the LLM's fault. SwiftUI just isn't ready for the
         | particular functionality I needed, and I guess that a day of
          | watching ChatGPT get more and more desperate saved me a lot of
         | time.
         | 
         | But the LLM didn't give up, which is maybe ot-nay oo-tay ight-
         | bray.
         | 
         | https://despair.com/cdn/shop/files/stupidity.jpg
        
         | meowface wrote:
         | AI skepticism is like a religion at this point. Weird it's so
         | prominent on a tech site.
         | 
         | (The Apple paper has had many serious holes poked in it.)
        
           | b00ty4breakfast wrote:
           | Well, if the thing is truly capable of reason, then we have
           | an obligation to put the kibosh on the entire endeavor
           | because we're using a potentially intelligent entity as slave
           | labor. At best, we're re-inventing factory farming and at
           | worst we're re-inventing chattel slavery. Neither of those
           | situations is something I'm personally ok with allowing to
            | continue.
        
             | klank wrote:
             | I concur.
             | 
              | I also find the assumption that tech-savvy individuals
              | would inherently be in favor of what we currently call AI
              | to itself be weird. Unfortunately, I feel as though being
              | knowledgeable or capable within an area is conflated with
              | an over-acceptance of that area.
             | 
             | If anything, the more I've learned about technology, and
             | the more experienced I am, the more fearful and cautious I
             | am with it.
        
       | alerter wrote:
       | > Builder hired 300 internal engineers and kicked off building
       | internal tools, all of which could have simply been purchased
       | 
       | Tempted to say there was a bit of corruption here, crazy
       | decision. Like someone had connections to the contractor
       | providing all those devs.
       | 
       | otoh they were an "app builder" company. Maybe they really wanted
       | to dogfood.
        
         | alephnerd wrote:
          | A similar thing happened at Uber before the 2021 re-org. At
          | one point they had 3 competing internal chat apps, from what
          | I've heard from peers working there, and having previously
          | worked for a vendor of Uber's, I noticed a significant amount
          | of disjointedness in their environment (it seemed very
          | EM-driven, with no overarching product vision).
         | 
         | Ofc, Gergely might have some thoughts about that ;)
        
       | quantadev wrote:
       | I always knew this story was fake. Even if you have a trillion
       | expert developers it would still be impossible to get fast enough
       | answers to "Fake an LLM". Humans obviously aren't
       | _parallelizable_ like that.
        
       | firesteelrain wrote:
       | " building internal versions of Slack, Zoom, JIRA, and more..."
       | 
       | Did they really do this or customize Jira schemas and workflows
        | for example?
        
       | Fraterkes wrote:
       | The headline as stated is categorically false, buuuut... I think
       | it's pretty salient that a company called "Builder.ai" only had
        | 15 engineers working on actual AI and actually mostly functioned
        | as an outsourcing intermediary for 500-1000 engineers (i.e., the
       | builders). When it comes to these viral misunderstandings, you
       | kind of reap what you sow.
        
       | stego-tech wrote:
       | > Builder hired 300 internal engineers and kicked off building
       | internal tools, all of which could have simply been purchased
       | 
       | Dear god, _PLEASE_ hire an actual Enterprise IT professional
       | early in your startup expansion phase. A single competent EIT
       | person (or dinosaur like me) could have - if this story is true -
       | possibly saved the whole startup by understanding what's
       | immediately needed versus what's nice-to-have, what should be
       | self-hosted versus what should be XaaS, stitching everything
       | together to reduce silos, and ensuring every cent is not just
       | accounted for but wisely invested in future success.
       | 
       | Even if the rest of your startup isn't "worrying about the
       | money", your IT and Finance people should _always_ be worried
       | about the money.
        
       ___________________________________________________________________
       (page generated 2025-06-12 23:00 UTC)