[HN Gopher] Tenstorrent and the State of AI Hardware Startups
       ___________________________________________________________________
        
       Tenstorrent and the State of AI Hardware Startups
        
       Author : zdw
       Score  : 210 points
       Date   : 2024-12-15 02:59 UTC (20 hours ago)
        
 (HTM) web link (irrationalanalysis.substack.com)
 (TXT) w3m dump (irrationalanalysis.substack.com)
        
       | alkh wrote:
       | Kudos to the Tenstorrent team for being so open to having a
       | discussion. I wish more companies would be like that, as
        | constructive criticism is very useful, especially for a startup.
        
       | Mathnerd314 wrote:
       | I almost forgot about that ARM-Qualcomm dispute, the fireworks
       | are only a few days away.
        
       | latchkey wrote:
       | I have a business renting high performance compute. I want to
       | democratize compute to make it more easily available on short
        | term basis to anyone who wants access. I talked to TT about a
        | year ago, and to one of their customers as well. I was really
       | impressed with everyone I talked to and I'd love to work with
       | them.
       | 
       | What I realized though is that as much as I'd like to buy and
       | host it and make it available, I'm not sure the economics or
       | interest are there yet. The focus today is so dedicated to Nvidia
       | that "fringe" gear is still just that. People who really want to
        | play with it will just buy it themselves. They've probably been
       | doing that with hardware for a long time.
       | 
       | So, it is a bit of a catch-22 for me right now. Hopefully
       | interest grows in these sorts of offerings and demand will pick
       | up. The world needs alternatives to just Nvidia dominance of all
       | of AI hardware and software.
        
         | choppaface wrote:
         | Can you sell or re-sell colo space to a handful of customers
         | who might put TT or other "weird" hardware there? Doesn't
         | scale, but it hedges your own business. Requires the right
         | customer though, like somebody who might buy Nervana / Xeon Phi
         | but then buy NVidia from you when it blows up.
        
           | latchkey wrote:
            | Kind of. It is easier for us to take on the capex/opex
            | (meaning buy and run equipment) and then rent it out. I have
            | the
           | buy and run equipment) and then rent it out. I have the
           | backing to make large investments without too much drama. We
           | can do that with any hardware as long as we have longer (1-2
           | year) contracts in place. We have the space/power available
           | in a top tier data center (Switch).
           | 
           | https://hotaisle.xyz/cluster/
        
         | kouteiheika wrote:
         | > Hopefully interest grows in these sorts of offerings and
         | demand will pick up.
         | 
          | Well, looking at their (as far as I can see, highest-end)
          | accelerator, the n300s, we get:
         | 
         | - 24GB of memory
         | 
         | - 576GB/s of memory bandwidth
         | 
         | - $1400
         | 
         | As a hobbyist this is still not compelling enough to get
         | excited and port my software - same amount of memory as a
          | 4090/3090, half the bandwidth of a 4090/3090, slightly cheaper
         | (than a 4090), more expensive (than a 3090), much worse
         | software support. Why would I buy it over NVidia? This might be
         | more compelling to bigger fish customers who would buy
         | thousands of these (so then the lower price makes a
         | difference), but you really need small fish people to use it
         | too if you want to achieve good, widespread software support.
         | 
         | However if they'd at least double the amount of memory at the
         | same price, now we'd be talking...
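(A back-of-the-envelope check on why the bandwidth figure matters here: single-stream LLM decoding is usually memory-bandwidth-bound, since each generated token has to stream every weight from memory once. A minimal sketch using the 576 GB/s figure from the comment above; the 4090 bandwidth and the model size are assumptions:)

```python
# Rough ceiling on single-stream decode speed for a memory-bandwidth-bound
# LLM: each token requires reading all weights once, so
#   tokens/s <= memory_bandwidth / model_size_in_bytes.
# Real throughput is lower (compute, KV cache reads, overheads).

def max_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode tokens/s, counting only weight reads."""
    return bandwidth_gb_s / model_size_gb

# Assumed figures: a ~13 GB model (e.g. 13B params at 8-bit);
# n300s at 576 GB/s vs. an RTX 4090 at roughly 1008 GB/s.
for name, bw in [("n300s", 576.0), ("RTX 4090", 1008.0)]:
    print(f"{name}: at most ~{max_tokens_per_sec(bw, 13.0):.0f} tok/s")
```

By this crude measure the n300s tops out at a bit over half the 4090's decode ceiling for the same model, which matches the "half the bandwidth" complaint above.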
        
           | A_D_E_P_T wrote:
           | Yeah, clearly the low-hanging fruit is simply to offer more
           | memory.
           | 
           | In a recent thread, just about everybody was talking about
           | how Intel should have released its "Battlemage" line of cards
           | with 48GB+ and how they'd actually have a very compelling
           | offering on their hands had they done so.
           | https://news.ycombinator.com/item?id=42308590
           | 
           | Missed opportunities.
           | 
           | That said, I know some of the Tenstorrent guys and they're
           | extremely smart and capable. They'll figure it out, and the
           | company is probably going to 10x.
        
             | kouteiheika wrote:
             | > Yeah, clearly the low-hanging fruit is simply to offer
             | more memory.
             | 
             | Yep. And just to be clear, this isn't necessarily a
             | strategy to directly make money, as the market for people
             | running these things locally is probably not very big (as a
             | lot of people in the Intel thread said). But it's a
             | strategy to beat the CUDA monopoly and get everyone and
             | their dog to support your hardware with their software. I
             | know I would be porting my software to their hardware if it
             | was actually compelling for small fish, and I know plenty
             | of other people who would too.
             | 
             | AMD also somewhat falls into this trap. They're slightly
             | cheaper than NVidia, and their hardware is roughly good
              | enough, but because their software stack and their hardware
              | support suck (just look at their ROCm compatibility list,
             | it's a complete joke with, like, 3 models of GPUs, vs
             | NVidia where I can use any GPU from the last 8 years) no
             | one bothers. Being good enough is not good enough, you need
             | to offer some sort of a unique selling point for people to
             | tolerate the extra software issues they'll have with your
             | hardware.
        
               | Workaccount2 wrote:
               | If someone built a mediocre GPU (~4070 level) with 48GB
               | or 96GB, then the community would build the software
               | stack for you.
               | 
               | Granted, you would not own that software and it would be
               | ported to other cards in the future, but if you are
               | trying to topple the king (nvidia) it would be a powerful
               | strat.
        
         | steeve wrote:
         | Look at what we (ZML) are doing: https://github.com/zml/zml
         | 
         | Any model, any hardware, zero compromise. TT support is on the
         | roadmap and we talk to them.
        
           | thangngoc89 wrote:
            | Do you have any plans to allow conversion from PyTorch to
            | your format?
        
       | C-programmer wrote:
       | > I genuinely believe Groq is a fraud. There is no way their
       | private inference cloud has positive gross margins.
       | 
       | > Llama 3.1 405B can currently replace junior engineers
       | 
       | I'd like more exposition on these claims.
        
         | refulgentis wrote:
         | Today, I wrote a full YouTube subtitle downloader in Dart. 52
         | minutes from starting to google anything about it, to full
         | implementation and tests, custom formatting any of the 5
         | obscure formats it could be in to my exact whims. Full coverage
         | of any validation errors via mock network responses.
         | 
         | I then wrote a web AudioWorklet for playing PCM in 3 minutes,
          | which conformed to the same interface as my Mac/iOS/Android
         | versions, ex. Setting sample rate, feedback callback, etc. I
         | have no idea what an AudioWorklet is.
         | 
         | Two days ago, I stubbed out my implementation of OpenAI's web
         | socket based realtime API, 1400 LOC over 2 days, mostly by hand
         | while grokking and testing the API. In 32 minutes, I had a
         | brand spanking new batch of code, clean, event-based
         | architecture, 86% test coverage. 1.8 KLOC with tests.
         | 
          | In all of these cases, most of what I needed to do was drop
          | in code files and say "nope, wrong" a couple of times to
          | Sonnet, and say "why are you violating my service contract
          | and only providing an example solution" to o1.
         | 
         | Not llama 3.1 405B specifically, I haven't gone to the trouble
         | of running it, but things turned some sort of significant
         | corner over the last 3 months, between o1 and Sonnet 3.5.
          | Mistakes are rare. It's believable 405B is on that scale; IIRC
         | went punch for punch with the original 3.5 Sonnet.
         | 
          | But I find it hard to believe a Google L3, and a third of
          | L4s (read: new hires, or those who survived 3 years), are
          | that productive and sending code out for review at a fifth
          | of that volume, much less on demand.
         | 
         | So insane-sounding? Yes.
         | 
         | Out there? Probably, I work for myself now. I don't have to
         | have a complex negotiation with my boss on what I can use and
         | how. And I only saw this starting ~2 weeks ago, with full o1
         | release.
         | 
          | Wrong? Shill? Dilettante?
         | 
         | No.
         | 
         | I'm still digesting it myself. But it's real.
        
           | nightski wrote:
           | Most software is not one off little utilities/scripts,
           | greenfield small projects, etc... That's where LLMs excel,
           | when you don't have much context and can regurgitate
           | solutions.
           | 
           | It's less to do with junior/senior/etc.. and more to do with
           | the types of problems you are tackling.
        
             | spiderfarmer wrote:
             | Most software is simple HTML.
        
               | mhh__ wrote:
               | This isn't where the leverage is though
        
               | spiderfarmer wrote:
               | No, but AI will replace a lot of web developers.
        
             | refulgentis wrote:
             | This is a 30KLOC 6 platform flutter app that's, in this
             | user story, doing VOIP audio, running 3 audio models on-
             | device, including in your browser. A near-replica of the
             | Google Assistant audio pipeline, except all on-device.
             | 
             | It's a real system, not kindergarten "look at the React
             | Claude Artifacts did, the button makes a POST request!"
             | 
             | The 1500 loc websocket / session management code it
             | refactored and tested touches on nearly every part of the
             | system (i.e. persisting messages, placing search requests,
             | placing network requests to run a chat flow)
             | 
             | Also, it's worth saying this bit a bit louder: the "just
             | throwing files in" I mention is key.
             | 
             | With that, the quality you observed being in reverse is the
             | distinction: with o1 thinking, and whatever Sonnet's magic
             | is, there's a higher payoff from working a _larger_
             | codebase.
             | 
             | For example, here, it knew exactly what to do for the web
             | because it already saw the patterns iOS/Android/macOS
             | shared.
             | 
             | The bend I saw in the curve came from being ultra lazy one
             | night and seeing what would happen if it just had all the
             | darn files.
        
               | achierius wrote:
               | I've definitely noticed the opposite on larger codebases.
               | It's able to do magical things on smaller ones but really
               | starts to fall apart as I scale up.
        
               | noch wrote:
               | > This is a 30KLOC 6 platform flutter app [...] It's a
               | real system, not kindergarten "look at the React Claude
               | Artifacts did, the button makes a POST request!"
               | 
               | This is powerful and significant but I think we need to
               | ground ourselves on what a skilled programmer means when
               | he talks about solving problems.
               | 
               | That is, honestly ask: What is the level of skill a
               | programmer requires to build what you've described?
               | Mostly, imo, the difficulties in building it are in
               | platform and API details not in any fundamental
               | engineering problem that has to be solved. What makes web
               | and Android programming so annoying is all the
               | abstractions and frameworks and cruft that you end up
               | having to navigate. Once you've navigated it, you haven't
               | really solved anything, you've just dealt with obstacles
               | other programmers have put in your way. The solutions are
               | mostly boilerplate-like and the code I write is glue.
               | 
               | I think the definition of "junior engineer" or "simple
               | app" will be defined by what LLMs can produce and so, in
               | a way, unfortunately, the goal posts and skill ceiling
               | will keep shifting.
               | 
                | On the other hand, say we watch a presentation by the
               | Lead Programmer at Naughty Dog, "Parallelizing the
               | Naughty Dog Engine Using Fibers"[^0] and ask the same
               | questions: what level of skill is required to solve the
               | problems he's describing (solutions worth millions of
               | dollars because his product has to sell that much to have
               | good ROI):
               | 
               | "I have a million LOC game engine for which I need to
               | make a scheduler with no memory management for
               | multithreaded job synchronization for the PS4."
               | 
               | A lot of these guys, if you've talked to them, are often
               | frustrated that LLMs simply can't help them make headway
               | with, or debug, these hard problems where novel hardware-
               | constrained solutions are needed.
               | 
               | ---
               | 
               | [^0]: https://www.youtube.com/watch?v=HIVBhKj7gQU
        
               | refulgentis wrote:
               | It's been pretty hard, but if you reduce it to "Were you
               | using a framework, or writing one that needs to push the
               | absolute limits of performance?"...
               | 
               | ...I guess the first?...
               | 
               | ...But not really?
               | 
               | I'm not writing GPU kernels or operating system task
               | schedulers, but I am going to some pretty significant
               | lengths to be running ex. local LLM, embedding model,
               | Whisper, model for voice activity detection, model for
               | speaker counting, syncing state with 3 web sockets.
               | Simultaneously. In this case, Android and iOS are no
               | valhalla of vapid-stackoverflow-copy-pasta-with-no-
               | hardware-constraints, as you might imagine.
               | 
               | And the novelty is, 6 years ago, I would have targeted
               | iOS and prayed. Now I'm on every platform at top-tier
               | speeds. All that boring tedious scribe-like stuff that
               | 90% of us spend 80% of our time on, is gone.
               | 
               | I'm not sure there's very many people at all who get to
               | solve novel hardware-constrained problems these days, I'm
               | quite pleased to brush shoulders with someone who brushes
               | shoulders with them :)
               | 
                | Thus, it smacks more of no-true-scotsman than something I
               | can chew on. Productivity gains are productivity gains,
               | and these are no small productivity gain in a highly
               | demanding situation.
        
               | noch wrote:
               | > Thus, smacks more of no-true-scotsman than something I
               | can chew on.
               | 
               | I wasn't making a judgement about you or your work, after
               | all I don't know you. I was commenting within the context
               | of an app that you described for which an LLM was useful,
               | relative to the hard problems we'll need help with if we
               | want to advance technology (that is, make computers do
                | more powerful things and do them faster). _I have no idea
                | if you're a true Scotsman or not_.
               | 
               | Regardless: over the coming years we'll find out who the
               | true Scotsmen were, as they'll be hired to do the stuff
               | LLMs can't.
        
               | geerlingguy wrote:
               | The challenging projects I've worked on are challenging
               | not because slamming out code to meet requirements is
               | hard (or takes long).
               | 
               | It's challenging because working to get a stable set of
               | requirements requires a lot of communication with end
               | users, stakeholders, etc. Then, predicting what they
               | actually mean when implementing said requirements. Then,
               | demoing the software and translating their non-technical
               | understanding and comments back into new requirements
               | (rinse and repeat).
               | 
               | If a tool can help program some of those requirements
               | faster, as long as it meets security and functional
               | standards, and is maintainable, it's not a big deal
               | whether a junior dev is working with Stack Exchange or
               | Claude, IMO. But I do want that dev to understand the
               | code being committed, because otherwise security bugs and
               | future maintenance headaches creep in.
        
             | dheera wrote:
             | I think most software outside of the few Silicon Valleys of
             | the world is in fact a bunch of dirty hacks put together.
             | 
             | I fully believe recursive application of current SOTA LLMs
             | plus some deployment framework can replace most software
             | engineers who work in the cornfields.
        
             | riwsky wrote:
             | No true Botsman would replace a junior eng!
        
           | xbmcuser wrote:
            | I agree with you that they are improving. Not being a
            | programmer, I can't tell if the code itself has improved,
            | but as a user who uses ChatGPT or Google Gemini to build
            | scripts or TradingView indicators, I am seeing some big
            | improvements. Many times, wording it in better detail and
            | keeping it from going off on tangents results in working
            | code.
        
           | tonetegeatinst wrote:
            | yt-dlp has a subtitle option. To quote the documentation:
            | 
            |   --write-sub       Write subtitle file
            |   --write-auto-sub  Write automatically generated subtitle
            |                     file (YouTube only)
            |   --all-subs        Download all the available subtitles of
            |                     the video
            |   --list-subs       List all available subtitles for the
            |                     video
            |   --sub-format FORMAT  Subtitle format, accepts formats
            |                     preference, for example: "srt" or
            |                     "ass/srt/best"
            |   --sub-lang LANGS  Languages of the subtitles to download
            |                     (optional) separated by commas, use
            |                     --list-subs for available language tags
        
             | jiggawatts wrote:
             | Something I noticed is that as the threshold for doing
             | something like "write software to do X" decreases, the
             | tendency for people to go search for an existing product
             | and download it tends to zero.
             | 
             | There is a point where in some sense it is less effort to
             | just write the thing yourself. This is the argument against
             | micro-libraries as seen in NPM as well, but my point is
             | that the _threshold_ of complexity for  "write it yourself"
             | instead of reaching for a premade thing changes over time.
             | 
             | As languages, compilers, tab-complete, refactoring, and AI
             | assistance get better and better, eventually we'll reach a
             | point where the human race as a whole will be spitting out
             | code at an unimaginable rate.
        
             | zkry wrote:
             | This is a key point and one of the reasons why I think LLMs
             | will fall short of expectation. Take the saying "Code is a
             | liability," and the fact that with LLMs, you are able to
             | create so much more code than you normally would:
             | 
             | The logical conclusion is that projects will balloon with
             | code pushing LLMs to their limit, and this massive amount
             | is going to contain more bugs and be more costly to
             | maintain.
             | 
             | Anecdotally, supposedly most devs are using some form of AI
             | for writing code, and the software I use isn't magically
             | getting better (I'm not seeing an increased rate of
             | features or less buggy software).
        
               | trollbridge wrote:
               | My biggest challenges in building a new application and
                | maintaining an existing one are a lack of good unit tests,
               | functional tests, and questionable code coverage; lack of
               | documentation; excessively byzantine build and test
               | environments.
               | 
               | Cranking out yet more code, though, is not difficult (and
               | junior programmers are cheap). LLMs do truly produce code
               | like a (very bad) junior programmer: when trying to make
               | unit tests, it takes the easiest path and makes something
               | that passes but won't catch serious regressions.
               | Sometimes I've simply reprompted it with "Please write
               | that code in a more proper, Pythonic way". When it comes
               | to financial calculations around dates, date intervals,
               | rounding, and so on, it often gets things just ever-so-
               | slightly wrong, which makes it basically useless for
               | financial or payroll type of applications.
               | 
               | It also doesn't help much with the main bulk of my (paid)
               | work these days, which is migrating apps from some old
               | platform like vintage C#-x86 or some vendor thing like a
                | big pile of Google Apps Script, Jotform, Zapier, and so on
               | into a more maintainable and testable framework that
               | doesn't depend on subscription cloud services. So far I
               | can't find a way to make LLMs productive at all at that -
               | perhaps that's a good thing, since clients still see fit
               | to pay decently for this work.
        
               | refulgentis wrote:
               | I don't understand - why does the existence of a CLI tool
               | mean we're risking a grey goo situation if an LLM helps
               | produce Dart code for my production Flutter app?
               | 
               | My guess is you're thinking I'm writing duplicative code
               | for the hell of it, instead of just using the CLI tool -
               | no. I can't run arbitrary binaries, at all, on at least 4
               | of the 6 platforms.
               | 
               | Beyond that, that's why we test.
        
               | zkry wrote:
               | Apologies if it looked like I was singling out your
               | comment. It was more that those comments brought the idea
               | to mind that sheer code generation without skilled
               | thought directing it may lead to unintended negative
               | outcomes.
        
           | az226 wrote:
           | Show the code
        
             | JTyQZSnP3cQGa8B wrote:
             | That's why I'm annoyed: they never show the code.
        
               | refulgentis wrote:
               | You gotta lower your expectations. :) I didn't see the
               | comment till now, OP made it at 3:30 AM my time.
               | 
               | Here you go - https://pastebin.com/8zdMDEnG
        
               | IshKebab wrote:
               | I think they meant the code that the LLM generated.
        
               | refulgentis wrote:
               | Right: that is it.
        
               | IshKebab wrote:
               | This has clearly been heavily edited by humans.
        
               | refulgentis wrote:
               | I'm happy to provide whatever you ask for. With utmost
               | deference, I'm sure you didn't mean anything by it and
               | were just rushed, but just in case...I'd just ask that
               | you'd engage with charity[^3] and clarity[^2] :)
               | 
                | I'd also like to point out[^1] --- meaning, I gave it _my
                | original code_, in toto. So of _course_ you'll see ex.
                | comments. Not sure what else contributed to your
               | analysis, that's where some clarity could help me, help
               | you.
               | 
               | [^1](https://news.ycombinator.com/item?id=42421900) "I
               | stubbed out my implementation of OpenAI's web socket
               | based realtime API, 1400 LOC over 2 days, mostly by hand
               | while grokking and testing the API. In 32 minutes, I had
               | a brand spanking new batch of code, clean, event-based
               | architecture, 86% test coverage. 1.8 KLOC with tests."
               | 
               | [^2](https://news.ycombinator.com/newsguidelines.html)
               | "Be kind...Converse curiously; don't cross-examine."
               | 
               | [^3](https://news.ycombinator.com/newsguidelines.html)
               | "Please respond to the strongest plausible interpretation
               | of what someone says, not a weaker one that's easier to
               | criticize. Assume good faith."
        
               | Workaccount2 wrote:
               | I learned over 20 years ago to _never_ post your code
               | online for programmers to critique. Never. Unless you are
               | an absolute pro (at which point you wouldn't even be
               | asking for review), never do it.
        
           | lz400 wrote:
           | I don't understand what you guys are doing. For me sonnet is
           | great when I'm starting with a framework or project but as
           | soon as I start doing complicated things it's just wrong all
           | the time. Subtly wrong, which is much worse because it looks
           | correct, but wrong.
        
           | idiocache wrote:
           | Can you briefly describe your work flow? Are you exchanging
           | information with Sonnet in your IDE?
        
         | Workaccount2 wrote:
         | Not Llama but with Sonnet and O1 I wrote a bespoke android app
         | for my company in about 8 hours of work. Once I polish it a bit
         | (make a prettier UI), I'm pretty sure I could sell it to other
         | companies doing our kind of work.
         | 
         | I am not a programmer, and I know C and Python at about a 1 day
         | crash course level (not much at all).
         | 
          | However, with Sonnet I was able to be handheld all the way from
         | downloading android studio to a functional app written in
         | kotlin, that is now being used by employees on the floor.
         | 
         | People can keep telling themselves that LLMs are useless or
         | maybe just helpful for quickly spewing boilerplate code, but I
         | would heed the warning that this tech is only going to improve
          | and is already helping people forgo SWEs very seriously. Sears
         | thought the internet was a cute party trick, and that obviously
         | print catalogs were there to stay.
        
         | throwawaymaths wrote:
         | > groq
         | 
         | i went to a groq event and one of their engineers told me they
         | were running 7 racks!! of compute per (70b?) model. that was
         | last year so my memory could be fuzzy.
         | 
         | iirc, groq used to be making resnet-500? chips? the only way
         | such an impressive setup makes any kind of sense (my guess)
         | would be they bought a bunch of resnet chips way back when and
         | now they are trying to square peg in round hole that sunk cost
         | as part of a fake it till you make it phase. they certainly
         | have enough funding to scrap it all and do better... the
         | question is if they will (and why they haven't been able to
         | yet)
        
           | wmf wrote:
           | Yes, Groq requires hundreds or thousands of chips to load an
           | LLM because they didn't predict that LLMs would get as big as
           | they are. The second generation chip can't come soon enough
           | for them.
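(The chip-count arithmetic behind the "7 racks per model" claim is simple: if an accelerator keeps all weights in on-chip SRAM, the floor on chip count is model size divided by SRAM per chip. A sketch; the ~230 MB of SRAM per first-generation Groq chip is a commonly cited figure, and the quantization choices are assumptions:)

```python
import math

# Minimum number of chips when all weights must live in on-chip SRAM:
#   chips >= model_bytes / sram_bytes_per_chip
# (ignores KV cache, activations, and any replication, so real
# deployments need even more).

def min_chips(params_billions: float, bytes_per_param: int,
              sram_mb_per_chip: float) -> int:
    model_bytes = params_billions * 1e9 * bytes_per_param
    return math.ceil(model_bytes / (sram_mb_per_chip * 1e6))

# A 70B model with ~230 MB SRAM per chip (commonly cited for Groq's
# first-generation part):
print(min_chips(70, 1, 230))   # 8-bit weights -> 305 chips minimum
print(min_chips(70, 2, 230))   # fp16 weights  -> 609 chips minimum
```

Hundreds of chips for a single 70B model is consistent with the multi-rack deployments described above, and with why a denser second-generation chip matters so much to them.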
        
       | benreesman wrote:
       | It would be pretty awesome if AMD sold aggressively into the data
       | center market, or if NVIDIA sold aggressively into the
       | supercompute market.
       | 
       | All discussion on this is so much sports team fandom until Jensen
       | and Lisa stop pillow fighting on the biggest pile of money since
       | that 50 Cent cover and the highest margins since the Dutch East
       | India Company.
       | 
       | I just spent a few days getting FlashAttention and
       | TransformerEngine built with bleeding edge drivers and all the
       | wheels to zero typing install on three generations of NVIDIA and
        | shit (PS your build advertises working without ninja but fucks
       | up CMake INSTALL paths): this is the unassailable awesomeness
       | that no one can compete with? ptxas spinning a core for longer
       | than it takes to build Linux and sometimes crashing in the
       | process?
       | 
       | No, this is central committee grift sanctioned at the highest
       | levels to the applause of HN folks long NVDA.
        
         | benreesman wrote:
         | And if you really think about who is getting insulted here?
         | It's the NVIDIA and AMD driver and toolchain hackers. I read
         | diffs from those folks: they are polymath ballers. The average
         | HN comment including this one has more bugs in it than a patch
         | from an NVIDIA/AMD driver author.
         | 
         | But no, they're supposedly just too incompetent to run PyTorch
         | or whatever. Fuck that, the hackers kick ass.
         | 
         | This shit is broken because shareholders get more rent when
         | it's broken.
        
           | lhl wrote:
           | Having spent a fair amount of time over the past few years as
           | someone working on top of both Nvidia and AMD toolchains (and
            | having spent decades working up and down the stack, from
            | MCUs and mobile devices to webscale stacks), I have an
            | alternative take.
           | 
           | I'm smart and experienced enough, and the people working on
           | the toolchains are capable and even smarter, but these
           | systems are just extremely hard/complex. The hardware is
           | buggy, the firmware is buggy, the drivers are buggy, every
           | layer above it (compilers, kernels, multiple layers of
           | frameworks) also all buggy. Add on top of that everyone is
           | implementing as fast as they can, every part is changing
           | constantly, and also a good chunk of the code being used is
           | being written by researchers and grad students. Oh, and the
           | hardware changes dramatically every couple years. Sure all
           | tech is swiss cheese, but this is a particularly
           | unstable/treacherous version of that atm.
           | 
           | BTW, shareholders obviously have no idea whether what's being
           | sold works well or not, but there's no financial incentive
           | (in terms of selling stuff/profiting) if your product doesn't
           | even work/work well enough to compete. And if it were so
           | easy, you would see any number of hardware startups take over
           | from the incumbents. Sure there are network effects, but if I
           | could run PyTorch (or really, just train and inference my
           | models easily) faster and cheaper, me, and I assume lots of
           | other people would switch in a heartbeat. The fact that no
           | one (besides Google arguably if you count having to switch
           | all your implementations over for TPUs) has been able to do
           | it, whether they be startups or multi-billion dollar
           | companies (including Amazon, Meta, and Microsoft, all who
           | have direct financial incentive to and have been trying),
           | points to the problem being elsewhere.
        
             | benreesman wrote:
             | I'm a bit out of my area on the details: but I have CUDA
             | kernels in PyTorch and I've moved around much, much bigger
             | codebases. I write kernels and build others.
             | 
             | I don't doubt that it's a lot of work to maintain a clean
             | userland and present a good interface with clean link ABIs.
             | The people who pull it off and make it look easy are called
             | names like Linus.
             | 
             | But the "everything is buggy" argument is an argument about
              | institutions, not software artifacts. We know how to
              | throw tests and fuzzing at a component and bang it into
              | shape. Now it is a question of both resources and
              | intention.
             | 
             | But NVIDIA's market cap is like 4 trillion bucks or
             | something.
             | 
             | I want perfect or a firing squad at that number.
        
               | benreesman wrote:
               | Capitalism sounds dope. I hope I live to see it.
        
               | saagarjha wrote:
               | Why are you replying to yourself?
        
               | benreesman wrote:
               | Because unlike an edit, it's timestamped.
        
       | lanza wrote:
       | > Llama 3.1 405B can currently replace junior engineers
       | 
       | lol
        
         | FridgeSeal wrote:
         | Plain grift, or are they high on their own supply?
        
           | datadrivenangel wrote:
           | Both? Both is good.
        
         | ponector wrote:
          | Can an LLM join a standup call? Can an LLM create a merge
          | request?
         | 
          | At the moment it looks like an experienced engineer can
          | pressure an LLM into hallucinating junior-level code.
        
           | Philpax wrote:
           | The argument is that, instead of hiring a junior engineer, a
           | senior engineer can simply produce enough output to match
           | what the junior would have produced and then some.
           | 
           | Of course, that means you won't be able to train them up, at
           | least for now. That being said, even if they "only" reach the
           | level of your average software developer, they're already
           | going to have pretty catastrophic effects on the industry.
           | 
           | As for automated fixes, there are agents that _can_ do that,
           | like Devin (https://devin.ai/), but it's still early days and
           | bug-prone. Check back in a year or so.
        
             | newsclues wrote:
             | Not training new workers and relying on senior engineers
             | with tools is short sighted and foolish.
             | 
             | LLMs seem to be accelerating the trend
        
               | baobabKoodaa wrote:
               | There's an incentive problem because the benefit from
               | training new workers is distributed across all companies
               | whereas the cost of training them is allocated to the
               | single company that does so
        
               | newsclues wrote:
               | Most broken systems have bad incentives.
               | 
               | Companies don't want to train people ($) because
               | employees with skills and experience are more valuable to
               | other companies because retention is also expensive.
               | 
               | We are not training AND retaining talent.
        
               | Philpax wrote:
               | On one hand, I somewhat agree; on the other hand, I think
               | LLMs and similar tooling will allow juniors to punch far
               | beyond their weight and learn and do things that they
                | would never have dreamed of before. As mentioned in
               | another comment, they're the teacher that never gets
               | tired and can answer any question (with the necessary
               | qualifications about correctness, learning the answer but
               | not the reasoning, etc)
               | 
               | It remains to be seen if juniors can obtain the necessary
               | institutional / "real work" experience from that, but
               | given the number of self-taught programmers I know, I
               | wouldn't rule it out.
        
               | newsclues wrote:
               | I think many people using llms are faking it and have no
               | interest in "making it".
               | 
               | It's not about learning for most.
               | 
                | While a small subset of intelligent and motivated
                | people will use tools to become better programmers,
                | a larger number of people will use the tools to
                | "cheat".
        
               | brookst wrote:
               | Tools are foolish? Like, should we remove all of the
               | other tools that make senior engineers more productive,
               | in favor of hiring more people to do those same tasks?
               | That seems questionable.
        
               | newsclues wrote:
               | Tools are great, but there is a way to learn the
               | fundamentals and progress through skills and technology.
               | 
               | Learn to do something manually and then learn the
               | technology.
               | 
               | Do you want engineers who are useless if their calculator
               | breaks or do you want someone who can fall back on pen
               | and paper and get the work done?
        
       | ksec wrote:
       | >Given that ARM is aggressively raising license prices and
       | royalty rates and has (metaphorically) nuked Qualcomm, RISC-V CPU
       | IP clearly has a bright future.
       | 
       | It is interesting that the world (or maybe the RISC-V world)
       | seems to think any company can simply break the contract and
       | agreement others signed while pretending they are still
       | righteous, at the same time thinking this won't happen to their
       | own IP.
       | 
       | Well I think we will all know soon.
        
         | klelatti wrote:
          | It is informative to note the contrast between the fact that
          | 22 companies (Microsoft, Google, Samsung, etc.) were all given
          | enough notice in advance of the Nuvia acquisition to be able
          | to come up with a quote endorsing it [1]
         | 
         | And the treatment of the 'partner' upon whose IP the deal
         | depends:
         | 
         | > Neither Qualcomm nor Nuvia provided prior notice of this
         | transaction to Arm [2]
         | 
         | [1] https://www.qualcomm.com/news/releases/2021/01/qualcomm-
         | acqu...
         | 
         | [2] From Arm's initial court submission.
        
           | brookst wrote:
           | I read that part differently. I didn't see it as a statement
           | about the legal merits of ARM's choices so much as the
           | pragmatic business outcomes. It is possible for an action to
           | be 100% legally correct _and_ a bad business move (see:
           | creative IP companies that sue their biggest fans).
           | 
           | Even granting that ARM is completely within their right, this
            | is a pretty big reminder of the dangers of building on sole-
           | source licensed IP when there's an open alternative. I have
           | little doubt that ARM drove some number of startups into
           | choosing RISC-V.
           | 
           | Which may or may not matter in the long run. But it is
           | rolling the dice and by definition a short term maximization
           | strategy.
        
             | klelatti wrote:
             | My comment wasn't really around legal or business merits
              | but rather the bad-faith nature of these actions, which
              | would apply whether using open or closed source IP.
             | 
              | However, the danger of building on sole-source licensed
              | IP is real in general but a bit overdone in this case.
             | Qualcomm had a large number of options using Arm IP
             | including developing their own cores in house. However they
             | chose to do something that was - on the face of it - in
             | breach of license agreements that Nuvia had signed.
        
               | MobiusHorizons wrote:
               | > including developing their own cores in house.
               | 
               | I believe ARM just revoked Qualcomm's architecture
               | license, so they no longer have that option if they want
               | to run aarch64.
               | 
               | The architecture license predated the Nuvia acquisition,
               | and was the license under which Qualcomm developed their
               | most recent laptop core.
        
               | adrian_b wrote:
               | Because the Qualcomm and Nuvia ALAs have remained secret,
               | we cannot know whether Arm really has the right to revoke
               | them at will.
               | 
                | It is hard to believe that anyone would agree to sign an
                | ALA that can easily be revoked, because whoever signs an
               | ALA intends to invest a lot of money developing products
               | based on it and there is no doubt that all the invested
               | money can be lost if the ALA is revoked unilaterally.
               | 
               | It remains to be seen what will happen at the trial.
        
           | adrian_b wrote:
           | The architecture license agreements between Qualcomm, Nuvia
           | and Arm have remained secret.
           | 
           | Both Qualcomm and Arm claim that they follow the requirements
           | of the existing ALAs, while the other party has breached
           | them.
           | 
           | Without access to the original texts, we cannot verify who
           | tells the truth.
           | 
            | Even when a judge gives a decision in this conflict, we
            | will not be able to know who is right if the ALAs remain
           | secret, because in recent years there have been plenty of
           | absurd judicial decisions, where judges have been obviously
           | wrong.
           | 
           | According to Qualcomm, they had no obligation to notify Arm
           | about their business intentions. While I do not like Qualcomm
            | for many reasons, for now there are no grounds to believe
            | the claims of Arm more than the claims of Qualcomm, so
            | either of the two could be right about this.
           | 
           | Perhaps Arm is right and Qualcomm must pay the royalties
           | specified in the Nuvia ALA instead of the royalties specified
           | in the Qualcomm ALA, which is the main subject of this
           | conflict, but if this is true then Nuvia has been conned by
           | Arm into signing a really disadvantageous ALA, which is
           | somewhat surprising, because among their business plan
            | variants there must have been one of being bought by some big
           | company, in which case their target should have been to
           | retain ownership of the IP developed by themselves.
           | 
           | There is no doubt that even if Qualcomm pays only the low
            | royalties specified in the Qualcomm ALA, Arm will obtain
            | revenue orders of magnitude greater than any revenue that
           | could have been obtained from Nuvia, had it not been bought
           | by Qualcomm.
           | 
           | The reason why Arm does not like this increased revenue from
           | the Nuvia work is that Qualcomm has decided to replace all
           | the cores licensed from Arm in all of their products with
           | cores developed by the Nuvia team.
           | 
           | Thus the loss of the royalties for the Arm-designed cores
           | will be even greater than the increased revenue from the ALA
           | royalties.
           | 
           | So Arm attempts to use or abuse whatever was written in their
           | ALAs in order to prevent competition in the cores
           | implementing the Arm architecture.
           | 
           | Even if Arm had been cunning enough to have included
           | paragraphs in their ALAs that justify their current requests,
            | there is no doubt that the real villain in this conflict
            | is Arm, who is fighting to force the use of CPU cores that
            | are weaker and more expensive than what is possible.
        
             | Voultapher wrote:
              | I think you got it the wrong way around. Nuvia allegedly
              | got a really good ALA because they targeted servers and
              | were small, but then Qualcomm bought Nuvia and wants to
              | produce smartphone chips with the cheaper Nuvia ALA, based
              | on Nuvia IP. Arm argues they have to use the more
              | expensive Qualcomm ALA.
        
               | klelatti wrote:
               | No, the parent comment is correct (on the relative fees
               | at least). Qualcomm wants to sell Nuvia-derived cores
               | under its own ALA at a lower fee than Nuvia would have
               | had to pay (which makes sense as server cores cost more
               | than smartphone cores).
        
             | klelatti wrote:
             | > Nuvia has been conned by Arm
             | 
             | > So Arm attempts to use or abuse whatever was written in
             | their ALAs
             | 
             | > Even if Arm had been cunning enough to have included
             | paragraphs in their ALAs
             | 
             | Clauses requiring consent for transfer of IP rights (or on
             | change of control) are standard everywhere and Nuvia would
             | have been fully aware of this at the time of signing the
             | ALA.
        
             | ksec wrote:
              | > where judges have been obviously wrong.
             | 
              | True. But if Qualcomm thinks it is not playing any
              | tricks and is playing by the rules clearly stated in the
              | contract, you can bet it will appeal and fight on even if
              | it loses.
             | 
              | A lot of juicy details will come out during the trial,
              | just like in the Apple vs. Qualcomm case (where Apple was
              | clearly in the wrong). We will all know soon and can make
              | up our own judgement.
        
         | fidotron wrote:
          | There seems to be a growing awareness in the RISC-V world
          | that they have made the mistake of being generals refighting
          | their previous war. Things have moved on without them
          | noticing, but unfortunately they have painted themselves into
          | a corner of inappropriate technical decisions, and this is
          | manifesting in comments like those quoted.
        
           | mnau wrote:
           | What do you mean by "inappropriate technical decisions"?
        
             | fidotron wrote:
             | That is the wrong question.
             | 
              | The right question is how things have moved on; then you
              | can reevaluate the technical choices and understand why
              | they are inappropriate.
        
               | panick21_ wrote:
               | The right question is 'what are you even talking about?'.
        
               | fidotron wrote:
               | A serious answer as to why I didn't answer the question
               | directly?
               | 
               | The RISC-V community is dominated by a culture that is
               | fighting the semiconductor war of 20 years ago.
               | Consequently their core ideas are those that were in play
               | then, and they remain steadfastly attached to them. In
               | this case the sea of baby cores is actually an ideal
               | application of RISC-V.
               | 
               | You cannot point out the core problem for a multitude of
               | reasons, one of which is that if you did it would just
               | turn to "prove it", and honestly another is that even
               | though many of us would like an alternative ISA to
               | succeed we do kind of enjoy watching people like SiFive
               | spin round in circles. A strongly related problem is the
               | likes of SiFive are promoting premature ossification
               | before the thing has evolved to practical utility.
        
               | GregarianChild wrote:
               | I agree that the RISC-V ecosystem has some issues that I
               | hope will be sorted out. But not all RISC-V cores are
               | "baby cores". I imagine that XiangShan [1], albeit
               | student work, will spur work towards bigger, more
               | performant OOO cores.
               | 
               | [1] https://github.com/OpenXiangShan/XiangShan
        
           | MobiusHorizons wrote:
           | Since you haven't said anything specific, it's hard to
           | interact with the comment without guessing at what you mean.
           | But I'll give it a go:
           | 
           | > generals refighting their previous war
           | 
           | Here I think you are talking about the risc vs cisc debate.
           | It is true that in modern OOO cores this is basically
           | meaningless. I think the only truly relevant piece is fixed
           | encoding widths, which make wider decoders feasible on a
           | lower power budget.
           | 
           | On the other hand I think risc-v owes a lot of its popularity
           | to how easy it is to understand and implement a bare bones
           | core. Marketing will be a huge challenge for any newcomer,
           | but risc-v is doing very well on that score.
           | 
           | The place I think risc-v has made really good innovations in
           | ISA design is in the vector extension. It seems to me that it
           | allows for code to be written for larger vectors and the
           | machine can apply whatever vector width it has available. I
           | believe this should allow new cores to improve the
           | performance of old code in a way that more explicit designs
           | like AVX have struggled with.
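The vector-length-agnostic idea can be sketched as a toy Python model (hedged: `vsetvli` here is a pure-Python stand-in for the RVV instruction of the same name, and `vlen_elems` is a hypothetical per-core vector length; real code would use assembly or intrinsics):

```python
def vsetvli(requested: int, vlen_elems: int) -> int:
    """Stand-in for RVV's vsetvli: the core grants up to its native
    vector length, so the same loop adapts to any hardware VLEN."""
    return min(requested, vlen_elems)

def vec_add(a, b, vlen_elems=4):
    """Strip-mined element-wise add, agnostic to vector length."""
    out, i, n = [], 0, len(a)
    while i < n:
        vl = vsetvli(n - i, vlen_elems)  # elements granted this pass
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        i += vl
    return out
```

The point is that `vec_add` produces identical results whether `vlen_elems` is 2, 4, or 128; a wider future core simply takes fewer trips around the loop, with no recompilation.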
        
             | zozbot234 wrote:
             | AIUI the jury is still out as to whether wider decoders are
             | feasible with RISC-V's 16-bit and 32-bit split. It's
             | specifically designed to ensure that insn lengths and
             | positions can be decoded as easily as possible, without
             | introducing overly inflexible notions such as VLIW-like
             | "bundles".
        
               | ironhaven wrote:
                | Modern x86 cores can decode 9 variable-length
                | instructions per clock. It's not easy, but it can be
                | done.
                | 
                | RISC-V C instructions, on the other hand, are trivial:
                | all compressed instructions are 1:1 translatable to
                | uncompressed instructions. People have measured the
                | cost of adding C-extension decoding as being a few
                | hundred gates.
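The triviality of finding instruction boundaries can be sketched directly: per the RISC-V base spec, the two lowest bits of an instruction's first halfword distinguish a 16-bit compressed instruction from a standard 32-bit one (a toy Python sketch that ignores the reserved longer encodings):

```python
def insn_length(low_half: int) -> int:
    """Byte length of a RISC-V instruction from its first 16 bits.

    If the two lowest bits are not 0b11, it is a 16-bit compressed (C)
    instruction; 0b11 means a standard 32-bit instruction. (Reserved
    48-bit and longer formats are ignored here for simplicity.)
    """
    return 2 if (low_half & 0b11) != 0b11 else 4
```

Because each instruction's length is self-describing from its first halfword, a wide decoder can find all boundaries in a fetch block with a short parallel pass, which is why the added decode cost is so small.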
        
         | wmf wrote:
         | _It is interesting the world seems to think any company can
         | simply break the contract and agreement others signed, while
         | pretending they are still righteous._
         | 
         | I love this comment because it could apply to both sides of the
         | case. However the dispute is a little more complex than most
         | commentary implies.
        
       | tails4e wrote:
       | One concern I've heard is that Llama can be a great accelerator
       | for senior engineers who know what they want. However, for
       | junior engineers it could hold back their learning, as they just
       | use whatever it gives without really understanding what was done
       | or why. It seems plausible to me. At least using Stack Overflow,
       | people get explanations of why you do x to get y; it's a bit
       | copy-paste, sure, but the pace brought learning. Now if you type
       | a comment and get code, a junior engineer may not even read
       | what's generated, as 'they don't need to, it usually works'...
       | Any ideas how we can ensure junior engineers do effectively
       | learn and understand what is going on, while still getting the
       | LLM benefit?
        
         | ripped_britches wrote:
         | I've been coding for a long time and just became very
         | proficient in dart this year just by using LLM and asking it to
         | explain anything I don't know. I haven't been on Stack Overflow
         | in 2 years and it's not because I'm taking shortcuts by letting
         | the LLM write my code, it's because it is the best teacher /
         | paired programmer that will just sit with you with endless
         | patience. Especially using @docs with cursor and @directory
         | context.
        
           | zkry wrote:
           | > I've been coding for a long time...
           | 
            | Having been coding for a long time, I don't think you
           | fall into the same category. Dart having paradigms not too
           | different from other standard languages, a lot of these
           | skills are probably transferable.
           | 
           | I've seen beginners on the other hand using LLMs who couldn't
           | even write a proper for-loop without AI assistance. They lack
           | the fundamental ability to "run code in their head." This
           | type of person I feel would be utterly limited by the
           | capabilities of the AI model and would fail in lockstep with
           | it.
        
             | brookst wrote:
             | This is kind of the classic "kids these days" argument:
             | that because we understand something from what we consider
             | the foundational level up to what we consider the goal
             | level, anyone who comes along later and starts at higher
             | levels will be limited / less capable / etc.
             | 
             | It is true, but also irrelevant. Just like most programmers
             | today do not need to understand CPU architecture or even
             | assembly language, programmers in the future will not need
             | to understand for loops the way we do.
             | 
             | They will get good at writing LLM-optimized specifications
             | that produce the desired results. And that will be fine,
             | even if we old-timers bemoan that they don't really program
             | the way we do.
             | 
             | Yes, the abstractions required will be inefficient. And we
             | will always be able to say that when those kids solve our
             | kinds of problems, our methods are better. Just like
             | assembly programmers can look at simple python programs and
             | be astounded at the complexity "required" to solve simple
             | problems.
        
               | zkry wrote:
                | I agree with this take, actually. I can imagine how
                | programming in the future could consist mostly of
                | interactions with LLMs. Such interactions would probably
                | be constrained enough to get the success rate of LLMs
               | sufficiently high, maybe involving specialized DSLs made
               | for LLMs.
               | 
               | I do think the future may be more varied. Just like today
               | where I look at kernel/systems/DB engineering and it
               | seems almost arcane to me, I feel like there will be
               | another stratum created, working on things where LLMs
               | don't suffice.
               | 
               | A lot of this will also depend on how far LLMs get. I
               | would think that there would have to be more ChatGPT-like
               | breakthroughs before this new type of developer can come.
        
               | rafaelmn wrote:
               | I feel like you underestimate how much effort goes into
               | making CPUs reliable and how low level/well specified the
               | problem of building a VM/compiler is compared to getting
                | a natural-language specification to an executable
                | program. Solving that reliably is basically AGI - I
                | doubt there
               | will be many humans in the loop if we reach that point.
        
           | rafaelmn wrote:
           | I am in a similar situation where I needed to jump in on a
           | python Django codebase for a side gig I'm helping a friend
           | out on. Since I have no interest in using Django in the
            | future I decided to Claude my way through the project - I did
           | use Django like 10 years ago, am fairly competent with python
           | and used rails recently - so I thought how bad can this be.
           | By the time I had something cobbled together with Claude I
           | decided to get a friend who's competent with Django to review
           | my code and boy did I feel like an amateur. Not only did the
           | code not use good practices/patterns (eg. not even using
           | viewsets from DRF), I couldn't even follow in the
           | conversation because I knew nothing about these concepts. So
           | I spent a day reading docs and looking at a standard example
            | of an idiomatic setup, and it made my project a lot better.
           | I've had this experience at multiple points in the project -
           | not reading the API docs and letting Claude walk me through
            | integration left me stuck when Claude failed, as did
            | design decisions.
           | 
           | So I would say Claude is useful for simple execution when you
           | know what you are expecting, relying on it to learn sounds
           | like a short term gain for problems down the line. At a point
            | where LLMs can reliably look up sources and reason through
           | something to explain it there will be no coding left, but I
           | feel we aren't close to that with current tech.
        
         | lionkor wrote:
         | Probably by not letting them check in code they don't
         | understand. This creates a lot of overhead, but it turns out
         | you simply can't skip that time investment. Somewhere along the
         | junior->senior developer (in terms of skill, not time) is a big
         | time cost of just sitting down and learning. You can pretend to
         | skip it with LLMs, but you're just delaying having to
         | understand it.
         | 
         | So the benefit of LLMs is negligible in the long term, for
         | beginner/junior programmers, as they essentially collect
         | knowledge debt. Don't let your juniors use LLMs, or if you do,
         | make sure they can explain every little detail about the code
         | they have written. You don't have to be annoying about it - ask
         | socratically.
        
         | blueboo wrote:
          | Despite your observation that SO has explanations, SO can be
          | and has also been used as a zero-learning crutch.
         | 
         | Similarly, LLMs can be used as educational tools.
         | 
         | In the end, learning and self-improvement needs _some_ non-
         | trivial motivation from the individual.
         | 
         | The answer to your question is to show them it's valuable to
         | learn. If they find they're completing their tasks adequately
         | from AI assistance, then give them harder tasks. Meanwhile,
         | model how your own learning effort is paying off.
         | 
         | Note how if they are left with AI-trivial tasks and the benefit
         | of learning remains abstract, we shouldn't expect anything to
         | change.
        
       | zozbot234 wrote:
       | It's not really "AI hardware", it's just HPC on a slightly
       | smaller scale than the traditional sort. And it's still going to
       | be useful for plenty of workloads regardless of the current AI
       | frenzy.
        
         | latchkey wrote:
         | Like what workloads? Genuinely curious what you'd run on it.
        
       | Animats wrote:
       | > Tenstorrent has an implicit view that the future of AI is
       | mixed-workload, not pure linear algebra spam. Yes, MATMUL go
       | BRRRRRR is valuable, but CPU workloads will be needed in the
       | future. That is the hope. So far, this has not played out.
       | 
       | At least not in the training end of machine learning, which is
       | the big money sink.
       | 
       | What's striking about this whole business is that it's still mostly
       | hammering on a decades-old idea with more and more hardware and
       | data. That worked great for a few years, but now seems to be
       | reaching its limits. What's next?
        
       ___________________________________________________________________
       (page generated 2024-12-15 23:01 UTC)