[HN Gopher] Tabby: Self-hosted AI coding assistant
___________________________________________________________________
Tabby: Self-hosted AI coding assistant
Author : saikatsg
Score : 340 points
Date   : 2025-01-12 18:43 UTC (1 day ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| thecal wrote:
| Unfortunate name. Can you connect Tabby to the OpenAI-compatible
| TabbyAPI? https://github.com/theroyallab/tabbyAPI
| mbernstein wrote:
| At least per GitHub, the TabbyML project is older than the
| TabbyAPI project.
| mynameisvlad wrote:
| Also, _wildly_ more popular, to the tune of several
| magnitudes more forks and stars. If anything, this question
| should be asked of the TabbyAPI project.
| karolist wrote:
| I'm not sure what's going on with TabbyAPI's GitHub
| metrics, but exl2 quants are very popular among the Nvidia
| local LLM crowd, and TabbyAPI comes up in tons of reddit
| posts from people using it. Might just be my bubble, and I'm
| not saying the metrics are inaccurate; I'm just generally
| surprised such a useful project has under 1k stars. On the
| flip side, LLMs will hallucinate about TabbyML if you ask
| them TabbyAPI-related questions, so I'd agree the naming is
| unfortunate.
| Medox wrote:
| I thought that Tabby, the SSH client [1], had gotten AI
| capabilities...
|
| [1] https://github.com/Eugeny/tabby
| wsxiaoys wrote:
| Never imagined our project would make it to the HN front page on
| Sunday!
|
| Tabby has undergone significant development since its launch two
| years ago [0]. It is now a comprehensive AI developer platform
| featuring code completion and a codebase chat, with a team [1] /
| enterprise focus (SSO, Access Control, User Authentication).
|
| Tabby's adopters [2][3] have discovered that Tabby is the only
| platform providing a fully self-service onboarding experience as
| an on-prem offering. It also delivers performance that rivals
| other options in the market. If you're curious, I encourage you
| to give it a try!
|
| [0]: https://www.tabbyml.com
|
| [1]: https://demo.tabbyml.com/search/how-to-add-an-embedding-
| api-...
|
| [2]: https://www.reddit.com/r/LocalLLaMA/s/lznmkWJhAZ
|
| [3]: https://www.linkedin.com/posts/kelvinmu_last-week-i-
| introduc...
| maille wrote:
| Do you have a plugin for MSVC?
| wsxiaoys wrote:
| Not yet. Consider subscribing to
| https://github.com/TabbyML/tabby/issues/322 for future
| updates!
| somberi wrote:
| https://github.com/codespin-ai/codespin-vscode-extension
| tootie wrote:
| Is it only compatible with Nvidia and Apple? Will this work
| with an AMD GPU?
| wsxiaoys wrote:
| Yes - AMD GPUs are supported through the Vulkan backend:
|
| https://github.com/TabbyML/tabby/releases/tag/v0.23.0
|
| https://tabby.tabbyml.com/blog/2024/05/01/vulkan-support/
| thih9 wrote:
| As someone unfamiliar with local AIs and eager to try, how does
| the "run tabby in 1 minute"[1] compare to e.g. chatgpt's free
| 4o-mini? Can I run that docker command on a medium specced
| macbook pro and have an AI that is comparably fast and capable?
| Or are we not there (yet)?
|
| Edit: looks like there is a separate page with instructions for
| macbooks[2] that has more context.
|
| > The compute power of M1/M2 is limited and is likely to be
| sufficient only for individual usage. If you require a shared
| instance for a team, we recommend considering Docker hosting with
| CUDA or ROCm.
|
| [1]: https://github.com/TabbyML/tabby#run-tabby-in-1-minute
| docker run -it --gpus all -p 8080:8080 -v $HOME/.tabby:/data \
|   tabbyml/tabby serve --model StarCoder-1B --device cuda \
|   --chat-model Qwen2-1.5B-Instruct
|
| [2]: https://tabby.tabbyml.com/docs/quick-
| start/installation/appl...
| eric-burel wrote:
| Side question: open-source models tend to be less "smart" than
| private ones; do you intend to compensate by providing better
| context (e.g. querying relevant technology docs to feed the
| context)?
| coder543 wrote:
| gpt-4o-mini might not be the best point of reference for what
| good LLMs can do with code:
| https://aider.chat/docs/leaderboards/#aider-polyglot-benchma...
|
| A teeny tiny model such as a 1.5B model is really dumb, and
| _not good_ at interactively generating code in a conversational
| way, but models 3B or smaller can do a good job of
| suggesting tab completions.
|
| There are larger "open" models (in the 32B - 70B range) that
| you can run locally that should be much, much better than
| gpt-4o-mini at just about everything, including writing code.
| For a few examples, llama3.3-70b-instruct and
| qwen2.5-coder-32b-instruct are pretty good. If you're really
| pressed for RAM, qwen2.5-coder-7b-instruct or codegemma-7b-it
| might be okay for some simple things.
|
| > medium specced macbook pro
|
| medium specced doesn't mean much. How much RAM do you have?
| Each "B" (billion) of parameters is going to require _about_
| 1GB of RAM, as a rule of thumb. (500MB for really heavily
| quantized models, 2GB for un-quantized models... but, 8-bit
| quants use 1GB, and that's usually fine.)
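|
| As a rough sketch of that arithmetic (a hedged example; the
| numbers are just the rule of thumb above, and context/KV
| cache adds more on top):
|
|     def estimate_weight_ram_gb(params_billions, bits_per_weight=8):
|         """GB needed just for the model weights at a given quant."""
|         return params_billions * bits_per_weight / 8
|
|     for b in (1.5, 7, 32, 70):
|         print(f"{b}B: ~{estimate_weight_ram_gb(b):.1f} GB @ 8-bit, "
|               f"~{estimate_weight_ram_gb(b, 4):.1f} GB @ 4-bit")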
| eurekin wrote:
| Also, context size significantly impacts RAM/VRAM usage, and
| in programming those chats get big quickly.
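|
| A back-of-envelope sketch of that cost (assuming illustrative
| Llama-2-7B-like dimensions: 32 layers, 32 KV heads, head dim
| 128, fp16; GQA models keep fewer KV heads and pay
| proportionally less):
|
|     def kv_cache_gib(ctx_len, layers=32, kv_heads=32,
|                      head_dim=128, bytes_per_elem=2):
|         # Factor of 2 is for the K and V tensors per layer.
|         per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem
|         return ctx_len * per_token / 2**30
|
|     for ctx in (2048, 8192, 32768):
|         print(f"{ctx} tokens -> {kv_cache_gib(ctx):.1f} GiB of KV cache")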
| Ringz wrote:
| Thanks for your explanation! Very helpful!
| mjrpes wrote:
| What is the recommended hardware? GPU required? Could this run OK
| on an older Ryzen APU (Zen 3 with Vega 7 graphics)?
| wsxiaoys wrote:
| Check https://www.reddit.com/r/LocalLLaMA/s/lznmkWJhAZ to see a
| local setup with 3090.
| mkl wrote:
| That thread doesn't seem to mention hardware. It would be
| really helpful to just put hardware requirements in the
| GitHub README.
| coder543 wrote:
| The usual bottleneck for self-hosted LLMs is memory bandwidth.
| It doesn't really matter if there are integrated graphics or
| not... the models will run at the same (very slow) speed on
| CPU-only. Macs are only decent for LLMs because Apple has given
| Apple Silicon unusually high memory bandwidth, but they're
| still nowhere near as fast as a high-end GPU with _extremely_
| fast VRAM.
|
| For extremely tiny models like you would use for tab
| completion, even an old AMD CPU is probably going to do okay.
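|
| A crude sketch of that bottleneck (single-stream decoding has
| to read every weight once per token, so bandwidth / model size
| gives an upper bound; the bandwidth figures are approximate):
|
|     def max_tokens_per_sec(model_size_gb, bandwidth_gb_per_s):
|         """Optimistic ceiling on decode speed for one stream."""
|         return bandwidth_gb_per_s / model_size_gb
|
|     # A 7B model at 8-bit is roughly 7 GB of weights:
|     for name, bw in [("dual-channel DDR4", 50),
|                      ("Apple M3 Max", 400),
|                      ("RTX 4090", 1008)]:
|         print(f"{name}: <= {max_tokens_per_sec(7, bw):.0f} tok/s")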
| mjrpes wrote:
| Good to know. It also looks like you can host TabbyML as an
| on-premise server with docker and serve requests over a
| private network. Interesting to think that a self-hosted GPU
| server might become a thing.
| jslakro wrote:
| Duplicated https://news.ycombinator.com/item?id=35470915
| mkl wrote:
| Not a dupe, as that was nearly two years ago.
| https://news.ycombinator.com/newsfaq.html#reposts
| jslakro wrote:
| In that case I'm going to start reposting all good old links.
| leke wrote:
| So does this run on your personal machine, or can you install it
| on a local company server and have everyone in the company
| connect to it?
| wsxiaoys wrote:
| Tabby is engineered for team usage, intended to be deployed on
| a shared server. However, with robust local computing
| resources, you can also run Tabby on your individual machine.
| Check https://www.reddit.com/r/LocalLLaMA/s/lznmkWJhAZ to see a
| local setup with 3090.
| d--b wrote:
| Didn't you mean to name it Spacey?
| szszrk wrote:
| I've heard of tab vs spaces flamewars, but never heard of
| "space completion" camp.
|
| It's clearly a reference to the LLM doing tab completion.
| d--b wrote:
| WHAT? TAB COMPLETION? YOU CANT BE SERIOUS.
|
| Just joking. But yeah, Space completion is definitely a
| thing. Also triggering suggestions is often Ctrl+Space
| szszrk wrote:
| "Ctrl+space" should be air/space related software house
| name :)
| thedangler wrote:
| How would I tell this to use an api framework it doesn't know ?
| wsxiaoys wrote:
| Tabby comes with built-in RAG support, so you can add this API
| framework's docs to it.
|
| Example: https://demo.tabbyml.com/search/how-to-configure-sso-
| in-tabb...
|
| Settings page: https://demo.tabbyml.com/settings/providers/doc
| SOLAR_FIELDS wrote:
| > How to utilize multiple NVIDIA GPUs?
|
| | Tabby only supports the use of a single GPU. To utilize
| multiple GPUs, you can initiate multiple Tabby instances and set
| CUDA_VISIBLE_DEVICES (for cuda) or HIP_VISIBLE_DEVICES (for rocm)
| accordingly.
|
| So using 2 NVLinked GPUs for inference is not supported? Or is
| that situation different because NVLink treats the two GPUs as
| a single one?
| wsxiaoys wrote:
| > So using 2 NVLinked GPUs for inference is not supported?
|
| To make better use of multiple GPUs, we suggest employing a
| dedicated backend for serving the model. Please refer to
| https://tabby.tabbyml.com/docs/references/models-http-api/vl...
| for an example
| SOLAR_FIELDS wrote:
| I see. So it's like this: I can have Tabby be my LLM server
| with this limitation, or I can just turn that feature off
| and point Tabby at my self-hosted LLM as any other
| OpenAI-compatible endpoint?
| wsxiaoys wrote:
| Yes - however, the FIM model requires careful configuration
| to properly set the prompt template.
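|
| For illustration, this is roughly what a FIM template does,
| using CodeLlama-style sentinel tokens as an example (each
| model family trains on different tokens, which is why the
| template must match the model exactly):
|
|     # Hedged sketch: build a fill-in-the-middle prompt from the
|     # text before and after the cursor.
|     def build_fim_prompt(prefix: str, suffix: str) -> str:
|         return f"<PRE> {prefix} <SUF>{suffix} <MID>"
|
|     prompt = build_fim_prompt(
|         prefix="def find_max(arr):\n    ",
|         suffix="\n    return result",
|     )
|     # The completion model fills in the middle and stops at its
|     # end-of-insertion token.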
| KronisLV wrote:
| For something similar I use Continue.dev with ollama, it's always
| nice to see more tools in the space! But as usual, you need
| pretty formidable hardware to run the actually good models, like
| the 32B version of Qwen2.5-coder.
| st3fan wrote:
| The demo on the homepage for the completion of the findMaxElement
| function is a good example of what is to come. Or maybe where we
| are at now?
|
| The six lines of Python suggested for that function can also be
| replaced with a simple "return max(arr)". The suggested code
| works but is absolute junior level.
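|
| For reference, a sketch of the contrast (assuming the demo's
| completion looks something like this):
|
|     # The kind of six-line completion the demo suggests:
|     def findMaxElement(arr):
|         max_element = arr[0]
|         for element in arr:
|             if element > max_element:
|                 max_element = element
|         return max_element
|
|     # The idiomatic one-liner it could have been:
|     def findMaxElement(arr):
|         return max(arr)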
|
| I am terrified of what is to come. Not just horrible code but
| also how people who blindly "autocomplete" this code are going to
| stall in their skill level progress.
|
| You may score some story points but did you actually get any
| better at your craft?
| MaKey wrote:
| The silver lining is that the value of your skills is going up.
| shcheklein wrote:
| On the other hand it might become a next level of abstraction.
|
| Machine -> Asm -> C -> Python -> LLM (Human language)
|
| It compiles a human prompt into some intermediate code (in
| this case Python). The initial version of CPython probably
| wasn't perfect either, and engineers were terrified then too.
| If we are lucky, this new "compiler" will keep getting better
| and more efficient. Never perfect, but people will be paying
| the same price they already pay for not dealing directly
| with ASM.
| MVissers wrote:
| Yup!
|
| No goal to become a programmer- But I like to build programs.
|
| Built a rather complex AI-ecosystem simulator with me as the
| director and GPT-4, now Claude 3.5, as the programmer.
|
| Would never have been able to do this beforehand.
| saurik wrote:
| I think there is a big difference between an abstraction
| layer that can improve -- one where you maybe write "code" in
| prompts and then have a compiler build through real code,
| allowing that compiler to get better over time -- and an
| interactive tool that locks bad decisions autocompleted today
| into both your codebase and your brain, involving you still
| working at the lower layer but getting low quality "help" in
| your editor. I am totally pro-compilers and high-level
| languages, but I think the idea of writing assembly with the
| help of a partial compiler where you kind of write stuff and
| then copy/paste the result into your assembly file with some
| munging to fix issues is dumb.
|
| By all means, though: if someone gets us to the point where
| the "code" I am checking in is a bunch of English -- for
| which I will likely need a law degree in addition to an
| engineering background to not get evil genie with a cursed
| paw results from it trying to figure out what I must have
| meant from what I said :/ -- I will think that's pretty cool
| and will actually be a new layer of abstraction in the same
| class as compiler... and like, if at that point I don't use
| it, it will only be because I think it is somehow dangerous
| to humanity itself (and even then I will admit that it is
| probably more effective)... but we aren't there yet and
| "we're on the way there" doesn't count anywhere near as much
| as people often want it to ;P.
| sdesol wrote:
| > Machine -> Asm -> C -> Python -> LLM (Human language)
|
| Something that you neglected to mention is, with every
| abstraction layer up to Python, everything is predictable and
| repeatable. With LLMs, we can give the exact same
| instructions, and not be guaranteed the same code.
| 12345hn6789 wrote:
| assuming you have full control over which compiler you're
| using for each step ;)
|
| What's to say LLMs will not have a "compiler" interface in
| the future that will rein in their variance
| sdesol wrote:
| > assuming you have full control over which compiler
| you're using for each step ;)
|
| With existing tools, we know if we need to do something,
| we can. The issue with LLMs, is they are very much black
| boxes.
|
| > What's to say LLMs will not have a "compiler" interface
| in the future that will rein in their variance
|
| Honestly, having a compiler interface for LLMs isn't a
| bad idea...for some use cases. What I don't see us being
| able to do is use natural language to build complex apps
| in a deterministic manner. Solving this problem would
| require turning LLMs into deterministic machines, which I
| don't believe will be an easy task, given how LLMs work
| today.
|
| I'm a strong believer in that LLMs will change how we
| develop and create software development tools. In the
| past, you would need Google and Microsoft level of
| funding to integrate natural language into a tool, but
| with LLMs, we can easily have LLMs parse input and have
| it map to deterministic functions in days.
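|
| To sketch the idea (hypothetical names; the LLM call could be
| any provider's tool/function-calling API): the model only
| picks a function and its arguments, and all real work stays
| in ordinary, testable, deterministic code.
|
|     import json
|
|     def search_code(query: str) -> list[str]:
|         """Deterministic: search an index, same output every time."""
|         ...
|
|     def rename_file(old: str, new: str) -> None:
|         """Deterministic: plain filesystem operation."""
|         ...
|
|     TOOLS = {"search_code": search_code, "rename_file": rename_file}
|
|     def dispatch(llm_output: str):
|         # llm_output is assumed to be the model's JSON tool call,
|         # e.g. '{"name": "search_code", "args": {"query": "max"}}'
|         call = json.loads(llm_output)
|         return TOOLS[call["name"]](**call["args"])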
| theptip wrote:
| I'm not sure why that matters here. Users want code that
| solves their business need. In general most don't care
| about repeatability if someone else tries to solve their
| problem.
|
| The question that matters is: can businesses solve their
| problems cheaper for the same quality, or at lower quality
| while beating the previous Pareto-optimal cost/quality
| frontier.
| thesz wrote:
| Recognizable repetition can be abstracted, reducing code
| base and its (running) support cost.
|
| The question that matters is: will businesses crumble due
| to overproduction of same (or lower) quality code sooner
| or later.
| chii wrote:
| > The question that matters is: will businesses crumble
| due to overproduction of same (or lower) quality code
| sooner or later.
|
| but why doesn't that happen today? Cheap code can be had
| by hiring in cheap locations (outsourced for example).
|
| The reality is that customers are the ultimate arbiters,
| and if it satisfies them, the business will not collapse.
| And i have not seen a single customer demonstrate that
| they care about the quality of the code base behind the
| product they enjoy paying for.
| sdesol wrote:
| > Cheap code can be had by hiring in cheap locations
| (outsourced for example).
|
| If you outsource and like what you get, you would assume
| the place you outsourced to can help provide continued
| support. What assurance do you have with LLMs? A working
| solution doesn't mean it can be easily maintained and/or
| evolved.
|
| > And i have not seen a single customer demonstrate that
| they care about the quality of the code base behind the
| product they enjoy paying for.
|
| That is true, but they will complain if bugs cannot be
| fixed and features cannot be added. It is true that customers
| don't care, and they shouldn't, until it does matter, of
| course.
|
| The challenge with software development isn't necessarily
| with the first iteration, but rather it is with continued
| support. Where I think LLMs can really shine is in
| providing domain experts (those who understand the
| problem) with a better way to demonstrate their needs.
| OvbiousError wrote:
| It is happening. There is a lot of bad software out
| there. Terrible to use, but still functional enough that
| it keeps selling. The question is how much crap you can
| pile on top of that already bad code before it falls
| apart.
| carschno wrote:
| It happens today. However, companies fail for multiple
| problems that come together. Bad software quality (from
| whatever source) is typically not a very visible one
| among them because when business people take over, they
| only see (at most) that software development/maintenance
| costs more money than it yields.
| thesz wrote:
| > And i have not seen a single customer demonstrate that
| they care about the quality of the code base behind the
| product they enjoy paying for.
|
| The code quality translates to speed of introduction of
| changes, fixes of defects and amount of user-facing
| defects.
|
| While customers may not express any care about code
| quality directly they can and will express
| (dis)satisfaction with performance and defects of the
| product.
| CamperBob2 wrote:
| _Recognizable repetition can be abstracted_
|
| ... which is the whole idea behind training, isn't it?
|
| _The question that matters is: will businesses crumble
| due to overproduction of same (or lower) quality code
| sooner or later._
|
| The problem is really the opposite -- most programmers
| are employed to create _very_ minor variations on work
| done either by other programmers elsewhere, by other
| programmers in the same organization, or by their own
| younger selves. The resulting inefficiency is massive in
| human terms, not just in managerial metrics. Smart people
| are wasting their lives on pointlessly repetitive work.
|
| When it comes to the art of computer programming, there
| are more painters than there are paintings to create.
| That's why a genuinely-new paradigm is so important, and
| so overdue... and it's why I get so frustrated when
| supposed "hackers" stand in the way.
| thesz wrote:
| >> Recognizable repetition can be abstracted
|
| > ... which is the whole idea behind training, isn't it?
|
| The comment I was answering specifically dismissed LLM's
| inability to answer same question with same... answer as
| unimportant. My point is that this ability is crucial to
| software engineering - answers to similar problems should
| be as similar as possible.
|
| Also, I bet that LLMs are not trained to abstract. In my
| experience, they are lately trained to engage users in
| pointless dialogue for as long as possible.
| CamperBob2 wrote:
| No, only the spec is important. How the software
| implements the spec is not important in the least. (To
| the extent that's not true, fix the spec!)
|
| Nor is whether the implementation is the same from one
| build to the next.
| CamperBob2 wrote:
| _With LLMs, we can give the exact same instructions, and
| not be guaranteed the same code._
|
| That's something we'll have to give up and get over.
|
| See also: understanding how the underlying code actually
| works. You don't need to know assembly to use a high-level
| programming language (although it certainly doesn't hurt),
| and you won't need to know a high-level programming
| language to write the functional specs in English that the
| code generator model uses.
|
| I say bring it on. 50+ years was long enough to keep doing
| things the same way.
| omgwtfbyobbq wrote:
| Aren't some models deterministic with temperature set to 0?
| zurn wrote:
| > > Machine -> Asm -> C -> Python -> LLM (Human language)
|
| > Something that you neglected to mention is, with every
| abstraction layer up to Python, everything is predictable
| and repeatable.
|
| As long as you consider C and dragons flying out of your
| nose predictable.
|
| (Insert similar quip about hardware)
| jsjohnst wrote:
| > With LLMs, we can give the exact same instructions, and
| not be guaranteed the same code.
|
| Set temperature appropriately, that problem is then solved,
| no?
| sdesol wrote:
| No, it is much more involved, and not all providers allow
| the necessary tweaks. This means you will need to use
| local models (with hardware caveats), which will require
| us to ask:
|
| - Are local models good enough?
|
| - What are we giving up for deterministic behaviour?
|
| For example, will it be much more difficult to write
| prompts? Will the output be nonsensical? And more.
| compumetrika wrote:
| LLMs use pseudo-random numbers. You can set the seed and
| get exactly the same output with the same model and input.
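|
| A minimal sketch with the Hugging Face transformers API (the
| model name is just an example; any local causal LM behaves
| the same way):
|
|     import torch
|     from transformers import AutoModelForCausalLM, AutoTokenizer
|
|     torch.manual_seed(0)  # fixes the PRNG used when sampling
|
|     name = "Qwen/Qwen2.5-Coder-1.5B"
|     tok = AutoTokenizer.from_pretrained(name)
|     model = AutoModelForCausalLM.from_pretrained(name)
|
|     inputs = tok("def find_max(arr):", return_tensors="pt")
|     # do_sample=False means greedy decoding: the argmax token at
|     # each step, so the output doesn't depend on the seed at all.
|     out = model.generate(**inputs, max_new_tokens=32, do_sample=False)
|     print(tok.decode(out[0]))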
| blibble wrote:
| you won't because floating point arithmetic isn't
| associative
|
| and the GPU scheduler isn't deterministic
| threeducks wrote:
| You can set PyTorch to deterministic mode with a small
| performance penalty:
| https://pytorch.org/docs/stable/notes/randomness.html#avoidi...
|
| Unfortunately, this is only deterministic on the same
| hardware, but there is no reason why one couldn't write
| reasonably efficient deterministic LLM kernels. It just has
| not been a priority.
|
| Nevertheless, I still agree with the main point that it
| is difficult to get LLMs to produce the same output
| reliably. A small change in the context might trigger all
| kinds of changes in the generated code.
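|
| The switch itself is tiny (on CUDA, some ops additionally
| need the CUBLAS_WORKSPACE_CONFIG=:4096:8 environment
| variable, per the notes linked above):
|
|     import torch
|     # Raise an error instead of silently picking a
|     # nondeterministic kernel:
|     torch.use_deterministic_algorithms(True)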
| zajio1am wrote:
| There is no reason to assume that say C compiler generates
| the same machine code for the same source code. AFAIK, a C
| compiler that chooses randomly between multiple
| C-semantically equivalent sequences of instructions is a
| valid C compiler.
| SkyBelow wrote:
| Even compiling code isn't deterministic, given that different
| compilers and different items installed on a machine can
| influence the final resulting code, right? Ideally they
| shouldn't have any noticeable impact, but in edge cases they
| might, which is why you compile your code once during a build
| step and then deploy the same compiled code to different
| environments instead of compiling it per environment.
| vages wrote:
| It may be a "level of abstraction", but not a good one,
| because it is imprecise.
|
| When you want to make changes to the code (which is what we
| spend most of our time on), you'll have to either (1) modify
| the prompt and accept the risk of using the new code or (2)
| modify the original code, which you can't do unless you know
| the lower level of abstraction.
|
| Recommended reading:
| https://ian-cooper.writeas.com/is-ai-a-silver-bullet
| svachalek wrote:
| Keep in mind that this is the stupidest the LLM will ever be
| and we can expect major improvements every few months. On the
| other hand junior devs will always be junior devs. At some
| point Python and C++ will be like assembly is now: something
| that's always out there but not something the vast majority of
| developers will ever need to read or write.
| sdesol wrote:
| > we can expect major improvements every few months.
|
| I'm not sure this is grounded in reality. We've already seen
| articles related to how OpenAI is behind schedule with GPT-5.
| I do believe things will improve over time, mainly due to
| advancements in hardware. With better hardware, we can better
| brute force correct answers.
|
| > junior devs will always be junior devs
|
| Junior developers turn into senior developers over time.
| smcnally wrote:
| > I'm not sure this is grounded in reality. We've already
| seen articles related to how OpenAI is behind schedule with
| GPT-5.
|
| Progress by Google, Meta, Microsoft, Qwen and DeepSeek is
| unhampered by OpenAI's schedule. Their latest -- including
| Gemini 2.0, Llama 3.3, Phi 4 -- and the coding fine-tunes
| that follow are all pretty good.
| sdesol wrote:
| > unhampered by OpenAI's schedule
|
| Sure, but if those advancements are just catching up to
| OpenAI, then major improvements by other vendors are nice
| and all, but I don't believe that was what the commenter was
| implying. Right now the leaders, in my opinion, are OpenAI
| and Anthropic, and unless they are making major improvements
| every few months, the industry as a whole is not making
| major improvements.
| smcnally wrote:
| OpenAI and Anthropic are definitely among the leaders.
| Playing catch-up to these leaders' mind-share and
| technology is some of the motivation for others. Calling
| the progress being made in the space by Google (Gemini),
| MSFT (Phi), Meta (llama), Alibaba (Qwen) "nice and all"
| is a position you might be pleasantly surprised to
| reconsider if this technology interests you. And don't
| sleep on Apple and AMZ -
|
| In the space covered by Tabby, Copilot, aider, Continue
| and others, capabilities continue to improve considerably
| month-over-month.
|
| In the segments of the industry I care most about, I
| agree 100% with what the commenter said w/r/t expecting
| major improvements every few months. Pay even passing
| attention to huggingface and github and see work being
| done by indies as well as corporate behemoths happening
| at breakneck pace. Some work is pushing the SOTA. Some is
| making the SOTA more widely available. Lots of it is
| different approaches to solving similar challenges. Most
| of it benefits consumers and creators looking to use and
| learn from all of this.
| llamaLord wrote:
| My experience observing commercial LLMs since the release
| GPT-4 is actually the opposite of this.
|
| Sure, they've gotten much cheaper on a per-token basis, but
| that cost reduction has come with a non-trivial
| accuracy/reliability cost.
|
| The problem is, tokens that are 10x cheaper are still useless
| if what they say is straight up wrong.
| maeil wrote:
| > Sure, they've gotten much cheaper on a per-token basis,
| but that cost reduction has come with a non-trivial
| accuracy/reliability cost.
|
| This only holds for OpenAI.
| maeil wrote:
| > Keep in mind that this is the stupidest the LLM will ever
| be and we can expect major improvements every few months.
|
| We have seen no noticeable improvements (at usable prices)
| for 7 months, since the original Sonnet 3.5 came out.
|
| Maybe specialized hardware for LLM inference will improve so
| rapidly that o1 (full) will be quick and cheap enough a year
| from now, but it seems extremely unlikely. For the end user,
| the top models hadn't gotten cheaper for more than a year
| until the release of Deepseek v3 a few weeks ago. Even that
| is currently very slow at non-Deepseek providers, and who
| knows just how subsidized the pricing and speed at Deepseek
| itself is, given political interests.
| Eliezer wrote:
| No major AI advancements for 7 months? Guess everyone's
| jobs are safe for another year, and after that we're all
| dead?
| n144q wrote:
| GitHub Copilot came out in 2021.
| harvodex wrote:
| I wish this were true, as, being a shitty programmer who is
| old, I would benefit from this as much as anyone here, but I
| think it is delusional.
|
| From my experience I wouldn't even say LLMs are stupid. The
| LLM is a carrier and the intelligence is in the training
| data. Unfortunately, the training data is not going to get
| smarter.
|
| If any of this had anything to do with reality, then we should
| already have an awesome programming-specific model trained
| only on CS and math textbooks. Of course, that doesn't work,
| because the LLM is not abstracting the concepts in the way we
| normally think of when we call something stupid or intelligent.
|
| It's hardly shocking that next-token prediction on math and
| CS textbooks is of limited use. You hardly have to think
| about it to see how flawed the whole idea is.
| runeblaze wrote:
| I mean you can treat it as just a general pseudocode-ish
| implementation of an O(n) find_max algorithm. Tons of people
| use Python to illustrate algorithms.
|
| (Not to hide your point though -- people please review your
| LLM-generated code!)
| tippytippytango wrote:
| This is self-correcting. Code of this quality won't let you
| ship things. You are forced to understand the last 20%-30% of
| details the LLM can't help you with to pass all your tests.
| But, it also turns out, to understand the 20% of details the
| LLM couldn't handle, you need to understand the 80% the LLM
| _could_ handle.
|
| I'm just not worried about this, LLMs don't ship.
| tyingq wrote:
| In the case where it writes functionally "good enough" code
| that performs terribly, it rewards the LLM vendor...since the
| LLM vendor is also often your IaC vendor. And now you need to
| buy more infra.
| grahamj wrote:
| I sense a new position coming up: slop cleanup engineer
| cootsnuck wrote:
| So an engineer.
| HPsquared wrote:
| That's one hell of a synergy. Win-win-lose
| grahamj wrote:
| This needs to be shouted from the rooftops. If you _could_ do
| it yourself then LLMs can be a great help, speeding things
| up, offering suggestions and alternatives etc.
|
| But if you're asking for something you don't know how to do
| you might end up with junk and not even know it.
| cootsnuck wrote:
| But if that junk doesn't work (which it likely won't for
| any worthwhile problem) then you have to get it working.
| And to get it working you almost always have to figure out
| how the junk code works. And that process, I've found, is
| where the real magic happens. You learn by fixing, pruning,
| optimizing.
|
| I think there's a whole meta level of the actual dynamic
| between human<>LLM interactions that is not being
| sufficiently talked about. I think there are, _potentially_,
| many secondary benefits that can come from using them
| simply due to the ways you have to react to their outputs
| (if a person decides to rise to that occasion).
| powersnail wrote:
| If the junk doesn't work right from the beginning, yes.
| The problem is that sometimes the junk might look like it
| works at first, and then later you find out that it
| doesn't, and you end up having to make urgent fixes on
| a Friday night.
|
| > And in that process I've found is where the real magic
| happens
|
| It might be a good way to learn if there's someone
| supervising the process, someone who _knows_ that the code
| is incorrect and tells you to figure out what's wrong and
| how to fix it.
|
| If you are shipping this stuff yourself, this sounds like
| a way of deploying giant foot-guns into production.
|
| I still think it's better to learn by trying to
| understand the code from the beginning (in the same way
| that a person should try to understand code they read
| from tutorials and stackoverflow), rather than delaying
| the learning until something doesn't work. This is like
| trying to make yourself do reinforcement learning on the
| outputs of an LLM, which sounds really inefficient to me.
| shriek wrote:
| Wait till they come with auto review/merge agents, or maybe
| there already are. _gulp_
| 999900000999 wrote:
| LLMs also love to double down on solutions that don't work.
|
| Case in point, I'm working on a game that's essentially a
| website right now. Since I'm very very bad with web design I'm
| using an LLM.
|
| It's perfect 75% of the time. The other 25% it just doesn't
| work. Multiple LLMs will misunderstand basic tasks, adding
| made-up properties and inventing functions.
|
| It's like you hired a college junior who insists they're
| never wrong and keeps pushing non-functional code.
|
| The entire mindset is: whatever, it's close enough, good luck.
|
| God forbid you need to do anything using an uncommon node
| module or anything like that.
| smcnally wrote:
| > LLMs also love to double down on solutions that don't work.
|
| "Often wrong but never in doubt" is not proprietary to LLMs.
| It's off-putting and we want them to be correct and to have
| humility when they're wrong. But we should remember LLMs are
| trained on work created by people, and many of those people
| have built successful careers being exceedingly confident in
| solutions that don't work.
| deltaburnt wrote:
| So now you have an overconfident human using an
| overconfident tool, both of which will end up coding
| themselves into a corner? Compilers at least, for the most
| part, offer very definitive feedback that act as guard
| rails to those overconfident humans.
|
| Also, let's not forget LLMs are a product of the internet
| and anonymity. Human interaction on the internet is
| significantly different from in person interaction, where
| typically people are more humble and less overconfident. If
| someone at my office acted like some overconfident
| SO/reddit/HN users I would probably avoid them like the
| plague.
| smcnally wrote:
| A compiler in the mix is very helpful. That and other
| sanity checks wielded by a skilled engineer doing code
| reviews can provide valuable feedback to other developers
| and to LLMs. The knowledgeable human in the loop makes
| the coding process and final products so much better. Two
| LLMs with tool usage capabilities reviewing the code
| isn't as good today but is available today.
|
| The LLMs overconfidence is based on it spitting out the
| most-probable tokens based on its training data and your
| prompt. When LLMs learn real hubris from actual anonymous
| internet jackholes, we will have made significant
| progress toward AGI.
| 999900000999 wrote:
| The issue is LLMs never say:
|
| "I don't know how to do this".
|
| When it comes to programming, tell me you don't know so I
| can do something else. I ended up just refactoring my UX to
| work around it. In this case it's a personal prototype, so
| it's not a big deal.
| smcnally wrote:
| That is definitely an issue with many LLMs. I've had
| limited success including instructions like "Don't invent
| facts" in the system prompt and more success saying "that
| was not correct. Please answer again and check to ensure
| your code works before giving it to me" within the
| context of chats. More success still comes from
| requesting second opinions from a different model -- e.g.
| asking Claude's opinion of Qwen's solution.
|
| To the other point, not admitting to gaps in knowledge or
| experience is also something that people do all the time.
| "I copied & pasted that from the top answer in Stack
| Overflow so it must be correct!" is a direct analog.
| ripped_britches wrote:
| The most underrated thing I do on nearly every cursor
| suggestion is to follow up with "are there any better ways to
| do this?".
| smcnally wrote:
| A deeper version of the same idea is to ask a second model to
| check the first model's answers. aider's "architect" is an
| automated version of this approach.
|
| https://aider.chat/docs/usage/modes.html#architect-mode-
| and-...
| avandekleut wrote:
| I always ask it to "analyze approaches to achieve X and then
| make a suggestion, no code" in the chat. Then a refinement
| step where I give feedback on the generated code. I also
| always try to give it an "out" between making changes and
| keeping things the same, to stave off the bias toward action.
| cootsnuck wrote:
| Yea, the "analyze and explain but no code yet" approach
| works well. Let's me audit its approach beforehand.
| csomar wrote:
| > I am terrified of what is to come.
|
| Don't worry. Like everything else in life, you get what you pay
| for.
| 55555 wrote:
| I used to know things. Then they made Google, and I just looked
| things up. But at least I could still do things. Now we have
| AI, and I just ask it to do things for me. Now I don't know
| anything and I can't do anything.
| deltaburnt wrote:
| I feel like I've seen this comment so many times, but this
| one is actually genuine. The cult-like dedication is kind of
| baffling.
| nyarlathotep_ wrote:
| Programmers (and adjacent positions) of late strike me as
| remarkably shortsighted and myopic.
|
| Cheering for remote work leading to loads of new positions
| being offered overseas opposed to domestically, and now
| loudly celebrating LLMs writing "boilerplate" for them.
|
| How folks don't see the consequences of their actions is
| remarkable to me.
| worble wrote:
| > The suggested code works but is absolute junior level
|
| This isn't far from the current status quo. Good software companies
| pay for people who write top quality code, and the rest pay
| juniors to work far above their pay grade or offshore it to the
| cheapest bidder. Now it will be offloaded to LLM's instead.
| Same code, different writer, same work for a contractor who
| knows what they're doing to come and fix later.
|
| And so the cycle continues.
| dizhn wrote:
| Anybody care to comment whether the quality of the existing
| code influences how good the AI's assistance is? In other
| words, would they suggest sloppy code where the existing code
| is sloppy and better (?) code when the existing code is good?
| cootsnuck wrote:
| What do you think? (I don't mean that in a snarky way.) Based
| on how LLMs work, I can't see how that would not be the case.
|
| But in my experience there are nuances to this. It's less
| about "good" vs "bad"/"sloppy" code and more about
| discernable. If it's discernably sloppy (i.e. the type of
| sloppy a beginning programmer might do which is familiar to
| all of us) I would say that's better than opaque "good" code
| (good really only meaning functional).
|
| These things predict tokens. So when you use them, help them
| increase their chances of predicting the thing you want. Good
| comments on code, good function names, explain what you don't
| know, etc. etc. The same things you would ideally do if
| working with another person on a codebase.
| sirsinsalot wrote:
| Reminds me of the 2000s outsourcing hype. I made a lot of money
| cleaning up that mess. Entire projects late, buggy, unreadable
| and unmaintainable.
|
| Businesses pay big when they need to recover from that kind of
| thing and save face to investors.
| generalizations wrote:
| > people who blindly "autocomplete" this code are going to
| stall in their skill level progress
|
| AI is just going to widen the skill level bell curve. Enables
| some people to get away with far more mediocre work than
| before, but also enables some people to become far more
| capable. You can't make someone put in more effort, but the
| ones who do will really shine.
| shihab wrote:
| I think that example says more about the company that chose to
| put that code as a demo on their _homepage_.
| mindcrime wrote:
| Very cool. I'm especially happy to see that there is an Eclipse
| client[1]. One note though: I had to dig around a bit to find the
| info about the Eclipse client. It's not mentioned in the main
| readme, or in the list of IDE extensions in the docs. Not sure if
| that's an oversight or because it's not "ready for prime time"
| yet or what.
|
| [1]:
| https://github.com/TabbyML/tabby/tree/3bd73a8c59a1c21312e812...
| mlepath wrote:
| Awesome project! I love the idea of not sending my data to a big
| company and having to trust their TOS.
|
| The effectiveness of a coding assistant is directly proportional to
| context length and the open models you can run on your computer
| are usually much smaller. Would love to see something more
| quantified around the usefulness on more complex codebases.
| fullstackwife wrote:
| I hope for proliferation of 100% local coding assistants, but
| for now the recommendation of "Works best on $10K+ GPU" is a
| show stopper, and we are forced to use the "big company". :(
| danw1979 wrote:
| It's not really that bad. You can run some fairly big models
| on an Apple Silicon machine costing £2k (M4 Pro Mac Mini
| with 64GB RAM).
| trevor-e wrote:
| fyi the pricing page has a typo for "Singel Sign-On"
| wsxiaoys wrote:
| Appreciated! Fixed
| nbzso wrote:
| I will go out on a limb and predict that in the next 10 years
| AI code assistants will be forbidden :)
| qwertox wrote:
| > Toggle IDE / Extensions telemetry
|
| Cannot be turned off in the Community Edition. What does this
| telemetry data contain?
| andypants wrote:
| struct HealthState {
|     model: String,
|     chat_model: Option<String>,
|     device: String,
|     arch: String,
|     cpu_info: String,
|     cpu_count: usize,
|     cuda_devices: Vec<String>,
|     version: Version,
|     webserver: Option<bool>,
| }
|
| https://tabby.tabbyml.com/docs/administration/usage-collecti...
| nikkwong wrote:
| Maybe a good product but terrible company to interview with. I
| went through several rounds and was basically ghosted after the
| 4th with no explanation or follow up. The last interview was to
| write a blog post for their blog, which I submitted; I then
| didn't hear back until continuously nagging them months later.
| It was pretty disheartening since all of the interviews were
| some form of take-home and I spent a combined total of ~10
| hours or more.
| NetOpWibby wrote:
| Were you at least paid?
| swyx wrote:
| you know that paid interview processes are not the norm, "at
| least" is unlikely
| nikkwong wrote:
| If I was paid, I probably wouldn't be complaining publicly.
| :-) It's probably better for both interests if these types
| of engagements are paid.
| fhd2 wrote:
| I've worked with paid take home tests for a while, but
| stopped again. Hiring managers started to make the
| assignments more convoluted, i.e. stopped respecting the
| candidate's time. Candidates, on the flip side, always
| said they don't want to bother with the bureaucracy of
| writing an invoice and reporting it for their taxes etc.,
| so didn't want to be paid.
|
| Now my logic is: If a take home test is designed to take
| more than two hours, we need to redesign it. Two hours of
| interviews, two hours of take home test, that ought to
| suffice.
|
| If we're still unsure after that, I sometimes offered the
| candidate a time limited freelance position, paid
| obviously. We've ended up hiring everyone who went into
| that process though.
| avandekleut wrote:
| I just finished interviewing with a company called
| Infisical. The take-homes were crazy (the kind of thing
| that normally takes a few days or a week). I was paid but
| it took me 12 hours.
| csomar wrote:
| > The last interview was to write an blog post for their blog
|
| Were you applying as a software dev? Because that's not a
| software (or an interview) assignment.
| nikkwong wrote:
| Yes I was applying for software engineer. I think they wanted
| engineers who were good at explaining the product to users.
| csomar wrote:
| Sure. Writing and a good command of the language is
| important. There are multiple ways to showcase that.
| Writing a blog post for _their_ blog is not one of them.
| nikkwong wrote:
| I was willing to jump through hoops--I really wanted the
| job.
| 55555 wrote:
| Did the blog post get published on their blog?
| j45 wrote:
| Hope they paid for the work.
| aitchnyu wrote:
| Did their engineers spend time with you or did they get their
| blog post otherwise? I once made 1-minute videos for the
| interview process of an AI training data company. I have a
| hunch they
| were just harvesting the data.
| nikkwong wrote:
| They did get the blog post but I don't believe they used it;
| it's possible that they didn't think it was well written and
| that's why I was ghosted but I will never know. I know they
| were interviewing many very talented people for the position.
| It's okay to be disorganized as a startup, but I think that
| keeping people happy, employee or otherwise, should always be
| the top priority. It would have taken just a few seconds to
| write an email to me to reject me, and by not doing so, this
| comment has probably evolved into a big nightmare for them. I
| didn't expect it to get this much attention, but yeah; I
| guess my general sentiment is shared by many.
| lgrapenthin wrote:
| Such interview processes are big red flags. The company can't
| afford to take a risk on you, and at the same time tests how
| desperate you are by making you work for free. They are likely
| short on cash and short on experience. Expect crunch and bad
| management. Run.
| redwood wrote:
| Did they post the blog publicly?
| jejeyyy77 wrote:
| your first mistake was doing any kind of take-home exercise at
| all.
| jph wrote:
| IMHO companies should aim for courteous interviews, with faster
| decisions, and if there's any take home work then it's fully
| paid. I've seen your work at Beaver.digital and on
| GetFractals.com. If you're still looking, feel free to contact
| me; I'm hiring for a startup doing AI/ML data analysis.
| Specifically Figma + DaisyUI + TypeScript + Python + Pandas +
| AWS + Postgres.
| chvid wrote:
| All the examples are for code that would otherwise be found in a
| library. Some of the code is of dubious quality.
|
| LLMs - a spam bot for your codebase?
| leke wrote:
| I'm currently investigating a self hosted AI solution for my
| workplace.
|
| I was wondering, how does this company make money?
|
| From the pricing there is a free/community/opensource option, but
| how is the "up to 5 users" monitored?
|
| https://www.tabbyml.com/pricing
|
| * Up to 5 users
|
| * Local deployment
|
| * Code Completion, Answer Engine, In-line chat & Context Provider
|
| What if we have more than 5 users?
| rirze wrote:
| Are you asking on a public forum, on how to get around using a
| product for a commercial setting by using the non-commercial
| version of the product?
| leke wrote:
| I'm saying I don't understand their open source model. I
| thought open source meant you could use and modify code and
| run it yourself without having to pay a license, i.e.
| completely independent of the maintainer. So I was confused
| by this limit of how many were allowed to use something you
| are running yourself.
| SirMaster wrote:
| All these things that claim to be an alternative to GitHub
| Copilot, none of them seem to work in VS2022... So how is it
| really an alternative?
|
| All I want is a self-hosted AI assistant for VS2022. VS2022
| supports plugins, yes, so what gives?
| jimmydoe wrote:
| Not using VSCode, would be great to have Sublime Text or Zed
| support.
| larwent wrote:
| I've been using something similar called Twinny. It's a VSCode
| extension that connects to a locally hosted Ollama LLM of your
| choice and works like Copilot.
|
| It's an extra step to install Ollama, so it's not as
| plug-and-play as TFA, but the license is MIT, which makes it
| worthwhile for me.
|
| https://github.com/twinnydotdev/twinny
___________________________________________________________________
(page generated 2025-01-13 23:01 UTC)