[HN Gopher] How I code with AI on a budget/free
___________________________________________________________________
How I code with AI on a budget/free
Author : indigodaddy
Score : 543 points
Date : 2025-08-09 22:27 UTC (1 day ago)
(HTM) web link (wuu73.org)
(TXT) w3m dump (wuu73.org)
| CjHuber wrote:
| Without tricks google aistudio definitely has limits, though
| pretty high ones. gemini.google.com, on the other hand, gives
| you fewer than a handful of free 2.5 Pro messages.
| GaggiX wrote:
| OpenAI offering 2.5M free tokens daily for small models and
| 250k for big ones (tier 1-2) is so useful for random projects.
| I use them to learn Japanese, for example (with a program that
| lists information about what the characters are saying:
| vocabulary, grammar points, nuances).
| cammikebrown wrote:
| I wonder how much energy this is wasting.
| bravesoul2 wrote:
| Untradable carbon tax (or carbon price for people who hate the
| T word) is needed.
| robotsquidward wrote:
| Right - free to _you_ maybe.
| yen223 wrote:
| Probably not as much as you think:
| https://www.sustainabilitybynumbers.com/p/ai-energy-demand
|
| You are better off worrying about your car use and your home
| heating/cooling efficiency, all of which are significantly
| worse for energy use.
| kasabali wrote:
| > You'll notice that this figure is for 2022, and we've had a
| major AI boom since then
|
| I might as well read LLM gibberish instead of this article.
| sergiotapia wrote:
| who cares. we can build more. energymaxx or the us will become
| like germany.
| Havoc wrote:
| For anyone else confused - there are pages 2 and 3 in the post
| that you need to access via the arrow at the bottom.
| andai wrote:
| My experience lines up with the article. The agentic stuff only
| works with the biggest models. (Well, "works"... OpenAI Codex
| took 200 requests with o4-mini to change like 3 lines of code...)
|
| For simple changes I actually found smaller models better because
| they're so much faster. So I shifted my focus from "best model"
| to "stupidest I can get away with".
|
| I've been pushing that idea even further. If you give up on
| agentic, you can go surgical. At that point even 100x smaller
| models can handle it. Just tell it what to do and let it give you
| the diff.
|
| Also I found the "fumble around my filesystem" approach stupid
| for my scale, where I can mostly fit the whole codebase into the
| context. So I just dump src/ into the prompt. (Other people's
| projects are a lot more boilerplatey so I'm testing ultra cheap
| models like gpt-oss-20b for code search. For that, I think you
| can go even cheaper...)
|
| Patent pending.
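| To make that concrete - a minimal, hypothetical sketch of the
| "just dump src/ into the prompt" approach (the skip list and
| extensions below are illustrative guesses, not from the
| comment):

```python
import os

SKIP_DIRS = {"node_modules", ".venv", ".git", "__pycache__"}
CODE_EXTS = {".py", ".js", ".ts", ".go", ".rs", ".c", ".h"}

def dump_tree(root: str) -> str:
    """Concatenate every code file under root into one prompt-ready string."""
    parts = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune junk directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in sorted(filenames):
            if os.path.splitext(name)[1] in CODE_EXTS:
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as f:
                    parts.append(f"=== {path} ===\n{f.read()}")
    return "\n\n".join(parts)

# Prepend the instruction, then paste the whole thing into a web chat:
# prompt = "Reply with a unified diff for this change:\n\n" + dump_tree("src")
```

| Works fine at small scale; past a few hundred KB of source you
| start fighting the context window again.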
| statenjason wrote:
| Aider as a non-agentic coding tool strikes a nice balance on
| the efficiency vs effectiveness front. Using tree-sitter to
| create a repo map of the repository means less filesystem
| digging. No MCP, but shell commands mean it can use utilities I
| myself am familiar with. Combined with Cerebras as a provider,
| the turnaround on prompts is instant; I can stay involved
| rather than waiting on multiple rounds of tool calls. It's my
| go-to for smaller scale projects.
| mathiaspoint wrote:
| It's a shame MCP didn't end up using a sandboxed shell (or
| something similar, maybe even simpler.) All the pre-MCP
| agents I built just talked to the shell directly since the
| models are already trained to do that.
| stillsut wrote:
| Just added a fork of aider that _does_ do agentic commands:
| https://github.com/sutt/agent-aider
|
| In testing I've found it to be underwhelming at being an
| agent compared to claude code, wrote up some case-studies on
| it here: https://github.com/sutt/agro/blob/master/docs/case-
| studies/a...
| SV_BubbleTime wrote:
| > (Well, "works"... OpenAI Codex took 200 requests with o4-mini
| to change like 3 lines of code...)
|
| Let's keep things in perspective: I have, multiple times in my
| life, spent days on what ended up being maybe three lines of
| code.
| wahnfrieden wrote:
| They don't allow model switching below GPT-5 in codex cli
| anymore (without API key), because it's not recommended. Try it
| with thinking=high and it's quite an improvement from o4-mini.
| o4-mini is more like gpt-5-thinking-mini but they don't allow
| that for codex. gpt-5-thinking-high is more like o1 or maybe
| o3-pro.
| hpincket wrote:
| I am developing the same opinion. I want something fast and
| dependable. Getting into a flow state is important to me, and I
| just can't do that when I'm waiting for an agentic coding
| assistant to terminate.
|
| I'm also interested in smaller models for their speed. That, or
| a provider like Cerebras.
|
| Then, if you narrow the problem domain you can increase the
| dependability. I am curious to hear more about your "surgical"
| tools.
|
| I rambled about this on my blog about a week ago:
| https://hpincket.com/what-would-the-vim-of-llm-tooling-look-...
| radio879 wrote:
| well, most of the time, I just dump the entire codebase in if
| the context window is big and its a good model. But there are
| plenty of times when I need to block one folder in a repo or
| disable a few files because the files might "nudge" it in a
| wrong direction.
|
| The surgical context tool (aicodeprep-gui) - there are at
| least 30 similar tools, but most (if not all) are CLI only/no
| UI. I like UIs; I work faster with them for things like
| choosing individual files out of a big tree. At least it is
| using the PySide6 library, which is "lite" (could go lighter
| maybe); I HATE that too many things use webviews/browsers. All
| the options on it are there for good reasons; it's all focused
| on things that annoy me and slow things down, like doing
| something repeatedly (copy paste, copy paste, or typing the
| same sentence over and over every time I have to do a certain
| thing with the AI and my code).
|
| If you have not run 'aicp' (the command I gave it, though
| there is also an OS installer menu that will add a right-click
| context menu to the Windows/Mac/Linux file managers) in a
| folder before, it will try to scan recursively to find code
| files, but it skips things like node_modules or .venv.
| Otherwise it assumes most types of code files will probably be
| added, so it checks them. You can fine-tune it, add some .md
| or txt files or stuff in there that isn't code but might be
| helpful. When you generate the context block it puts the text
| inside the prompt box on the top AND/OR bottom - doing both
| can get better responses from the AI.
|
| It saves every file that is checked, and saves the window size
| and other window prefs, so you don't have to resize the window
| again. It saves the state of which files are checked, so it's
| less work/time next time. I have been just pasting the output
| from the LLMs into an agent like Cline, but I am wondering if
| I should add browser automation / a browser extension that
| does the copy pasting, plus an option to edit/change files
| right after grabbing the output from a web chat. It's probably
| about good enough as it is though; not sure I want to make it
| into a big thing.
|
| --- Yeah, I just keep coming back to this workflow; it's very
| reliable. I have not tried Claude Code yet, but I will soon to
| see if they solved any of these problems.
|
| Strange this thing has been at the top of hacker news for
| hours and hours.. weird! My server logs are just constant
| scrolling
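| The per-project state saving described above could be sketched
| like this (not the actual aicodeprep-gui code - just a guess at
| the idea, and the state file name is made up):

```python
import json, os

STATE_FILE = ".aicp_state.json"  # hypothetical name, one per project folder

def load_state(project_dir: str) -> dict:
    """Restore the checked files and window prefs saved on the last run."""
    path = os.path.join(project_dir, STATE_FILE)
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    # First run in this folder: sensible defaults.
    return {"checked_files": [], "window": {"w": 900, "h": 600}}

def save_state(project_dir: str, checked_files: list, window: dict) -> None:
    """Persist the current selection so the next launch opens in the same state."""
    with open(os.path.join(project_dir, STATE_FILE), "w", encoding="utf-8") as f:
        json.dump({"checked_files": checked_files, "window": window}, f, indent=2)
```

| Keeping the state file inside the project folder is what makes
| the saved selection per-project rather than global.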
| dist-epoch wrote:
| Thanks for the article. I'm also doing a similar thing,
| here are my tips:
|
| - https://chutes.ai - 200 requests per day if you deposit
| (one-time) $5 for top open weights models - GLM, Qwen, ...
|
| - https://github.com/marketplace/models/ - around 10
| requests per day to o3, ... if you have the $10 GitHub
| Copilot subscription
|
| - https://ferdium.org - I open all the LLM webapps here as
| separate "apps", my one place to go to talk with LLMs,
| without mixing it with regular browsing
|
| - https://www.cherry-ai.com - chat API frontend, you can
| use it instead of the default webpages for services which
| give you free API access - Google, OpenRouter, Chutes,
| Github Models, Pollinations, ...
|
| I really recommend trying a chat API frontend, it really
| simplifies talking with multiple models from various
| providers in a unified way and managing those
| conversations, exporting to markdown, ...
| hpincket wrote:
| aicodeprep-gui looks great. I will try it out
| indigodaddy wrote:
| Have you seen this?
| https://github.com/robertpiosik/CodeWebChat
| chewz wrote:
| I agree. I find even Haiku good enough at managing the flow of
| the conversation and consulting larger models - Gemini 2.5 Pro
| or GPT-5 - for programming tasks.
|
| For the last few days I have been experimenting with using
| Codex (via MCP ${codex mcp}) from Gemini CLI, and it works
| like a charm. Gemini CLI mostly uses Flash underneath, but
| that is good enough for formulating problems and
| re-evaluating answers.
|
| Same with Claude Code - I am asking (via MCP) for consulting
| with Gemini 2.5 Pro.
|
| Never had much success with using Claude Code as MCP though.
|
| The original idea comes of course from Aider - using main, weak
| and editor models all at once.
| seunosewa wrote:
| You should try GLM 4.5; it's better in practice than Kimi K2
| and Qwen3 Coder, but it's not getting much hype.
| mathiaspoint wrote:
| I use a 500 million parameter model for editor completions
| because I want those to be nearly instantaneous, and the
| plugin makes 50+ completion requests every session.
| myflash13 wrote:
| Which model and which plugin, please?
| ghxst wrote:
| What editor do you use, and how did you set it up? I've been
| thinking about trying this with some local models and also
| with super low-latency ones like Gemini 2.5 Flash Lite. Would
| love to read more about this.
| mathiaspoint wrote:
| Neovim with the llama.cpp plugin and heavily quantized
| qwen2.5-coder with 500 (600?) million parameters. It's
| almost plug and play although the default ring context
| limit is _way_ too large if you don't have a GPU.
| badlogic wrote:
| Can you share which model you are using?
| reactordev wrote:
| To the OP: I highly recommend you look into Continue.dev and
| ollama/lmstudio and running models on your own. Some of them are
| really good at autocomplete-style suggestions while others (like
| gpt-oss) can reason and use tools.
|
| It's my go-to copilot.
| navbaker wrote:
| Same! I've been using Continue in VSCode and found most of the
| bigger Qwen models plus gpt-oss-120b to be great in agentic
| mode!
| indigodaddy wrote:
| Do you use openrouter models with continue?
| AstroBen wrote:
| I've found Zed to be a step up from continue.dev - you can use
| your own models there also
| indigodaddy wrote:
| Can you use your GH Copilot subscription with Zed to leverage
| the Copilot subscription-provided models?
| nechuchelo wrote:
| Yes, you can. IIRC both for the assistant/agent and code
| completions.
| reactordev wrote:
| Zed is supreme but I have a need that Zed can't scratch so
| I'm in VSCode :(
| radio879 wrote:
| Really - no monthly subscriptions? I hate those, but I am fine
| with bringing my own API URLs etc. and paying. I'm building a
| router that will track all the free tokens from all the
| different providers and auto-rotate them when daily token or
| time limits run out.
|
| Continue and Zed.. gonna check them out; prompts in Cline are
| too long. I was thinking of just making my own VS Code
| extension, but I need to try Claude Code with GLM 4.5 (heard
| it pairs nicely)
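| A free-tier-rotating router like that could start out as
| simple as this sketch (the provider names and daily limits are
| illustrative placeholders, not real quotas):

```python
import time

class FreeTierRouter:
    """Rotate across providers, skipping any that hit their daily cap."""

    def __init__(self, providers):
        # providers: {name: daily_request_limit}
        self.limits = dict(providers)
        self.used = {name: 0 for name in providers}
        self.day = time.strftime("%Y-%m-%d")

    def _roll_day(self):
        # Daily quotas reset when the calendar date changes.
        today = time.strftime("%Y-%m-%d")
        if today != self.day:
            self.day = today
            self.used = {name: 0 for name in self.used}

    def pick(self):
        """Return the first provider with quota left, or None if all are spent."""
        self._roll_day()
        for name, limit in self.limits.items():
            if self.used[name] < limit:
                self.used[name] += 1
                return name
        return None

router = FreeTierRouter({"gemini-2.5-pro": 100, "chutes-glm-4.5": 200})
```

| A real version would also have to handle per-minute limits and
| providers that return 429s early, but the counting core is this
| small.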
| chromaton wrote:
| If you're looking for free API access, Google offers access to
| Gemini for free, including for gemini-2.5-pro with thinking
| turned on. The limit is... quite high, as I'm running some
| benchmarking and haven't hit the limit yet.
|
| Open weight models like DeepSeek R1 and GPT-OSS are also made
| available with free API access from various inference providers
| and hardware manufacturers.
| gooosle wrote:
| Gemini 2.5 pro free limit is 100 requests per day.
|
| https://ai.google.dev/gemini-api/docs/rate-limits
| tomrod wrote:
| Doesn't it swap to a lower power model after that?
| acjacobson wrote:
| Not automatically but you can switch to a lower power model
| and access more free requests. I think Gemini 2.5 Flash is
| 250 requests per day.
| panarky wrote:
| I'm getting consistently good results with Gemini CLI and the
| free 100 requests per day and 6 million tokens per day.
|
| Note that you'll need to either authorize with a Google
| Account or with an API key from AI Studio, just be sure the
| API key is from an account where billing is disabled.
|
| Also note that there are other rate limits for tokens per
| request and tokens per minute on the free plan that
| effectively prevent you from using the whole million token
| context window.
|
| It's good to exit or /clear frequently so every request
| doesn't resubmit your entire history as context, or you'll
| use up the token limits long before you hit 100 requests in
| a day.
| chiwilliams wrote:
| I'm assuming it isn't sensitive for your purposes, but note
| that Google will train on these interactions, but not if you
| pay.
| unnouinceput wrote:
| I agree, Google is definitely the champion of respecting your
| privacy. Will definitely not train their model on your data
| if you pay them. I mean you should definitely just film
| yourself and give them everything, access to your files,
| phone records, even bank accounts. Just make sure to pay them
| those measly $200 and absolutely they will not share that
| data with anybody.
| lern_too_spel wrote:
| You're thinking of Facebook. A lot of companies run on
| Gmail and Google Docs (easy to verify with `dig MX
| [bigco].com`), and they would not if Google shared that
| data with anybody.
| d1sxeyes wrote:
| It's not really in either Meta or Google's interests to
| _share_ that data. What they _do_ is to build super
| detailed profiles of you and what you're likely to click
| on, so they can charge more money for ad impressions.
| lern_too_spel wrote:
| Meta certainly shares the data internally.
| https://www.techradar.com/computing/cyber-
| security/facebooks...
| wat10000 wrote:
| Big companies can negotiate their own terms and enforce
| them with meaningful legal action.
| devjab wrote:
| I think it'll be hard to find an LLM that actually respects
| your privacy, regardless of whether or not you pay. Even with
| the "privacy" enterprise Copilot from Microsoft, with all
| their promises of respecting your data, it's still not deemed
| safe enough by legislation to be used in parts of the European
| energy sector. The way we view LLMs on any subscription is
| similar to how I imagine companies in the USA view DeepSeek:
| don't put anything into them you can't afford to share with
| the world. Of course with the agents, you've probably given
| them access to everything on your disk.
|
| Though to be fair, it's kind of silly how much effort we go
| through to protect our mostly open source software from AI
| agents while, at the same time, half our OT has built-in
| hardware backdoors.
| bongodongobob wrote:
| I don't care. From what I understand of LLM training, there's
| basically 0 chance a key or password I might send it will
| ever be regurgitated. Do you have any examples of an LLM
| actually doing anything like this?
| radio879 wrote:
| I am the person that wrote that. Sorry about the font. This is
| a bit outdated; AI stuff moves at high speed. More models are
| out, so I will try to update it.
|
| Every month so many new models come out. My new fav is GLM-4.5...
| Kimi K2 is also good, and Qwen3-Coder 480b, or 2507 instruct..
| very good as well. All of those work really well in any agentic
| environment/in agent tools.
|
| I made a context helper app ( https://wuu73.org/aicp ), linked
| to from there, which helps me jump back and forth between my
| IDE and all the different AI chat tabs I have open (which are
| almost always totally free, and I get the best output from
| those). The app tries to remove all the friction and
| annoyances of working with the native web chat interfaces for
| all the AIs. It's free and has been getting great feedback;
| criticism welcome.
|
| It smooths the going from IDE <----> web chat tabs. Made it
| for myself to save time, and I prefer the UI (PySide6, so much
| lighter than a webview)
|
| It's got preset buttons to add text that you find yourself
| typing very often, and per-project saves of the window size
| and which files were used for context. So next time, it opens
| in the same state.
|
| Auto scans for code files, guesses likely ones needed, prompt box
| that can put the text above and below the code context (seems to
| help make the output better). One of my buttons is set to: "Write
| a prompt for Cline, the AI coding agent, enclose the whole prompt
| in a single code tag for easy copy and pasting. Break the tasks
| into some smaller tasks with enough detail and explanations to
| guide Cline. Use search and replace blocks with plain language to
| help it find where to edit"
|
| What I do for problem solving and figuring out bugs: I'm
| usually in VS Code and I type aicp in the terminal to open the
| app. I fine-tune any files already checked, type what I am
| trying to do or what problem I have to fix, click the Cline
| button, and click Generate Context!. I paste into GLM-4.5,
| sometimes o3 or o4-mini, GPT-5, Gemini 2.5 Pro.. if it's a
| super hard thing I'll try 2 or 3 models. I'll look and see
| which one makes the most sense and just copy and paste into
| Cline in VS Code - set to GPT 4.1, which is unlimited/free..
| 4.1 isn't super crazy smart or anything, but it follows
| orders... it will do whatever you ask, reliably. AND it will
| correct minor mistakes from the bigger model's output. The
| bigger, smarter models can figure out the details, and they'll
| write a prompt that is a task list with how-to's and why's,
| perfect for 4.1 to go and do in agent mode....
|
| You can code for free this way, unlimited, and it's the
| smartest the models will be. Anytime you throw tools or MCPs
| at a model it dumbs it down.... AND you waste money on all the
| API costs of having to use Claude 4 for everything
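| The "prompt above AND below the code context" trick from this
| workflow could be sketched as follows (a hypothetical helper,
| not the app's actual code):

```python
def build_context_block(prompt: str, files: dict, both_ends: bool = True) -> str:
    """Wrap the code context with the prompt on top and, optionally, again
    at the bottom. Repeating the instruction after a long context seems to
    help some models follow it. `files` maps path -> file contents."""
    code = "\n\n".join(f"--- {path} ---\n{text}" for path, text in files.items())
    if both_ends:
        return f"{prompt}\n\n{code}\n\n{prompt}"
    return f"{prompt}\n\n{code}"
```

| The output is one big string you can paste straight into any
| web chat.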
| indigodaddy wrote:
| Is GLM-4.5 Air usable? I see it's free on OpenRouter. Also,
| please advise what you think is the current best free
| OpenRouter model for coding. Thanks!
| radio879 wrote:
| Well, if you download Qwen Code
| https://github.com/QwenLM/qwen-code it is free for up to 2,000
| API calls a day.
|
| Not sure if GLM-4.5 Air is good, but the non-Air one is
| fabulous. I know for free API access there is the Pollinations
| AI project. Also llm7. If you just use the web chats you can
| use most of the best models for free without an API. There are
| ways to 'emulate' an API automatically.. I was thinking about
| adding this to my aicodeprep-gui app so it could automatically
| paste and then cut. Some MCP servers exist that will
| automatically paste or cut from those web chats and route it
| to an API interface.
|
| OpenAI offers free tokens for most models, 2.5M or 250k
| depending on the model. Cerebras has some free limits,
| Gemini... Meta has a plentiful free API for Llama 4 because..
| let's face it, it sucks, but it is okay/not bad for stuff like
| summarizing text.
|
| If you really wanted to code for exactly $0 you could use
| Pollinations AI in the Cline extension (for VS Code), set to
| use "openai-large" (which is GPT 4.1), while using all the
| best web chats like Kimi K2, z.ai's GLM models, Qwen 3 chat,
| Gemini in AI Studio, and the OpenAI playground with o3 or
| o4-mini. You can go forever without being charged money.
| Pollinations' 'openai-large' works fine in Cline as an agent
| to edit files for you etc.
| indigodaddy wrote:
| Very cool, a lot to chew on here. Thanks so much for the
| feedback!
| tonyhart7 wrote:
| bro you are final boss of free tier users lol
| radio879 wrote:
| damn right !!!!
| hgarg wrote:
| Qwen is totally useless for any serious dev work.
| b2m9 wrote:
| It's really hit and miss for me. Well defined small tasks
| seem ok. But every time I try some "agentic coding", it burns
| through millions of tokens without producing anything
| working.
| simonw wrote:
| Which Qwen? They have over a dozen models now.
| racecar789 wrote:
| Small recommendation: The diagrams on [https://wuu73.org/aicp]
| are helpful, but clicking them does not display the full-
| resolution images; they appear blurry. This occurs in both
| Firefox and Chrome. In the GitHub repository, the same images
| appear sharp at full resolution, so the issue may be caused by
| the JavaScript rendering library.
| PeterStuer wrote:
| Another data point: On Android Chrome they render without
| problem.
| radio879 wrote:
| Thx - I did not know that. Will try to fix.
| PeterStuer wrote:
| Very nice article and thx for the update.
|
| I would be very interested in an in-depth look at your
| experience of the differences between Roo Code and Cline, if
| you feel you can share that. I've only tried Roo Code (with
| interesting but mixed results) thus far.
| pyman wrote:
| Just use lmstudio.ai, it's what everyone is using nowadays
| pyman wrote:
| LM Studio is awesome
| simonw wrote:
| LM Studio is great, but it's a very different product from
| an AI-enabled IDE or a Claude Code style coding agent.
| teiferer wrote:
| > You can code for free this way
|
| vs
|
| > If you set your account's data settings to allow OpenAI to
| use your data for model training
|
| So, it's not "for free".
| bahmboo wrote:
| I was going to downvote you but you are adding to the
| discussion. In this context this is free from having to spend
| money. Many of us don't have the option to pay for models. We
| have to find some way to get the state of the art without
| spending our food money.
| frankzander wrote:
| Hm why pay for something when I can get it for free? Being
| miserly is a skill that can save a lot of money.
| hx8 wrote:
| I live a pretty frugal life, and reached the FI part of
| FIRE in my early 30s as an averagely compensated software
| engineer.
|
| I am very skeptical anytime something is 'free'. I
| specifically avoid using a free service when the company
| profits from my use of the service. These arrangements
| usually start mutually beneficial, and almost always
| become user hostile.
|
| Why pay for something when you can get it for free?
| Because the exchange of money for service sets clear
| boundaries and expectations.
| ta1243 wrote:
| I don't trust any AI company not to use and monetise my data,
| regardless of how much I pay or what their terms of service
| say. I know full well that large companies ignore laws with
| impunity and no accountability.
| ignore laws with impunity and no accountability.
| simonw wrote:
| I would encourage you to rethink this position just a
| little bit. Going through life not trusting any company
| isn't a fun way to live.
|
| If it helps, think about those company's own selfish
| motivations. They like money, so they like paying
| customers. If they promise those paying customers (in
| legally binding agreements, no less) that they won't
| train on their data... and are then found to have trained
| on their data anyway, they won't just lose that customer -
| they'll lose thousands of others too.
|
| Which hurts their bottom line. It's in their interest
| _not_ to break those promises.
| alpaca128 wrote:
| > they won't just lose that customer - they'll lose
| thousands of others too
|
| No, they won't. And that's the problem in your argument.
| Google landed in court for tracking users in incognito
| mode. They also were fined for not complying with the
| rules for cookie popups. Facebook lost in court for
| illegally using data for advertising. Did it lose them
| any paying customer? Maybe, but not nearly enough for
| them to even notice a difference. The larger outcome was
| that people are now more pissed at the EU for cookie
| popups that make the greed for data more transparent.
| Also in the case of Google most money comes from
| different people than the ones that have their privacy
| violated, so the incentives are not working as you
| suggest.
|
| > Going through life not trusting any company isn't a fun
| way to live
|
| Ignoring existing problems isn't a recipe for a happy
| life either.
| simonw wrote:
| Landing in court is an expensive thing that companies
| don't want to happen.
|
| Your examples also differ from what I'm talking about.
| Advertising supported business models have a different
| relationship with end users.
|
| People getting something for free are less likely to switch
| providers over a privacy concern compared with companies
| paying thousands of dollars a month (or more) for a paid
| service under the understanding that it won't train on their
| data.
| teiferer wrote:
| I appreciate your consideration, disagree != downvote.
|
| To your point, "free from having to spend money" is exactly
| it. It's paid for with other things, and I get that some
| folks don't care. But being more open about this would be
| nice. You don't typically hide a monetary cost either, and
| anybody trying to do that is rightfully called out as running
| a scam. Doing the same with non-monetary costs would be a nice
| custom.
| can16358p wrote:
| Many folks, especially if they are into getting things free,
| don't really care much about privacy narrative.
|
| So yes, it is free.
| Wilder7977 wrote:
| This is not only a privacy concern (in fact, that might be
| a tiny part since the code might end up public anyway?).
| There is an element of disclosure of personal data, there
| are ownership issues in case that code was not - in fact -
| going to be public and more.
|
| In any case, not caring about the cost (at a specific time)
| doesn't make the cost disappear.
| greggsy wrote:
| The point they are making is, that some people know that,
| and are not as concerned as others about it.
| teiferer wrote:
| Not being concerned doesn't make the statement "it's
| free" more true.
| worik wrote:
| I understand. I get the point. I disagree
|
| Privacy absolutely does not matter, until it does, and then
| it is too late
| barrell wrote:
| Plenty of people can also afford to subscribe to these
| without any issue. They don't even know the price, they
| probably won't even cancel it when they stop using it as
| they might not even realize they have a subscription.
|
| By your logic, are the paid plans not sometimes free?
| throwaway83711 wrote:
| While it is true that sometimes you are the product even
| if you're paying, I don't think anyone is trying to argue
| that obviously paid plans are free.
| astrobe_ wrote:
| Sophistry. "many" according to which statistic? And just
| because some people consider that a trade is very favorable
| for them, doesn't it is not a trade and it doesn't mean
| they are correct - who's so naive they can beat business
| people at their own game?
| astrobe_ wrote:
| they +think they+ can beat business people
| 1dom wrote:
| > So yes, it is free.
|
| This sounds pedantic, but I think it's important to spell
| this out: this sort of stuff is only free if you consider
| what you're producing/exchanging for it to have 0 value.
|
| If you consider what you're producing as valuable, you're
| giving it away to companies with an incentive to extract as
| much value from your thing as possible, with little regard
| towards your preferences.
|
| If an idiot is convinced to trade his house for some magic
| beans, would you still be saying "the beans were free"?
| radio879 wrote:
| I should add a section to the site/guide about privacy,
| just letting people know they have somewhat of a choice
| with that.
|
| As for sharing code, most of the parts of a
| project/app/whatever have already been done and if an
| experienced developer hears what your idea is, they could
| just make it and figure it out without any code. The code
| itself doesn't really seem that valuable (well..
| sometimes). Someone can just look at a screenshot of my
| aicodeprep app and just make one and make it look the
| same too.
|
| Not all the time of course - If I had some really unique
| sophisticated algorithms that I knew almost no one else
| would or has figured out, I would be more careful.
|
| Speaking of privacy.. a while back a thought popped into my
| head about Slack and all these unencrypted chats businesses
| use. It kinda does seem crazy to do all your business
| operations over unencrypted chat and Slack rooms.. I
| personally would not trust Zuckerberg not to look in there and
| run lots of LLMs through all the conversations to find
| anything 'good'! Microsoft.. I kinda doubt they would do that
| on purpose, but what's to stop a rogue employee from finding
| out some trade secrets etc.. I'd be surprised if it hasn't
| been done. Security is not usually a priority in tech. They
| half-ass care about your personal info.
| bayarearefugee wrote:
| I understand the point people are trying to make with this
| argument, but we are so far into a nearly universal scam
| economy - where corporations see small fines (relative to
| their cost of doing business) as just a normal expense - that
| I also think anyone who really believes the AI companies
| aren't using their data to train models, even if it is against
| their terms, is wildly naive.
| motoxpro wrote:
| I don't think that's true. It's not that it has zero value,
| it's that it has zero monetizable value.
|
| Hackernews is free. The posts are valuable to me and I
| guess my posts are valuable to me, but I wouldn't pay for
| it and I definitely don't expect to get paid.
|
| For YC, you are producing content that is "valuable" that
| brings people to their site, which they monetize through
| people signing up for their program. They do this with no
| regard for what your preferences are when they choose
| companies to invest in.
|
| They sell ads (Launch, Hire, etc.) against the attention
| that you create. You ARE the product on HackerNews, and
| you're OK with it. As am I.
|
| Same as OpenAI, I dont need to monetize them training on
| my data, and I am happy for you to as I would like to use
| the services for free.
| coliveira wrote:
| Tech companies are making untold fortunes from
| unsophisticated people like you.
| throwaway83711 wrote:
| It's a transaction--a trade. You give them your personal
| data, and you get their services in exchange.
|
| So no, it's not free.
| ya3r wrote:
| Have you seen Microsoft's Copilot? It is essentially free
| OpenAI models
| T4iga wrote:
| And to anyone who has ever used it, it appears more like
| opening smoothbrain. For a long time it was the only allowed
| model at work and even for basic cyber security questions it
| was sometimes completely useless.
|
| I would not recommend it to anyone.
| simonw wrote:
| Which of their many Copilot products do you mean?
| cropcirclbureau wrote:
| Note that the website scrolls very slowly, sub-1 fps on
| Firefox Android. I'm also unable to scroll the call-out about
| Grok. Also, there's a strange large green button reading "CSS
| loaded" at the top.
| subscribed wrote:
| I scroll just fine on Vanadium, Duck browser and brave.
| oblio wrote:
| On Android?
| morsch wrote:
| Works fine, Firefox Android 142.0b9
| dcuthbertson wrote:
| FYI: the first AI you link to, " z.ai's GLM 4.5", actually
| links to zai.net, which appears to be a news site, instead of
| "chat.z.ai", which is what I think you intended.
| battxbox wrote:
| Fun fact, zai[.]net seems to be an Italian school magazine.
| As an Italian I've never known about it, but the word pun got
| me laughing.
|
| zai[.]net -> zainet -> zainetto -> the Italian word for
| "little school backpack"
| radio879 wrote:
| Oops. I was using AI trying to fix some of the bugs and update
| it real fast with some newer models, since this post was
| trending here. Hopefully it's scrolling better. Link fixed. I
| know some of the page still looks ridiculous, but at least
| it's readable for now.
| maxiepoo wrote:
| do you really have 20+ tabs of LLMs open at a time?
| radio879 wrote:
| some days.. it varies but a whole browser window is dedicated
| to it and always open
| tummler wrote:
| Anecdotal, but Grok seems to have just introduced pretty
| restrictive rate limits. They're now giving free users access
| to Grok 4 with a low limit and then making it difficult to
| manually switch to Grok 3 and continue. Will only allow a few
| more requests before pushing an upgrade to paid plans. Just
| started happening to me last night.
| stuart73547373 wrote:
| (relevant self promotion) i wrote a cli tool called slupe that
| lets web based llm dictate fs changes to your computer to make
| it easier to do ai coding from web llms
| https://news.ycombinator.com/item?id=44776250
| VagabundoP wrote:
| I tried Cline with chatgpt 4.1 and I was charged - there are
| some free credits when you sign up for Cline that it used.
|
| Not sure how you got it for free?
| busymom0 wrote:
| I built a relevant tool (approved by Apple this week) which may
| help reduce the friction of you having to constantly copy paste
| text between your app and the AI assistant in browser.
|
| It's called SelectToSearch and it reduces my friction by 85% by
| automating all those copy paste etc actions with a single
| keyboard shortcut:
|
| https://apps.apple.com/ca/app/select-to-search-ai-assistant/...
| bravesoul2 wrote:
| Windsurf has a good free model. Good enough for autocomplete
| level work for sure (haven't tried it for more as I use Claude
| Code)
| indigodaddy wrote:
| Assuming you have to at least be logged into a windsurf account
| though?
| bravesoul2 wrote:
| Yeah. I didn't see not logged in as a requirement.
| b2m9 wrote:
| You mean SWE-1? I used it like a dozen times and I gave up
| because the responses were so bad. Not even sure whether it's
| good enough for autocomplete because it's the slowest model
| I've tested in a while.
| bravesoul2 wrote:
| Not my experience for slowness. For smartness I am typically
| using it for simple "not worth looking that up" stuff rather
| than even feature implementation. Got it to write some MySQL
| SQL today, for example.
| andrewmcwatters wrote:
| I jump between Claude Sonnet 4 on GitHub Copilot Pro and now
| GPT-5 on ChatGPT. That seems to get me pretty far. I have gpt-
| oss:20b installed with ollama, but haven't found a need to use it
| yet, and it seems like it just takes too long on an M1 Max
| MacBook Pro 64GB.
|
| Claude Sonnet 4 is pretty exceptional. GPT-4.1 asks me too
| frequently whether it should move forward. Yes! Of course! Just
| it! I'll reject your changes or do something else later. The
| former gets a whole task done.
|
| I wonder if anyone is getting better results, or comparable for
| cheaper or free. GitHub Copilot in Visual Studio Code is so good,
| I think it'd be pretty hard to beat, but I haven't tried other
| integrated editors.
| joshdavham wrote:
| > When you use AI in web chat's (the chat interfaces like AI
| Studio, ChatGPT, Openrouter, instead of thru an IDE or agent
| framework) are almost always better at solving problems, and
| coming up with solutions compared to the agents like Cline, Trae,
| Copilot.. Not always, but usually.
|
| I completely agree with this!
|
| While I understand that it looks a little awkward to copy and
| paste your code out of your IDE and into a web chat interface, I
| generally get better results that way than with GitHub copilot or
| cursor.
| SV_BubbleTime wrote:
| 100% opposite experience.
|
| Whether agentic, not... it's all about context.
|
| Either agentic with access to your whole project, "lives" in
| GitHub, a fine tune, or RAG, or whatever... having access to
| all of the context drastically reduces hallucinations.
|
| There is a big difference between "write x" and "write x for me
| in my style, with all y dependencies, and considering all z
| code that exists around it".
|
| I honestly don't understand the defense of copy-and-paste AI
| coding... this is why agents are so massively popular right
| now.
| b2m9 wrote:
| I'm also surprised by this take. I found copy/paste between
| editor and external chats to be way less helpful.
|
| That being said, I think everyone has probably different
| expectations and workflows. So if that's what works for them,
| who am I to judge?
| chazhaz wrote:
| Agreed that it's all about context -- but my experience is
| that pasting into web chat allows me to manage context much
| more than if I drop the whole project/whole filesystem into
| context. With the latter approach the results tend to be hit-
| and-miss as the model tries to guess what's right. All about
| context!
| hgarg wrote:
| Just use Rovodev CLI. Gives you 20 million tokens for free per 24
| hours and you can switch between sonnet 4 / gpt-5.
| sumedh wrote:
| What is the catch?
| ireadmevs wrote:
| > Beta technology disclaimer > Rovo Dev in the CLI is a beta
| product under active development. We can only support a
| certain number of users without affecting the top-notch
| quality and user experience we are known for providing. Once
| we reach this limit, we will create a waiting list and
| continue to onboard users as we increase capacity. This
| product is available for free while in beta.
|
| From https://community.atlassian.com/forums/Rovo-for-
| Software-Tea...
| indigodaddy wrote:
| Isn't this only available to a current Jira cloud/service
| subscription?
| xvv wrote:
| As of today, what is the best local model that can be run on a
| system with 32gb of ram and 24gb of vram?
| v5v3 wrote:
| Start with Qwen of a size that fits in the vram.
| ethan_smith wrote:
| DeepSeek Coder 33B or Llama 3 70B with GGUF quantization
| (Q4_K_M) would be optimal for your specs, with Mistral Large 2
| providing the best balance of performance and resource usage.
| fwystup wrote:
| Qwen3-Coder-30B-A3B-Instruct-FP8 is a good choice
| ('qwen3-coder:30b' when you use ollama). I have also had good
| experiences with https://mistral.ai/news/devstral (built under
| a collaboration between Mistral AI and All Hands AI)
| yichuan wrote:
| I think there's huge potential for a fully local "Cursor-like"
| stack -- no cloud, no API keys, just everything running on your
| machine.
|
| The setup could be: * Cursor CLI for agentic/dev stuff
| (example:https://x.com/cursor_ai/status/1953559384531050724) * A
| local memory layer compatible with the CLI -- something like
| LEANN (97% smaller index, zero cloud cost, full privacy,
| https://github.com/yichuan-w/LEANN) or Milvus (though Milvus
| often ends up cloud/token-based) * Your inference engine, e.g.
| Ollama, which is great for running OSS GPT models locally
|
| With this, you'd have an offline, private, and blazing-fast
| personal dev+AI environment. LEANN in particular is built exactly
| for this kind of setup -- tiny footprint, semantic search over
| your entire local world, and Claude Code/Cursor-compatible out
| of the box, with Ollama for generation. I guess this solution
| is not only free but also does not need any API key.
|
| But I do agree that this needs some effort to set up; maybe
| someone can make it easy and fully open-source
| airtonix wrote:
| it might be free, private, blazing fast (if you choose a model
| with appropriate parameters to match your GPU).
|
| but you'll quickly notice that it's not even close to matching
| the quality of output, thought and reflection that you'd get
| from running the same model at a significantly higher parameter
| count on a GPU setup providing over 128GB of actual VRAM.
|
| There isn't anything available locally that will let me load a
| 128gb model and provide anything above 150tps
|
| The only thing a local AI model makes sense for right now
| seems to be Home Assistant, to replace your Google
| Home/Alexa.
|
| happy to be proven wrong, but the effort to reward just isn't
| there for local ai.
| PeterStuer wrote:
| Because most of the people squeezing that highly quantized
| small model into their consumer gpu don't get how they have
| left no room for the activation weights, and are stuck with a
| measly small context.
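| A back-of-envelope sketch of that point: at fp16 the KV cache
| grows linearly with context length, so a card filled to the
| brim with weights has little left for context. The model
| dimensions below are illustrative assumptions, not any
| specific model:

```python
# Rough KV-cache size for a transformer at a given context length.
# The leading "2 *" counts keys and values; fp16 = 2 bytes/element.
def kv_cache_bytes(layers, kv_heads, head_dim, context_len, bytes_per_elem=2):
    return 2 * layers * kv_heads * head_dim * context_len * bytes_per_elem

# Hypothetical 30B-class model: 48 layers, 8 KV heads, head dim 128.
cache = kv_cache_bytes(layers=48, kv_heads=8, head_dim=128, context_len=32768)
print(f"KV cache at 32k context: {cache / 1024**3:.1f} GiB")  # ~6.0 GiB
```

| On a 24GB card, that 6 GiB has to fit alongside the quantized
| weights, which is why the usable context shrinks so fast.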
| andylizf wrote:
| Yeah, this seems a really fantastic summary of our ideal local
| AI stack. A powerful, private memory layer has always felt like
| the missing piece for tools like Cursor or aider.
|
| The idea of this tiny, private index like what the LEANN
| project describes, combined with local inference via Ollama, is
| really powerful. I really like this idea about using it in
| programming, and a truly private "Cursor-like" experience would
| be a game-changer.
| oblio wrote:
| You should probably disclose everywhere you comment that
| you're advertising for Leann.
| qustrolabe wrote:
| I bet it's crazy to some people that others are okay with giving
| so much of their data for free tiers. Like yeah it's better to
| selfhost but it takes so much resources to run good enough LLM at
| home that I'd rather give up my code for some free usage, anyway
| that code eventually will end up open source
| jama211 wrote:
| And as far as I'm concerned if my work is happy for me to use
| models to assist with code, then it's not my problem
| tonyhart7 wrote:
| I replicate SDD from kiro code, it works wonder for multi
| switching model because I can just re fetch from specs folder
| gexla wrote:
| Wow, there's a lot here that I didn't know about. Just never
| drilled that far into the options presented. For a change, I'm
| happy that I read the article rather than only the comments on
| HN. ;)
|
| And lots of helpful comments here on HN as well. Good job
| everyone involved. ;)
| sublinear wrote:
| This all sounds a lot more complicated and time consuming than
| just writing the damn code yourself.
| hoerzu wrote:
| To stop tab switching I built an extension to query all free
| models all at once: https://llmcouncil.github.io/llmcouncil/
| nolist_policy wrote:
| But isn't it in the extension store?
| unixfox wrote:
| Is it possible to have the source code? I see that there is a
| github icon at the bottom of the page but it doesn't work.
| bambax wrote:
| As the post says, the problem with coding agents is they send a
| lot of their own data + almost your entire code base for each
| request: that's what makes them expensive. But when used in a
| chat the costs are so low as to be insignificant.
|
| I only use OpenRouter which gives access to almost all models.
|
| Sonnet was my favorite until I tried Gemini 2.5 Pro, which is
| almost always better. It can be quite slow though. So for basic
| questions / syntax reminders I just use Gemini Flash: super fast,
| and good for simple tasks.
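| The cost asymmetry is easy to put numbers on. A rough sketch,
| with assumed token counts and an assumed $3/M input-token
| price (real prices and context sizes vary by model):

```python
# Back-of-envelope cost per request: an agent that resends repo
# context vs. a chat with one pasted file. All numbers are
# illustrative assumptions, not any provider's actual pricing.
PRICE_PER_M_INPUT = 3.00  # USD per 1M input tokens (assumed)

def request_cost(input_tokens, price_per_m=PRICE_PER_M_INPUT):
    return input_tokens / 1_000_000 * price_per_m

agent_cost = request_cost(150_000)  # agent: ~150k tokens of repo context
chat_cost = request_cost(3_000)     # chat: one pasted file + a question

print(f"agent: ${agent_cost:.4f}/request, chat: ${chat_cost:.4f}/request")
```

| A 50x difference per request, repeated on every agent turn, is
| where the "expensive" reputation comes from.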
| worik wrote:
| A lot of work to evaluate these models. Thank you
| radio879 wrote:
| I don't like or love many things in life, but something about
| AI triggered that natural passion I had when I was first
| learning to code as a kid. It's just super fun. Coding without
| AI stopped being fun a looong time ago. Unlucky brain or genetics
| maybe. AI sped up the dopamine feedback iteration loop to where
| my brain can really feel it again. I can get an idea in my head
| and just an hour later, have it 80% done and functioning. That
| gives me motivation, I won't get bored of the idea before I
| write the code.. which is what would happen a lot. Halfway
| done, get bored, then don't wanna continue.. AI fixed that
| chvid wrote:
| Slightly off topic: What are good open weight models for coding
| that run well on a macbook?
| nottorp wrote:
| Was the page done with AI? The scrolling is kinda laggy.
| Firefox/m3 pro.
| radio879 wrote:
| yeah i tried fixing it - the websites were more of an
| afterthought or annoying thing i had to do and definitely did
| it way too fast
| Weetile wrote:
| I'd love to see a thread that also takes advantage of student
| offers - for example, GitHub Copilot is free for university and
| college students
| precompute wrote:
| I only use LLMs as a substitute for stackexchange, and sometimes
| to write boilerplate code. The free chat provided by deepseek
| works very well for me, and I've never encountered any usage
| limits. V3 / R1 are mostly sufficient. When I need something
| better (not very often), I use Claude's free tier.
|
| If you really need another model / a custom interface, it's
| better to use openrouter: deposit $10 and you get 1000 free
| queries/day across all free models. That $10 will be good for a
| few months, at the very least.
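| For the custom-interface route, OpenRouter's chat endpoint is
| OpenAI-compatible; a minimal sketch using only the standard
| library (the model ID and key below are placeholders, check
| their docs for current free-model names):

```python
import json
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(model, prompt, api_key):
    # OpenAI-compatible chat payload; ":free" model variants are
    # the ones that don't draw down the deposited credit.
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )

# req = build_request("deepseek/deepseek-r1:free", "Explain this error", key)
# resp = json.load(urllib.request.urlopen(req))
```

| The same request shape works against any OpenAI-compatible
| endpoint, so switching providers is mostly a URL change.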
| NKosmatos wrote:
| Now all we need is a wrapper/UI/manager/aggregator for all these
| "free" AI tools/pages so that we can use them without going into
| the hassle of changing tabs ;-)
| burgerone wrote:
| Why are people still drawn to using pointless AI assistants for
| everything? What time do we save by making the code quality worse
| overall?
| hoppp wrote:
| The ChatGPT free tier doesn't seem to expire, unlike Claude or
| Mistral; they just downgrade you to a different model
| jstummbillig wrote:
| Let's just be honest about what it is we actually do: The more
| people maximize what they can get for free, the more other people
| will have to shoulder the higher costs or limitations that
| follow. That's completely fine, not trying to pass judgement -
| but that's certainly not "free" unless you mean exactly "free for
| me, somebody else pays".
| brokegrammer wrote:
| These tricks are a little too much for me. I'd rather just write
| the code myself instead of opening 20 tabs with different LLM
| chats each.
|
| However, I'd like to mention a tool called repomix
| (https://repomix.com/), which will pack your code into a single
| file that can be fed to an LLM's web chat. I typically feed it to
| Qwen3 Coder or AI Studio with good results.
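| The core repomix idea (concatenate a repo into one pasteable
| file with path headers) is simple enough to sketch; this is a
| toy stand-in, not repomix itself:

```python
import os

def pack_repo(root, extensions=(".py", ".js", ".ts"), max_bytes=200_000):
    """Concatenate source files under `root` into one string with
    path headers, so it can be pasted into a web chat as a single
    context blob. Stops once `max_bytes` of text is collected."""
    parts = []
    total = 0
    for dirpath, _, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                text = f.read()
            total += len(text)
            if total > max_bytes:  # stay under the model's context budget
                return "".join(parts)
            parts.append(f"===== {os.path.relpath(path, root)} =====\n{text}\n")
    return "".join(parts)
```

| The path headers matter: they let the model answer "which file
| does X live in" questions instead of seeing one anonymous blob.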
| Oras wrote:
| OP must be a master of context switching! I can't imagine opening
| that number of tabs and still staying focused
| funkydata wrote:
| Also, well, I mean... If there's all that time/effort
| involved... Just get yourself some tea, coffee, doodle on some
| piece of paper, do some push-ups, some yoga, pray, meditate,
| breathe and then... Code, lol!
| 3036e4 wrote:
| Maybe optimistic, but reading posts like this makes me hopeful
| that AI-assisted coding will drive people to design more modular
| and sanely organized code, to reduce the amount of context
| required for each task. Sadly pretty much all the code I have
| worked with has been a giant mess of everything connected to
| everything else, causing the entire project to be potential
| context for anything.
| mathiaspoint wrote:
| LLMs will write code this way if you ask but you have to know
| to ask.
| casparvitch wrote:
| At that(/what) point does it become harder for a human to
| grok a project?
| mathiaspoint wrote:
| That's always how it works no matter how good the model is.
| I'm surprised people keep forgetting this. If _no one_ has
| the theory then the artifacts are almost unmaintainable.
|
| You can end up doing this with entirely human written code
| too. Good software devs can see it from a mile away.
| saratogacx wrote:
| It depends if you're willing to drop the $30 for the super
| version :)
| epolanski wrote:
| It does, you're essentially forced to write good coding
| guidelines and documentation.
| bongodongobob wrote:
| It's really very good at that. Frequently, I'll have something
| I've been working on over the years that has turned into an
| interconnected mess. "Split this code into modules of separated
| concerns". Bam, done. I used Claude for the first time last
| week and gave it a 2k line PowerShell script and it neatly
| pulled it apart into 5 working modules on the first try. Worked
| exactly the same, and ended up with better comments too.
| mattmanser wrote:
| So I've done that sort of refactoring a lot, albeit on real
| code in much bigger systems, not a script. Lots of coders
| won't do this, they'll just keep adding to the crap, crazy
| big module.
|
| I always end up with a vastly smaller code base. Like 2000
| lines turns into 800 lines or something like that.
|
| Did that happen too or did the AI just do a glorified
| 'extract method', that any decent IDE can already do without
| AI?
|
| I use AI, I'm not anti it, but on the other hand I keep
| seeing these gushing posts where I'm like 'but your ide could
| already do that, just click the quick refactoring button'.
| 5kyn3t wrote:
| Why is Mistral not mentioned? Is there any reason? I have the
| impression that they are often ignored by media, bloggers, devs
| when it comes to comparing or showcasing LLM thingies. Comes with
| free tier and quality is quite good. (But I am not an AI power
| user) https://chat.mistral.ai/chat
| epolanski wrote:
| Off topic but I use Mistral in production for various one shot
| tasks (mostly summarizing), it's incredibly cheap, fast and
| effective.
|
| Bonus: it's European, kinda tired of giving always money to the
| American overlords.
| sunaookami wrote:
| Because Mistral is very bad; Qwen, Kimi and GLM are just better.
| jug wrote:
| It's not free FREE but if you deposit at least $10 on OpenRouter,
| you can use their free models without credit withdrawals. And
| those models are quite powerful, like DeepSeek R1. Sometimes,
| they are rate limited by the provider due to their popularity but
| it works in a pinch.
| PufPufPuf wrote:
| Actually nowadays they allow unlimited usage of free models
| without depositing anything.
| codeclimber wrote:
| Nice write-up, especially the point about mixing different models
| for different stages of coding. I've been tracking which IDE/CLI
| tools give free or semi-free access to pro-grade LLMs (e.g.,
| GPT-5, Claude code, Gemini 2.5 Pro) and how generous their quotas
| are. Ended up putting them side-by-side so it's easier to compare
| hours, limits, and gotchas: https://github.com/inmve/free-ai-
| coding
| Imustaskforhelp wrote:
| AI Studio at https://aistudio.google.com/ is unlimited.
|
| I also use kiro which I got access for completely free because I
| was early on seeing kiro and actually trying it out because of
| hackernews!
|
| Sometimes I use cerebras web ui to get insanely fast token
| generation of things like gpt-oss or qwen 480 b or qwen in
| general too.
|
| I want to thank hackernews for kiro! I mean, I am really grateful
| to this platform y'know. Not just for free stuff but in general
| too. Thanks :>
| scosman wrote:
| The qwen coder CLI gives you 1000 free requests per day to the
| qwen coder model (405b). Probably the best free option right now.
| faangguyindia wrote:
| Qwen CLI uses a whole-file edit format, which is slow and burns
| credits fast; same issue with Gemini CLI.
| indigodaddy wrote:
| Do opencode/crush also have this problem?
| gkoos wrote:
| Looks like somebody is a tad bit over reliant on these tools but
| other than that there is a lot of value in this article
| imasl42 wrote:
| You might find this repo helpful, it compares popular coding
| tools by hours with top-tier LLMs like Claude Sonnet:
| https://github.com/inmve/free-ai-coding
| matrixhelix wrote:
| https://claude.ai https://chat.z.ai https://chatgpt.com
| https://chat.qwen.ai https://chat.mistral.ai
| https://chat.deepseek.com https://gemini.google.com
| https://dashboard.cohere.com https://copilot.microsoft.com
| iLoveOncall wrote:
| This is nightmarish, whether or not you like LLMs.
|
| Just use Amazon Q Dev for free which will cover every single area
| that you need in every context that you need (IDE, CLI, etc.).
| DrSiemer wrote:
| Ha, I'm working on a similar tool:
| https://github.com/DrSiemer/codemerger
|
| Glad to see I'm not the only one who prefers to work like that. I
| don't need many different models though, the free version of
| Gemini 2.5 Pro is usually enough for me. Especially the 1.000.000
| token context length is really useful. I can just keep dumping
| full code merges in.
|
| I'll have a look at the alternatives mentioned though. Some
| questions just seem to throw certain models into logic loops.
___________________________________________________________________
(page generated 2025-08-10 23:00 UTC)