[HN Gopher] Show HN: I made an app to use local AI as daily driver
___________________________________________________________________
Show HN: I made an app to use local AI as daily driver
Hi Hackers, Excited to share a macOS app I've been working on:
https://recurse.chat/ for chatting with local AI. While it's
amazing that you can run AI models locally quite easily these days
(through llama.cpp / llamafile / ollama / llm CLI etc.), I missed
feature-complete chat interfaces. Tools like LM Studio are super
powerful, but there's a learning curve. I'd like to hit a middle
ground of simplicity and customizability for advanced users.
Here's what sets RecurseChat apart from similar apps: - UX
designed for you to use local AI as a daily driver. Zero config
setup, supports multi-modal chat, chat with multiple models in the
same session, link your own gguf file. - Import ChatGPT history.
This is probably my favorite feature. Import your hundreds of
messages, search them, and even continue previous chats using
local AI offline. - Full text search. Search for hundreds of
messages and see results instantly. - Private and capable of
working completely offline. Thanks to the amazing work of
@ggerganov on llama.cpp which made this possible. If there is
anything that you wish to exist in an ideal local AI app, I'd love
to hear about it.
Author : xyc
Score : 512 points
Date : 2024-02-28 00:40 UTC (22 hours ago)
(HTM) web link (recurse.chat)
(TXT) w3m dump (recurse.chat)
| CGamesPlay wrote:
| Possibly a strange question, but do you have plans to add online
| models to the app? Local models just aren't at the same level,
| but I would certainly appreciate a consistent chat interface that
| lets me switch between GPT/Claude/local models.
| xyc wrote:
| Not strange at all! It's a very valid ask. The focus is local
| AI, but GPT-3.5/GPT-4 are actually included in the app (bring
| your own key), although customization is limited. Planning to
| expose some more customizability there, including API base URLs
| / model names.
| castles wrote:
| https://recurse.chat/faq/#:~:text=We%20support%20Mistral%2C%...
| christiangenco wrote:
| ...how did you highlight a specific sentence like that?
| sandyarmstrong wrote:
| Looks like a Chromium-specific feature:
| https://web.dev/articles/text-fragments
|
| Pretty cool. Doesn't work on Firefox.
| QuinnyPig wrote:
| It just worked on Safari on iOS. That's pretty
| impressive.
| svat wrote:
| https://caniuse.com/url-scroll-to-text-fragment -- yes
| Safari supports it
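For reference, the highlight in that link uses the Text Fragments URL syntax; in its simplest form (example URL is illustrative):

```
https://example.com/page#:~:text=phrase%20to%20highlight
```

The `text=` value is percent-encoded, and the fuller `text=prefix-,start,end,-suffix` form can pin down an ambiguous match.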
| iansinnott wrote:
| You could try out Prompta [1], which I made for this use case.
| Initially created to use OpenAI as a desktop app, but can use
| any compatible API including Ollama if you want local
| completions.
|
| [1]: https://github.com/iansinnott/prompta
| CGamesPlay wrote:
| This one doesn't seem to support system prompts, which are
| absolutely essential for getting useful output from LLMs.
| derwiki wrote:
| Can you speak more to this? I get useful output from LLMs
| all the time, but never use system prompts. What am I
| missing?
| CGamesPlay wrote:
| Sure, I use one system prompt template to make ChatGPT be
| more concise. Compare these two:
| https://sharegpt.com/c/fEZKMIy vs
| https://sharegpt.com/c/S2lyYON
|
| I use similar ones to get ChatGPT to be more thorough or
| diligent as well. From my limited experience with local
| models, this type of system prompting is even more
| important than with ChatGPT 4.
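For anyone curious, in the OpenAI-style chat format a system prompt like the "concise" one above is just a leading message with the `system` role; a minimal sketch (the prompt wording is illustrative, not the actual template):

```python
def build_messages(system_prompt: str, user_prompt: str) -> list:
    """Build an OpenAI-style chat payload with an optional system prompt."""
    messages = []
    if system_prompt:
        # The system role shapes tone and style for every turn that follows.
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages

# Steer the model toward concise answers.
concise = build_messages(
    "You are a terse assistant. Answer in one sentence, no preamble.",
    "How do I read a file in Python?",
)
```

The same messages list works against any OpenAI-compatible endpoint, llama.cpp's built-in server included.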
| addandsubtract wrote:
| Is there a difference in using a system prompt and just
| pasting the "system prompt" part at the beginning of your
| message?
| CGamesPlay wrote:
| Haven't tested, but having it built-in is more
| convenient, and convenience is why I'm using these tools
| in the first place (as a replacement for StackOverflow,
| for example).
| iansinnott wrote:
| You can update the system prompt in the settings.
| Admittedly this is not mentioned in the README, but is
| customizable.
| refulgentis wrote:
| > the system prompt
|
| There isn't a singular system prompt. It really does
| matter!
|
| Copy the OpenAI playground, you'll thank yourself later
| 8n4vidtmkvmk wrote:
| You use multiple system prompts in a single chat? What
| for?
| iansinnott wrote:
| Fair point, and it's not implemented that way currently.
| It's more like "custom instructions" but thanks for
| pointing that out. I haven't used multiple system prompts
| in the OpenAI playground either, so I hadn't given it
| much thought.
| a_bonobo wrote:
| I've run into the same problem with deploying Gemini
| locally, it does not seem to support System Prompts. I've
| cheated around this by auto-prepending the system prompt to
| the user prompt, and then deleting it from the user-
| displayed prompt again.
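The prepend-and-strip workaround described above can be sketched roughly like this (the prompt text and separator are illustrative):

```python
SYSTEM_PROMPT = "Answer concisely."  # illustrative stand-in
SEP = "\n\n"

def to_model(user_prompt: str) -> str:
    # Prepend the system text so models without a system role still see it.
    return SYSTEM_PROMPT + SEP + user_prompt

def to_display(model_input: str) -> str:
    # Strip the prepended text again before echoing the user's message.
    prefix = SYSTEM_PROMPT + SEP
    if model_input.startswith(prefix):
        return model_input[len(prefix):]
    return model_input
```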
| longnguyen wrote:
| Shameless plug: if you need multiple AI Service Provider, give
| BoltAI[0] a try. It's native (not Electron), and supports
| multiple services: OpenAI, Azure OpenAI, OpenRouter, Mistral,
| Ollama...
|
| It also allows you to interact with LLMs via multiple different
| interfaces: a Chat UI, a context-aware feature called AI
| Command, and an Inline mode.
|
| [0]: https://boltai.com
| raajg wrote:
| looks promising, but after looking at the website I'm yearning to
| learn more about it! How does it compare to alternatives? What's
| the performance like? There isn't enough to push me to stop using
| ChatGPT and use this instead. Offline is good, but to get users
| at scale there has to be a compelling reason to shift. I don't
| think that offline capabilities are going to be enough to get
| a significant number of users.
|
| Another tip, I try out a new chat interface to LLMs almost every
| week and they're free to use initially. There isn't a compelling
| reason for me to spend $10 from the get-go for a use case that
| I'm not sure about yet.
| FloorEgg wrote:
| Maybe this isn't for everyone, just the people who place a high
| value on privacy.
| ukuina wrote:
| But how can I guarantee this app is private?
|
| I'm assuming I cannot block internet access to the app
| because it needs to verify App Store entitlement.
| giblfiz wrote:
| I mean, ok, then how do you distinguish yourself from LM
| Studio (Free)?
| vunderba wrote:
| If your ultimate goal is privacy, then you should only be
| looking at _open source_ chat UI front ends:
|
| https://github.com/mckaywrigley/chatbot-ui
|
| https://github.com/oobabooga/text-generation-webui
|
| https://github.com/mudler/LocalAI
|
| And then connecting them to offline model servers:
|
| - Ollama
|
| - llama.cpp
|
| And you should avoid closed source frontends:
|
| - Recurse
|
| - LM Studio
|
| And closed source models
|
| - ChatGPT
|
| - Gemini
| copperx wrote:
| Are you implying Claude is an open source model?
| bradnickel wrote:
| The compelling reason to shift to local/decentralized AI is
| that all of compute will soon be AI and that means your entire
| existence will go into it. The question you should ask yourself
| is do you want everything about you being handled by Sam
| Altman, Google, Microsoft, etc? Do you want all of your compute
| dependent on them always being up and do you want to trust
| their security team with your life? Do you want to still be
| using closed/centralized/hosted AI when truly open AI surpasses
| all of them in performance and capability? If you have children
| or family, do you want them putting their entire lives in the
| hands of those folks?
|
| Decentralized AI will eventually become p2p and swarmed and
| then the true power of agents and collaboration will soar via
| AI.
|
| Anyway, excuse the soap box, but there are zero valid reasons
| for supporting and paying centralized keepers of AI that rarely
| share, collaborate or give back to the community that made what
| they have possible.
| gverrilla wrote:
| > when truly open AI surpasses all of them in performance and
| capability.
|
| Is this true? I tried Llama last year and it was not very
| helpful. GPT4 is already full of problems and I have to keep
| circumventing them, so using something less capable doesn't
| get me too excited.
| tkgally wrote:
| For an app like this, I would really like a spoken interface. Any
| possibility of adding text-to-speech and speech-to-text so that
| users can not only type but also talk with it?
| xyc wrote:
| Yes, I wish it could talk. It's behind other priorities,
| but I might try something experimental.
| girishso wrote:
| I will totally pay for something like this if it answers from my
| local documents, bookmarks, browser history etc.
| xyc wrote:
| Yes, that would be the next big focus. Personal data
| connectivity is where I see local AI excelling, despite
| differences in model power.
| chaostheory wrote:
| Yeah, we're getting closer to "Her"
| _boffin_ wrote:
| Good to know there's a market for that. Currently building
| out something. Integrating from numerous sources, processing
| and then utilizing those.
|
| nice.
| ssnri wrote:
| I would even let it have longer processing times for queries
| to apply against each document in my system, allow it to
| specialize/train itself on a daily basis...
|
| Use all the resources you want if you save me brainpower
| xyc wrote:
| Agree, there's a non real-time angle to this.
| samstave wrote:
| "give me a summary of the news around this topic each
| morning for my daily read"
|
| Help me plan for upcoming meetings: if I put something
| in my calendar, it builds a little dossier for the
| event, including relevant info based on the type of
| event or meeting, plus scheduling reminders or prompts
| about updates or changes to the event, etc.
| ssnri wrote:
| "filter out baby pictures from my family text threads"
| Satam wrote:
| I have doubts about that. Most personal data actually lives
| in the cloud these days. If you need your Gmail emails,
| you'll need to use their API, which is guarded behind a $50k
| certification fee or so. I think there is a simpler version
| for personal use, but you still need to get the API key.
| Who's going to teach their mom about API keys? So I think for
| a lot of these data sources you'll end up with enterprise AIs
| integrating them first for a seamless experience.
| xyc wrote:
| I think this is a good take. While there's a big enough
| niche for personal data locally, I'd love it if there were
| a way to solve for email/cloud data requiring API keys.
| noduerme wrote:
| Ideally, though, a sufficiently smart LLM shouldn't need
| API access. It could navigate to your social media login
| page, supply your credentials, and scrape what it sees.
| Better yet, it should just reverse-engineer the API ;)
| coev wrote:
| Why wouldn't you be able to use IMAP over the gmail api?
| IMAP returns the text and headers of all your emails, which
| is what you'd want the LLM to ingest anyway.
| noduerme wrote:
| Seconding a sibling question: What $50k API fee? To access
| your gmail? I've been using gmail since 2008 or so without
| ever touching their web/app interface or getting an API
| key. You just use it as an IMAP server.
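For what it's worth, IMAP hands back raw RFC 822 bytes, and Python's stdlib `email` module is enough to flatten those into LLM-ready text; a rough sketch (the sample message is made up):

```python
from email import message_from_bytes
from email.policy import default

def email_to_text(raw: bytes) -> str:
    """Flatten a raw RFC 822 message into headers plus plain-text body."""
    msg = message_from_bytes(raw, policy=default)
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""
    return f"From: {msg['From']}\nSubject: {msg['Subject']}\n\n{text}"

# A fabricated message, standing in for an IMAP FETCH result.
raw = (b"From: biff@example.com\r\n"
       b"Subject: Lunch\r\n"
       b"Content-Type: text/plain\r\n"
       b"\r\n"
       b"Meet at noon on Thursday?\r\n")
```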
| Satam wrote:
| To use Google's sensitive APIs in production you have to
| certify your product and that costs tens of thousands. To
| be honest, didn't think about imap at first, but it looks
| like that could be getting tougher soon too
| https://support.google.com/a/answer/14114704?hl=en. Soon
| they will require oAuth for imap and with oAuth you'll
| need the certification: https://developers.google.com/gma
| il/imap/xoauth2-protocol. If it's for personal use, you
| might be able to get by with just some warnings in
| the login flow, but it won't be easy to get the oAuth
| flow set up in the first place.
| noduerme wrote:
| Yeah, Thunderbird integrated oAuth in the last few
| releases, mainly to keep up with the Gmail and Hotmail
| requirements. Made it very user-friendly to set up in the
| GUI right within T-bird. I don't see this being a major
| obstacle.
|
| I'm not sure I can imagine a scenario in production where
| Google would, or should, allow API access to individual
| gmail accounts. What's that for? So you can read all your
| employees' mail without running your own email server?
| samstave wrote:
| What?
|
| I manage both gmail and protonmail via thunderbird - where
| I have better search and sort using IMAP.
| chb wrote:
| This. There was a post on HN last week, iirc, referring to just
| such a solution called ZenFetch (?). I would have adopted it in
| a heartbeat but they don't currently have a means of exporting
| the source data you feed to it (should you elect it as your
| sole means of bookmarking, etc)
| gabev wrote:
| Hey there,
|
| This is Gabe, the founder of Zenfetch. Thanks for sharing.
| We're putting together an export option where you can
| download all your saved data as a CSV and should get that out
| by end of week.
| samstave wrote:
| Seems like this would be a good tool to build lessons on -
| if you could share a "class" and export a link for others
| to then copy the class, and expand on the
| lesson/class/topic in their own AI, but as a separate
| "class", not fully integrated into my regular history
| blob?
|
| I want the ability to search all my downloaded files and
| organize them based on context within. Have it create a
| category table, and allow me to "put all pics of my cat in
| this folder, and upload them to a gallery on imgur."
| gabev wrote:
| We're working on the ability to share folders of your
| knowledge so that others can search/chat across them.
|
| We've been thinking of this as a "subscription" to the
| creator's folder. Similar to how you might subscribe to a
| Spotify playlist
| samstave wrote:
| Or an RSS feed?
| scottrblock wrote:
| plus one, I would love to configure a folder of markdown/txt(+
| eventually images and pdfs) files that this can have access to.
| Ideally it could RAG over them in a sensible way. Would love to
| help support this!
| xyc wrote:
| Thank you! I'd love to learn more about your use cases. Would
| you mind sending an email to feedback@recurse.chat or DM me
| on https://x.com/chxy to get the conversation started?
| jlund-molfese wrote:
| Sounds like https://www.rewind.ai/ ?
| toomuchtodo wrote:
| https://news.ycombinator.com/item?id=38787892 ("Show HN: Rem:
| Remember Everything (open source)") ?
|
| https://github.com/jasonjmcghee/rem
| vunderba wrote:
| There are already several RAG chat open source solutions
| available. Two that immediately come to mind are:
|
| Danswer
|
| https://github.com/danswer-ai/danswer
|
| Khoj
|
| https://github.com/khoj-ai/khoj
| wkat4242 wrote:
| Stupid question but what does RAG stand for?
| onehp wrote:
| Retrieval augmented generation. In short you use an LLM to
| classify your documents (or chunks from them) up front.
| Then when you want to ask the LLM a question you pull the
| most relevant ones back to feed it as additional context.
| danielovichdk wrote:
| I don't get it. To my understanding it takes huge amounts
| of data to build any form of RAG, simply because it
| enlarges the statistical model you later prompt. If the
| model is not big enough, how would you expect it to
| answer you in a qualifying manner? It simply can't.
|
| So I don't really buy it and I have yet to see it work
| better than any rdbms search index.
|
| Tell me I am wrong, I would like to see a local model
| based on my own docs being able to answer me quality
| answers based on quality prompts.
| tveita wrote:
| RAG doesn't require much data or involve any training, it
| is a fancy name for "automatically paste some relevant
| context into the prompt"
|
| Basically if you have a database of three emails and ask
| when Biff wanted to meet for lunch, a RAG system would
| select the most relevant email based on any kind of
| search (embeddings are most fashionable) and create a
| prompt like
|
| """Given this document: <your email>, answer the question
| "When does Biff want to meet for lunch?"""
| loudmax wrote:
| That's not how RAG works. What you're describing is
| something closer to prompt optimization.
|
| Sibling comment from discordance has a more accurate
| description of RAG. There's a longer description from
| Nvidia here: https://blogs.nvidia.com/blog/what-is-
| retrieval-augmented-ge...
| tveita wrote:
| Right, you read something nebulous about how "the LLM
| combines the retrieved words and its own response to the
| query into a final answer it presents to the user", and
| you think there is some magic going on, and then you
| click one link deeper and read at
| https://ai.meta.com/blog/retrieval-augmented-generation-
| stre... :
|
| > Given the prompt "When did the first mammal appear on
| Earth?" for instance, RAG might surface documents for
| "Mammal," "History of Earth," and "Evolution of Mammals."
| These supporting documents are then concatenated as
| context with the original input and fed to the [...]
| model
|
| Finding the relevant context to put in the prompt is a
| search problem, nearest neighbour search on embeddings is
| one basic way to do it but the singular focus on "vector
| databases" is a bit of hype phenomenon IMO - a real world
| product should factor a lot more than just pure textual
| content into the relevancy score. Or is your personal AI
| assistant going to treat emails from yesterday as equally
| relevant as emails from a year ago?
| machiaweliczny wrote:
| Legit explanation, that's how it works AFAIK.
| discordance wrote:
| RAG:
|
| 1. First you create embeddings from your documents
|
| 2. Store that in a vector db
|
| 3. Ask what the user wants and do a search in the vector
| db (cosine similarity etc)
|
| 4. Feed the relevant search results to your LLM and do
| the usual LLM stuff with the returned embeddings and
| chunks of the documents
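The four steps above fit in a few lines; here a toy bag-of-words count stands in for the embedding model and a plain list for the vector DB (both are stand-ins for illustration, not what any of the apps in this thread use):

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": word counts. A real system uses an embedding model.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

docs = [
    "Biff: let's meet for lunch at noon on Thursday",
    "Invoice #4521 attached, due at the end of the month",
    "Reminder: dentist appointment next Tuesday",
]
index = [(doc, embed(doc)) for doc in docs]   # steps 1-2: embed and store

def retrieve(query: str, k: int = 1) -> list:
    q = embed(query)                          # step 3: similarity search
    ranked = sorted(index, key=lambda d: cosine(q, d[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# Step 4: splice the best match into the prompt fed to the LLM.
context = retrieve("when does Biff want lunch?")[0]
prompt = f'Given this document: "{context}", answer: when does Biff want lunch?'
```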
| bigfudge wrote:
| Although RAG is often implemented via vector databases to
| find 'relevant' content, I'm not sure that's a necessary
| component. I've been doing what I call RAG by finding
| 'relevant' content for the current prompt context via a
| number of different algorithms that don't use vectors.
|
| Would you define RAG only as 'prompt optimisation that
| involves embeddings'?
| eevmanu wrote:
| Sure thing, your RAG approach sounds intriguing,
| especially since you're sidestepping vector databases.
| But doesn't the input context length cap affect it?
| (chatgpt plus at _32K_ [0] or gpt4 via open ai at _128K_
| [1]) Seems like those cases would be pretty rare though.
|
| [0]:
| https://openai.com/chatgpt/pricing#:~:text=8K-,32K,-32K
|
| [1]: https://platform.openai.com/docs/models/gpt-4-and-
| gpt-4-turb...
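One common way to live with that cap is to pack the highest-ranked chunks greedily until a token budget runs out; a rough sketch, using character count divided by 4 as a crude token estimate (a rule of thumb, not a spec):

```python
def pack_context(chunks, max_tokens=32_000, chars_per_token=4):
    """Keep the top-ranked chunks that fit in the model's context budget."""
    budget = max_tokens * chars_per_token
    packed, used = [], 0
    for chunk in chunks:          # assumed pre-sorted by relevance
        if used + len(chunk) > budget:
            break                 # everything below this rank is dropped
        packed.append(chunk)
        used += len(chunk)
    return packed
```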
| spiderfarmer wrote:
| Next version of MacOS will probably have that.
| tethys wrote:
| As long as you use Safari for browsing, Notes for note
| taking, iCloud for mail ...
| rbtprograms wrote:
| Looks great! Does it support different sized models, i.e. can I
| run llama 70B and 7B, and is there a way to specify which model
| to chat with? Are there plans to allow users to ingest their own
| models through this UI?
| xyc wrote:
| If you have a gguf file you can link it. For ingesting new
| models - I'm thinking about adding some CRUD UIs to it, but I'd
| like to keep a very small set of default models.
| rbtprograms wrote:
| thanks, its a great project
| 3abiton wrote:
| How different is this compared to Jan.ai for example?
| xyc wrote:
| As I understand it, jan.ai is more focused on enterprise /
| platform, while I see RecurseChat going more in the
| direction of "obsidian.md", but as your personal AI.
| gexla wrote:
| Obsidian has add-ons which do much of this.
| internetter wrote:
| People are treating Obsidian like it's the next Emacs
| rexreed wrote:
| What are the MacOS and hardware requirements? How does it perform
| on a slightly older model, lower powered Mac? I wish I could test
| this to see how it would perform, and while it's only $10, I
| don't want to spend that just to realize it won't work on my
| older, underpowered Mac mini.
| xyc wrote:
| Good question, I'll put some system requirements on the
| website. It only supports Macs with Apple Silicon now, if that's
| helpful.
| pantulis wrote:
| Instant buy, great work and the price point is exactly right.
| Good luck!
| xyc wrote:
| Appreciate your support. Thank you so much!
| pentagrama wrote:
| Congrats! Plans on Windows support?
| xyc wrote:
| Thanks! Sorry no immediate plan. People have recommended Chat
| with RTX so it might be worth checking out.
| https://www.nvidia.com/en-us/ai-on-rtx/chat-with-rtx-generat...
| appel wrote:
| It looks amazing, OP! I'm sad I'm missing out as a Windows
| user.
| theolivenbaum wrote:
| You can try https://curiosity.ai, supports Windows and macOS
| giblfiz wrote:
| So there are a few questions that leap out at me:
|
| * What are you using for image generation? Is that local as well
| (Stable Diffusion)? Does it have integrated prompt generation?
|
| * You mention the ability to import ChatGPT history; are you able
| to import other documents?
|
| * How many "agent"-style capabilities does it have? Can it search
| the web? Use other APIs? Prompt itself?
|
| * Does it have a plugin framework? You mention that it is
| "customizable", but that can mean almost anything.
|
| * What is the license? What assurances do users have that their
| usage is private? I mean, we all know how many "local" apps
| exfiltrate a ton of data.
| hanniabu wrote:
| > What are you using for image generation?
|
| It doesn't look like it supports image generation
| unfortunately. If it did then I would definitely adopt this as
| my daily driver.
| android521 wrote:
| how big is the local model? what is the Mac spec requirement? I
| don't want to download and find out it won't work on my computer.
| It seems like the first question everyone would ask and should be
| addressed on the website.
| visarga wrote:
| It uses ollama which is based on llama.cpp, and adds a model
| library with dozens of models in all quant sizes.
| xyc wrote:
| No, this doesn't use ollama; it's just based on llama.cpp.
| xyc wrote:
| Appreciate the feedback! It works on Macs with Apple Silicon
| only. I'll put some system requirements on the website.
| pentagrama wrote:
| Sadly I can't try this because I'm on Windows or Linux.
|
| Was testing apps like this if anyone is interested:
|
| Best / Easy to use:
|
| - https://lmstudio.ai
|
| - https://msty.app
|
| - https://jan.ai
|
| More complex / Unpolished UI:
|
| - https://gpt4all.io
|
| - https://pinokio.computer
|
| - https://www.nvidia.com/en-us/ai-on-rtx/chat-with-rtx-generat...
|
| - https://github.com/LostRuins/koboldcpp
|
| Misc:
|
| - https://faraday.dev (AI Characters)
|
| No UI / Command line (not for me):
|
| - https://ollama.com
|
| - https://privategpt.dev
|
| - https://serge.chat
|
| - https://github.com/Mozilla-Ocho/llamafile
|
| Pending to check:
|
| - https://recurse.chat
|
| Feel free to recommend more!
| chown wrote:
| I am the author of Msty app mentioned here. So humbled to see
| an app that is just about a month old that I mostly wrote for
| my wife and some friends to begin with (who got overwhelmed
| with everything that was going on in the LLM world), at the
| top of your list. Thank you!
| Datagenerator wrote:
| Looks interesting, but can't see what it is doing. Any link
| to the source code?
| petemir wrote:
| If you need help for testing the Linux version let me know,
| I'd be happy to help
| chown wrote:
| I was actually looking for one! What's the best way to
| reach you? Mind jumping on our Discord so that I can share
| the installer with you soon?
| crooked-v wrote:
| One bit of feedback: there's nowhere to put system messages.
| These can be much more influential than user prompts when it
| comes to shaping the tone and style of the response.
| chown wrote:
| That's at the top of our list. It got pushed back because
| we want to support creating a character/profile (basically
| select a model and apply some defaults, including a system
| prompt). But I feel like it was a mistake to wait for it.
| Regardless, it is getting added in the next release (the
| one after something that is dropping in a day or 2, which
| is a big release in itself).
| hanniabu wrote:
| 1) What are the mac system requirements? Does it need a
| specific OS version?
|
| 2) If you're privacy first, many would feel a lot more
| comfortable if this was released as an app in the app store
| so it will be sandboxed. This is important because it's not
| open source so we have no idea what is happening in the
| background. Alternatively open source it, which many here
| have requested.
| lolpanda wrote:
| Oh thanks! I didn't know there are quite a few local ChatGPT
| alternatives. I was wondering what users they are targeting.
| Engineers or average users? I guess average users will likely
| choose ChatGPT and Perplexity over local apps for more recent
| knowledge of the world.
| chown wrote:
| Hi. I'm the author of Msty app, 2nd on the list above. You
| are right about average users likely choosing ChatGPT over
| local models. My wife was the first and the biggest user of
| my app. A software engineer by profession and training but
| she likes to not worry about the LLM world and just use it
| as a tool that makes her more productive. As soon as she
| took Msty for a ride, I realized that some users, despite
| their background, care about online models. This actually
| led me to add support for online models right away. However, she
| really likes to make use of the parallel chat feature and
| uses both Mistral and ChatGPT models to give the same prompt
| and then compare the outputs and choose the best answer (or
| sometimes make a hybrid choice). She says that being able to
| compare multiple outputs like that is tremendously helpful.
| But that's the extent of local LLMs for her. So far my effort
| has been to target a bit higher than the average user while
| making it approachable for more advanced users as well.
| Gunnerhead wrote:
| I'm looking for a ChatGPT client alternative, i.e. I can
| use my own OpenAI API key in some other client.
|
| Offline isn't important for me, only that $20 is a lot of
| money, when I'd wager most months my usage is a lot less.
| However, I'd still want access to completion, DALL-E, etc.
|
| Would Msty be a good option for me?
| chown wrote:
| Give it a try and see how you feel. "Yes, it will" would
| be a dishonest answer, to be completely honest, at least
| at this point. The app has been out for just about a month
| and I am still working on it. I would love a user like you
| to give it a try and give me some feedback (please). I am
| very active on our Discord if you want to get in touch
| (just mention your HN username and I will wave).
| Gunnerhead wrote:
| Thank you so much, I'm excited to give this a try in the
| next few days.
| AriedK wrote:
| Looks great, though the fact that you have to ignore your
| anti-virus warning during installation, and the fact that
| it phones home (to insights.msty.app) directly after launch
| despite the line in the FAQ about not collecting any data,
| make me a little skittish.
| stlhood wrote:
| Just FYI, llamafile includes a web-based chat UI. It fires up
| automatically.
| joshmarinacci wrote:
| Do any of these let you dump in a bunch of your own documents
| to use as a corpus and then query and summarize them ?
| windexh8er wrote:
| Yes, GPT4All has RAG-like features. Basically you configure
| some directories and then have it load docs from whatever
| folders you have enabled for the model you're currently
| using. I haven't used it a ton, but I have used it to review
| long documents and it's worked well depending on the model.
| chown wrote:
| Author of Msty here. Not yet but I am already working on the
| design for it to be added in the very near future. I am
| happy to chat more with you to understand your needs and
| what you are looking for in such apps. Please hop on the
| Discord if you don't mind :)
| hanniabu wrote:
| Some of my usecases would be summarizing a PDF report,
| analyzing json/csv data, upload a dev project to write a
| function or feature or build a UI, rename image files,
| categorize images, etc
| 8n4vidtmkvmk wrote:
| The new one straight from Nvidia does I believe.
| Datagenerator wrote:
| Open-WebUI has support for doing that, it works using #tags
| for each document so you can ask questions about multiple
| specific documents.
| greggsy wrote:
| https://github.com/imartinez/privateGPT
| visarga wrote:
| Add Open-WebUI (used to be Ollama-WebUI)
|
| https://github.com/open-webui/open-webui
|
| a well featured UI with very active team
| theolivenbaum wrote:
| We just added local LLM support to our curiosity.ai app too -
| if anyone wants to try we're looking for feedback there!
| hmdai wrote:
| Try this one: https://uneven-macaw-bef2.hiku.app/app/
|
| It loads the LLM in the browser, using webgpu, so it works
| offline after the first load; it's also a PWA you can install. It
| should work on chrome > 113 on desktop and chrome > 121 on
| mobile.
| wanderingmind wrote:
| lmstudio is using a dark pattern I really hate. Don't have a
| GitHub logo on your webpage if your software is not source
| available. It just takes you to some random config repos
| they have on GitHub. This is a poor choice in my opinion.
| Hugsun wrote:
| We call that stolen valor.
| woadwarrior01 wrote:
| Since I couldn't find it in your list, I'd like to plug my own
| macOS (and iOS) app: Private LLM[1]. Unlike almost every other app
| in the space, it isn't based on llama.cpp (we use mlc-llm) or
| naive RTN quantized models (we use OmniQuant). Also, the app
| has deep integrations with macOS and iOS (Shortcuts, Siri,
| macOS Services, etc).
|
| Incidentally, it currently runs Mixtral 8x7B Instruct[2] and
| Mistral[3] models faster than any other macOS app. The
| comparison videos are with Ollama, but it generalizes well to
| almost every other macOS app I've seen that uses llama.cpp for
| inference. :)
|
| nb: Mixtral 8x7B Instruct requires an Apple Silicon Mac with at
| least 32GB of RAM.
|
| [1]: https://privatellm.app/
|
| [2]: https://www.youtube.com/watch?v=CdbxM3rkxtc
|
| [3]: https://www.youtube.com/watch?v=UIKOjE9NJU4
| sigmoid10 wrote:
| What's the performance like in tokens/s?
| woadwarrior01 wrote:
| You can see ms/token in a tiny font on the top of the
| screen, once the text generation completes in both the
| videos I'd linked to. Performance will vary by machine. On
| my 64GB M2 Mac Studio Max, I get ~47 tokens/s
| (21.06ms/token) with Mistral Instruct v0.2 and ~33 tokens/s
| (30.14ms/token) with Mixtral Instruct v0.1.
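For anyone converting between the two figures: they are just reciprocals, tokens/s = 1000 / (ms/token), which matches the numbers above:

```python
def tokens_per_second(ms_per_token: float) -> float:
    # 1000 ms in a second, so tokens/s is the reciprocal of ms/token.
    return 1000.0 / ms_per_token
```

21.06 ms/token comes out to ~47 tokens/s and 30.14 ms/token to ~33 tokens/s, as quoted.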
| castles wrote:
| Interesting! What's the prompt eval processing speed like
| compared to llama.cpp and kin?
| woadwarrior01 wrote:
| I haven't run any specific low level benchmarks, lately.
| But chunked prefilling and tvm auto-tuned Metal kernels
| from mlc-llm seemed to make a big difference, the last
| time I checked. Also, compared to stock mlc-llm, I use a
| newer version of metal (3.0) and have a few modifications
| to make models have a slightly smaller memory and disk
| footprint, also slightly faster execution. Because unlike
| the mlc-llm folks, I only care about compatibility with
| Apple platforms. They support so much more than that in
| their upstream project.
| castles wrote:
| thanks, I'll give it a crack
| iknowstuff wrote:
| MacGPT is way handy because of a global keyboard shortcut
| which opens a spotlight-like prompt. I would love to have a
| local equivalent
| vorticalbox wrote:
| have you seen llamafile[0]?
|
| [0] https://github.com/Mozilla-Ocho/llamafile
| greggsy wrote:
| Khoj was one of the first 'low-touch' solutions out there I
| think. It's ok, but still under active development, like all of
| them really.
|
| https://khoj.dev/
| quickthrower2 wrote:
| Thanks for the list. Tried Jan just now as it is both easy and
| open source. It is a bit buggy I think but the concept is ace.
| The quick install, tells you which models work on your machine,
| one click download and then a chatgpt style interface. Mistral
| 7B running on my low spec laptop at 6 token/s making some damn
| sense is amazing. The bugs show up at inference time. Could be
| hardware issues though, not sure. YMMV
| smnscu wrote:
| Nice, adding these to my list. Here's a list that I put
| together, it has active GitHub projects for LLM UIs, ordered by
| stars:
|
| - https://github.com/nomic-ai/gpt4all
|
| - https://github.com/imartinez/privateGPT
|
| - https://github.com/oobabooga/text-generation-webui
|
| - https://github.com/FlowiseAI/Flowise
|
| - https://github.com/lobehub/lobe-chat
|
| - https://github.com/PromtEngineer/localGPT
|
| - https://github.com/h2oai/h2ogpt
|
| - https://github.com/huggingface/chat-ui
|
| - https://github.com/SillyTavern/SillyTavern
|
| - https://github.com/ollama-webui/ollama-webui
|
| - https://github.com/Chainlit/chainlit
|
| - https://github.com/LostRuins/koboldcpp
|
| - https://github.com/ParisNeo/lollms-webui/
| chaxor wrote:
| What about https://github.com/open-webui/open-webui ?
|
| Seems to have more features than all of them
| bradnickel wrote:
| Love this! Just purchased. I am constantly harping on
| decentralized AI and love seeing power in simplicity.
|
| Are you on Twitter, Threads, Farcaster? Would like to tag you when
| I add you to my decentralized AI threads.
| xyc wrote:
| Thank you so much for the support! Simplicity is power indeed.
| I'm on twitter: https://x.com/chxy
| bradnickel wrote:
| Found your Twitter account in a previous post. Just tagged you.
| xyc wrote:
| Awesome, thanks for the tag!
| hanniabu wrote:
| What's your farcaster?
| xyc wrote:
| Wow, I did not expect this to end up on the front page at all.
| Thank you for all the enthusiasm, I'll try to get to more
| questions later today but if there's something I missed my
| X/twitter DM is open: https://x.com/chxy
| castles wrote:
| It seems "local" is all you need :)
| sen wrote:
| This is awesome. I currently use Ollama with OpenWebUI but am a
| big fan of native apps so this is right up my alley.
| xyc wrote:
| Thank you!
| woadwarrior01 wrote:
| It looks like an Electron app, and not a native app.
|
| https://imgur.com/a/pz0kzJ1
| toomuchtodo wrote:
| Hey! This is awesome! How hard would it be to plug it into
| something like Raindrop.io (bookmark manager) to train on all
| bookmarks collected?
| xyc wrote:
| haven't tried Raindrop.io, looks neat! Saw some other posts
| mentioning bookmarks as well. I'll keep this in mind, but
| will have to try it out first to find out.
| toomuchtodo wrote:
| Appreciate it, thank you.
| cooper_ganglia wrote:
| I read the website for 30 seconds and instantly bought it.
|
| It's clean, easy to use, and works really well! Easy local server
| hosting was cool, too. I've used the other LLM apps, and this
| feels like those, but simplified. It just feels good to use. I
| like it a lot!
|
| I'm gonna test drive it for a while, and if I keep using it
| regularly, I'll definitely be sending in some feedback. Other
| users have made a lot of really great recommendations already,
| I'm excited to see how this evolves!
| xyc wrote:
| Thanks so much for the kind words and giving it a spin!
|
| Feel free to send feedback, issues, and feature suggestions as you
| use it more, I'm all ears. My twitter DM is also open:
| https://x.com/chxy.
| madduci wrote:
| Any chance to see it available on other operating systems as
| well?
| xyc wrote:
| Unfortunately not now. If you are interested in email
| updates: https://tally.so/r/wzDvLM
| devinprater wrote:
| There's another one someone made for blind users like themselves
| and me, called VOLlama (they use a Mac, so VoiceOver + Llama).
| It's really good. I haven't tested many others for accessibility,
| but it has RAG and uses Ollama as backend, so works very well for
| me.
|
| https://github.com/chigkim/VOLlama/
| chown wrote:
| It's very nice that there exists something like that. I am an
| author of one of the similar apps [1] someone listed in a
| different thread. I was hoping I could get in touch with
| someone like you who could give me some feedback on how to make
| my app more accessible for users like you. I really want it to
| be an "LLM for all" kind of app, but despite my best efforts and
| intention, I suck at it. Any chance of getting in touch with
| you and get some feedback? Only if you want and have time, no
| pressure at all.
|
| [1] https://msty.app
| devinprater wrote:
| Sure, I'll probably join the discord tomorrow morning, but a
| few notes:
|
| * For apps like this, using live regions to speak updates may
| be helpful. Either that, or change the buttons, like from
| "download local AI" to "configuring." Maybe a live region
| would be best for that one, since sighted people would
| probably be looking near the bottom for the status bar, but
| anyway...
|
| * Using live regions for chats is pretty important, because
| otherwise we don't know when a message is ready to read, and
| it makes reading those messages much simpler. The user types
| the message, presses Enter, and the screen reader reads the
| message to them. So, making a live region, and then sending
| the finished message, or a finished part of a message, to
| that live region would be really helpful.
|
| * Now on to the UI. At the top, we have "index /text-chat-
| sessions". I guess that should just say "chats"? Below that,
| we have a list, with a button saying the same thing. After
| that list with one item is a button that says "index /local-
| ai". That should probably just be "local AI". Afterwards,
| there is "index /settings", which should just be "settings."
| Then there is an unlabeled button. I'm guessing this is
| styled to look like a menu bar across the top of the window,
| so it'd be the item on the right side. Now, there's a button
| below that that says "New Chat^N". I, being a technical user,
| am pretty sure the "^N" means "Control + N", but almost no
| one else knows that, so maybe change that text label.
| Between that and the Recent Chats menu button are two
| unlabeled buttons. I'm not sure why a region landmark was
| used for the recent chats list, but after the chat name
| ("hello" in this case), where I can rename the chat, there is
| an unlabeled button. The button after the model chooser is
| unlabeled as well. After the user input in the conversation,
| there are three unlabeled buttons. After the response, there
| is a menu button with (oh, that's cool) items to transform
| the response into bullets, a table, etc., but that menu
| button was unlabeled, so I had to open it to see what's
| inside. After that, all other buttons, like for adding
| instructions to refine this message, are also unlabeled.
|
| So, live regions for speaking chat messages and state changes
| like "loading" or "ready" or whatever (keep them short), and
| label controls, and you should be good to go.
|
| Live regions: https://developer.mozilla.org/en-
| US/docs/Web/Accessibility/A...
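The live-region suggestions above amount to a small bit of markup; a minimal sketch (the IDs and labels here are made up, not the app's actual markup):

```html
<!-- A polite live region: screen readers announce any text placed
     here without moving focus. Append each finished assistant
     message (or finished chunk of a message) as it completes. -->
<div aria-live="polite" id="chat-announcer"></div>

<!-- Short status changes can reuse the same pattern: -->
<div aria-live="polite" id="status-announcer">Loading model...</div>

<!-- And icon-only buttons need accessible labels: -->
<button aria-label="New chat (Control+N)">...</button>
```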
| chown wrote:
| Wow! This is already very helpful and was the kind of
| feedback I was looking for. Thank you!
| indit wrote:
| Hi, I just use msty. Could it use an already downloaded gguf
| file?
| chown wrote:
| Not right now but that's something we plan to support soon.
| Supporting Ollama downloaded models is getting released
| either today or tomorrow, gguf support might go into the
| next release. Would love to chat with you to learn more
| about your use case. Mind saying hi on our Discord?
| karolist wrote:
| Hey. I'm sorry about your condition. I feel I'm approaching
| blindness eventually. This is very random, but perhaps you
| could share any resources I could use to prepare, so that I
| can continue using the web when/if it happens.
| devinprater wrote:
| I'll try. To get things started, if you have an iPhone, check
| out AppleVis:
|
| https://applevis.com/
|
| If you have Android:
|
| https://blindandroidusers.com/
|
| I believe Hadley is still a good resource:
| https://hadleyhelps.org/welcome-hadley
|
| I hope this helps get you started.
| SkepticMystic wrote:
| I've found great utility with `llm` https://llm.datasette.io, a
| CLI to interact with LLMs. It has plugins for remote and local
| models.
| xyc wrote:
| Good to know. I've learned lots of things from Simon Willison's
| blog (Datasette's author), so I can't imagine llm not being useful.
| geniium wrote:
| I am very glad to see that kind of app. Well done!
| rkuodys wrote:
| Honest question - can it be used for programming? Or can anyone
| recommend a local-first development LLM that would take in a whole
| project (Python / Angular) and write code based on the full repo,
| not only the active window, as with Copilot or JetBrains AI?
| arzke wrote:
| Have you tried using Copilot's @workspace command in the chat?
| _ink_ wrote:
| Check out the continue dev plugin (available for VS Code and
| Jetbrains). You can attach it to OpenAI or local models and it
| can consider files in your codebase. It has a @Codebase
| keyword, but so far I get better results by specifically
| pointing to the needed files.
| surrTurr wrote:
| any plans on supporting ollama integration?
| code51 wrote:
| Thank you for the work.
|
| Please take this in a nice way: I can't see why I would use this
| over ChatbotUI+Ollama https://github.com/mckaywrigley/chatbot-ui
|
| Seems the only advantage is having it as a macOS native app, and
| the only real distinction is maybe fast import and search - I've
| yet to try that though.
|
| ChatbotUI (and other similar stuff) are cross-platform,
| customizable, private, debuggable. I'm easily able to see what
| it's trying to do.
| ayhoung wrote:
| Not everyone is a dev
| Alifatisk wrote:
| HN users keep forgetting that
| vood wrote:
| Thanks for sharing ChatbotUI. While I'm not an author, I use it
| extensively and contribute to it. Thanks to the permissive
| license, I could offer ChatbotUI as a hosted solution with our
| API keys. https://labs.writingmate.ai.
| 911e wrote:
| Not a bit of open code, while I'm 100% sure they use some that
| requires it. If you're using AI + your data without insight into
| how it's used, you're a fool. 2 cents
| tartrate wrote:
| > Full Text Search. Blazingly fast search over thousands of
| messages.
|
| Natural language processing has come full circle and just
| reinvented Ctrl+F.
|
| I had to double check that a regular '90s search function was
| actually the thing being advertised here, and sure enough, there
| is a gif demonstrating exactly that.
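For what it's worth, the speed claim is plausible: SQLite's FTS5 (a common choice for this kind of feature, though I don't know what RecurseChat actually uses) searches thousands of rows instantly. A minimal sketch with a made-up schema:

```python
import sqlite3

# In-memory demo of SQLite FTS5 full-text search over chat messages.
# The schema is hypothetical, purely for illustration.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE messages USING fts5(role, content)")
db.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [
        ("user", "How do I run a local llama model?"),
        ("assistant", "Download a gguf file and point llama.cpp at it."),
        ("user", "What about image generation?"),
    ],
)
# MATCH does tokenized full-text search; ORDER BY rank sorts by relevance.
hits = db.execute(
    "SELECT content FROM messages WHERE messages MATCH ? ORDER BY rank",
    ("gguf",),
).fetchall()
print(hits)  # -> [('Download a gguf file and point llama.cpp at it.',)]
```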
| addandsubtract wrote:
| Ctrl+F only gets you so far. It doesn't allow you to perform
| semantic searches, for example. If you don't happen to know a
| unique word (or set of words) to search for, you're out of
| luck.
|
| Just the other day, I was able to find a song by typing the
| phonetic pronunciation (well, as best I could) into ChatGPT,
| and it knew which song I was talking about right away. No way a
| regular search engine would've helped me there.
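The difference is easy to see in miniature: embedding search ranks by vector similarity rather than shared keywords. A toy sketch with made-up vectors (a real app would get them from an embedding model):

```python
import math

# Toy semantic search: the query shares no keywords with any document,
# but its (fabricated) embedding is closest to the right one.
docs = {
    "refund policy": [0.9, 0.1, 0.0],
    "song lyrics": [0.1, 0.9, 0.2],
    "billing question": [0.6, 0.3, 0.3],
}
query = [0.85, 0.15, 0.05]  # pretend embedding of "get my money back"

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

best = max(docs, key=lambda d: cosine(query, docs[d]))
print(best)  # -> refund policy
```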
| danielovichdk wrote:
| No. Your own data only gets you so far. And this is exactly
| the issue. No local model will make sense because the dataset
| it's given is so small compared to what you are referring to -
| chatgpt.
|
| It's useless locally.
| behnamoh wrote:
| and yet ChatGPT doesn't support it.
| davely wrote:
| Yeah, I think the call out here is specifically because the
| ChatGPT interface doesn't have a search feature (on web).
| Interestingly, on their iOS app, you can search.
|
| I often find myself opening the app on my phone if I want to
| find a previous conversation, even if I'm at my desk.
| ggerganov wrote:
| > Thanks to the amazing work of @ggerganov on llama.cpp which
| made this possible. If there is anything that you wish to exist
| in an ideal local AI app, I'd love to hear about it.
|
| The app looks great! Likewise, if you have any requests or ideas
| for improving llama.cpp, please don't hesitate to open an issue /
| discussion in the repo
| petargyurov wrote:
| Did not expect to see _the_ Georgi Gerganov here :) How is GGML
| going?
|
| Pozdravi!
| ggerganov wrote:
| So far it's going great! Good community, having fun. Many ideas
| to explore :-)
| xyc wrote:
| Oh wow it's the goat himself, love how your work has
| democratized AI. Thanks so much for the encouragement. I'm
| mostly a UI/app engineer, total beginner when it comes to
| llama.cpp, would love to learn more and help along the way.
| duckkg5 wrote:
| Nothing to add except that your work is tremendous
| titaniumtown wrote:
| Wow I've been following your work for a while, incredible
| stuff! Keep up the hard work, I check llama.cpp's commits and
| PRs very frequently and always see something interesting in the
| works (the alternative quantization methods and Flash Attention
| have been interesting).
| jiriro wrote:
| Out of curiosity - how is this app built?:-)
|
| There is a demo clip with a vertical scroll bar which does not
| fade out as it would do in a native mac app:)
| rangera wrote:
| Scroll bars don't fade out if you're using a mouse (as opposed
| to just a trackpad) or if you've set Mac OS Settings >
| Appearance > Show scroll bars to "Always".
| jiriro wrote:
| I see! I've not used mouse on a mac:-o
|
| Anyway, the UI doesn't look Mac-native. I'm interested in what it
| is:-)
| Alifatisk wrote:
| Yeah I am curious what the app is built with. I saw someone
| mention it's using Electron, so that's a start.
| SushiHippie wrote:
| Even with a screenshot
|
| https://news.ycombinator.com/item?id=39535755
| zzz999 wrote:
| Any censorship?
|
| (Can't try MacOS Apps)
| famahar wrote:
| Will this work on an M1 Mac Book Air? Looking for an offline
| solution like this but wary of hardware requirements.
| konschubert wrote:
| I want something that starts as a simple manager for my
| reminders, something that tells me what to do next. And then, as
| features are being added, grows into a full-blown personal
| assistant that can book flights for me.
| stuckkeys wrote:
| No iPhone app? Is it meant to connect to a local server, or
| are you actually downloading the LLMs locally to the device?
| domano wrote:
| Hey, i bought it, nice work!
|
| A few things:
|
| * The main thing that makes ChatGPTs ui useful to me is the
| ability to change any of my prompts in the conversation & it will
| then go back to that part of the conversation and regenerate,
| while removing the rest of the conversation after that point.
|
| Such a chat ui is not usable for me without this feature.
|
| * The feedback button does nothing for me, just changes focus to
| chrome.
|
| * The LLaVA model tells me that it can not generate images since
| it is a text based AI model. My prompts were "Generate an image
| of ..."
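The first request above is essentially history truncation; a sketch of how a chat app might implement it (the role/content message shape is the common convention, not RecurseChat's actual data model):

```python
# "Edit a prompt mid-conversation": keep everything before the edited
# message, substitute the new text, drop everything after, then ask
# the model for a fresh completion on the truncated history.
def edit_and_regenerate(messages: list[dict], index: int, new_text: str) -> list[dict]:
    assert messages[index]["role"] == "user", "only user prompts are editable"
    truncated = messages[:index] + [{"role": "user", "content": new_text}]
    # ...send `truncated` to the model here to regenerate the reply.
    return truncated

history = [
    {"role": "user", "content": "Write a haiku about rain"},
    {"role": "assistant", "content": "(haiku)"},
    {"role": "user", "content": "Now make it about snow"},
    {"role": "assistant", "content": "(haiku)"},
]
edited = edit_and_regenerate(history, 2, "Now make it a limerick")
print(len(edited), edited[-1]["content"])  # -> 3 Now make it a limerick
```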
| wodow wrote:
| > * The main thing that makes ChatGPTs ui useful to me is the
| ability to change any of my prompts in the conversation & it
| will then go back to that part of the conversation and
| regenerate, while removing the rest of the conversation after
| that point.
|
| Agreed, but what I would _also_ really like (from this and
| ChatGPT) would be branching: take a conversation in two
| different ways from some point and retain the separate and
| shared history.
|
| I'm not sure what the UI should be. Threads? (like mail or
| Usenet)
| shanusmagnus wrote:
| 1000 upvotes for you. My brain can't compute why someone
| hasn't made this, along with embeddings-based search that
| doesn't suck.
| FredPret wrote:
| I bet UI and UX innovation will follow, but model quality
| is the most important thing.
|
| If I were OpenAI, I would put 95% of resources into ChatGPT5, and
| 5% into UX.
|
| Once the dust settles, if humanity still exists, and human
| customers are still economically relevant, AI companies
| will shift more resources to UX.
| rhaps0dy wrote:
| They did make it, in 2021.
| https://generative.ink/posts/loom-interface-to-the-
| multivers... (click through to the GitHub repo and check
| the commit history, the bulk of commits is at least 3 years
| old)
| ItsMattyG wrote:
| ChatGPT does this. You just click an arrow and it will show
| you other branches.
| ApolloFortyNine wrote:
| I have ChatGPT4, I have no idea what arrow you are talking
| about. Could you be more specific? I see no arrow on any
| of my previous messages or current ones.
| wodow wrote:
| By George, ItsMattyG is right! After editing a question
| (with the "stylus"/pen icon), the revision number counter
| that appears (e.g. "1 / 2") has arrows next to it that
| allow forward and backward navigation through the new
| branches.
|
| This was surprisingly undiscoverable. I wonder if it's
| documented. I couldn't find anything from a quick look at
| help.openai.com .
| xyc wrote:
| Nice suggestion! Threading / branching won't be too crazy to
| support. I'll explore ChatGPT style branch or threads and see
| what'll work better.
| pps wrote:
| > The LLaVA model tells me that it can not generate images
| since it is a text based AI model.
|
| Because it can't generate images, it can only describe images
| provided by the user.
| xyc wrote:
| Thank you for the support and the valuable feedback! Sorry
| about the response time, I hadn't expected the incoming volume
| of requests.
|
| * For changing prompt in the middle - I'll take a crack at it
| this week. It's at the top of my post-launch list.
|
| * Feedback button: Thanks for reporting this. The button was
| supposed to open default email client to email
| feedback@recurse.chat
|
| * LLaVA model: I'll add more documentation. You're right, LLaVA
| can't generate images; it can only describe them (similar
| to GPT-4V). Image generation isn't supported in the
| app, and while I don't have immediate plans for it,
| check out these projects for local image generation.
|
| - https://diffusionbee.com/
|
| - https://github.com/comfyanonymous/ComfyUI
|
| - https://github.com/AUTOMATIC1111/stable-diffusion-webui
| bberenberg wrote:
| There are a lot of tools listed in this thread, but I am not
| seeing the thing I want which is:
|
| - Ability to use local and OpenAI models (ideally it has defaults
| for common local models)
|
| - Chat UX
|
| - Where I can point it to my JS/TS codebase
|
| - It indexes the whole thing including dependencies for RAG.
| Ideally indexing has some form of awareness of model context
| length.
|
| - I can use it for codegen / debugging.
|
| The closest I have found has been aider, but it's Python and I
| get into general Python hell every time I try to run it.
|
| Would appreciate a suggestion.
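On the context-length-aware indexing point: the core of it can be sketched in a few lines (the whitespace token count and the budget number are crude stand-ins; a real tool would use the target model's tokenizer):

```python
# Sketch: split source text into chunks that fit a model's context
# budget. Counting tokens by whitespace split is a rough estimate.
def chunk_file(text: str, max_tokens: int = 256) -> list[str]:
    chunks, current, count = [], [], 0
    for line in text.splitlines():
        n = len(line.split()) or 1  # cheap per-line token estimate
        if count + n > max_tokens and current:
            chunks.append("\n".join(current))
            current, count = [], 0
        current.append(line)
        count += n
    if current:
        chunks.append("\n".join(current))
    return chunks

# 1000 lines of 4 "tokens" each, chunked under a 120-token budget:
source = "\n".join(f"const x{i} = {i};" for i in range(1000))
chunks = chunk_file(source, max_tokens=120)
print(len(chunks), max(len(c.split()) for c in chunks))  # -> 34 120
```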
| howmayiannoyyou wrote:
| Without Apple Shortcuts support I can't pay for this. I get
| pretty much the same experience from GPT4All. Hoping you add
| support for a CLI, Shortcuts, or something along those lines.
| ferfumarma wrote:
| Is the haiku example a real Haiku?
|
| I think it gives you 4, 7, and 9 syllables in the lines.
|
| I bet you can coax it to give you a better example, if you tinker
| a bit.
| boringg wrote:
| This looks interesting -- might implement it. I'm curious how to
| ensure that it is local only?
| gnomodromo wrote:
| I wonder how much space it takes.
| machiaweliczny wrote:
| Will it work fine on Macbook Air M2 16GB ?
| chaxor wrote:
| This looks fantastic on macos. I like the project.
|
| What does this have that is better than https://github.com/open-
| webui/open-webui ?
| xyst wrote:
| I'll give it a shot. Appreciate the effort on keeping it local.
| maxfurman wrote:
| Won't work on my Intel Macbook :-(
| jedberg wrote:
| The app is great but honestly I'm impressed with the home page!
| Can you go into more details on how you made the home page? What
| did you use to make the screenshots, and are you using any tools
| to generate the HTML/CSS/etc?
| mvdtnz wrote:
| Seriously? It grinds my phone to a near halt just trying to
| scroll from top to bottom. Worse in Firefox but still pretty
| bad in chrome.
| jedberg wrote:
| Interesting. I was using my PC to view it and it was fast and
| beautiful.
| xyc wrote:
| Thanks! Honestly it's a quick hack compared to the app.
| Screenshots are from screen.studio; the website is built with
| https://astro.build
| matthewmcg wrote:
| The headline had me thinking you had a DIY self-driving car for a
| moment there. Didn't initially register that this was just the
| common metaphor. Looks like a great app.
| bonestamp2 wrote:
| You will sell more if instead of telling us it's for "chatting
| with local AI" you tell us what we can accomplish by chatting
| with local AI. I don't need to chat, I need to get certain tasks
| done. What tasks can it do? (Don't answer me, put it on your
| landing page and app store listing)
| brigleb wrote:
| Cool, instant buy for me. A few little suggestions:
|
| - Make the system font (San Francisco) an option for the UI.
| Maybe even SF Mono as an option as well?
|
| - A little more help about which model to use for beginners would
| be nice. Maybe just an intro screen telling you how to get going.
|
| - Would be great if Command-comma opened settings, like most Mac
| apps.
|
| - Would be great if clicking web links opened Safari (or my
| preferred browser), rather than a small window that loads
| nothing!
| xyc wrote:
| Thank you! and thanks so much for the feature suggestions:
|
| - Make the system font (San Francisco) an option for the UI.
| Maybe even SF Mono as an option as well?
|
| Reasonable request! Won't be too hard to add
|
| - A little more help about which model to use for beginners
| would be nice. Maybe just an intro screen telling you how to
| get going.
|
| Yes, a better onboarding wizard would definitely make this easier
| for beginners. Don't have much capacity right now, but I'll
| keep this in mind.
|
| - Would be great if Command-comma opened settings, like most
| Mac apps.
|
| Nice suggestion. Will probably get to this when I add some
| keyboard shortcuts like new chat / search etc.
|
| - Would be great if clicking web links opened Safari (or my
| preferred browser), rather than a small window that loads
| nothing!
|
| Ah that's odd, it's supposed to open the link. Which link do
| you have, if you don't mind sharing? (Feel free to email
| support@recurse.chat)
| k2enemy wrote:
| It would be cool to have the option to use the OpenAI API as well
| in the same interface. http://jan.ai does this, so that's what
| I'm using at the moment.
| belgriffinite wrote:
| Ew, Mac only??? This looks awesome but now I'm bummed.
___________________________________________________________________
(page generated 2024-02-28 23:01 UTC)