[HN Gopher] Show HN: Khoj - Chat offline with your second brain ...
       ___________________________________________________________________
        
       Show HN: Khoj - Chat offline with your second brain using Llama 2
        
       Hi folks, we're Debanjum and Saba. We created Khoj as a hobby
       project 2+ years ago because: (1) search on the desktop sucked;
       we had only keyword search locally vs Google for the internet;
       and (2) natural language search models had become good and easy
       to run on consumer hardware by that point. Once we made Khoj
       search incremental, I completely stopped using the default
       incremental search (C-s) in Emacs. Since then Khoj has grown to
       support more content types, deeper integrations and chat (using
       ChatGPT). With Llama 2 released last week, chat models are
       finally good and easy enough to run on consumer hardware for the
       chat-with-docs scenario.
       Khoj is a desktop application to search and chat with your
       personal notes, documents and images. It is accessible from
       within Emacs, Obsidian or your web browser. It works with
       org-mode, markdown, PDF and JPEG files, and with Notion and
       GitHub repositories. It is open-source and can work without
       internet access (e.g. on a plane). Our chat feature allows you
       to extract answers and create content from your existing
       knowledge base. Example: _"What was that book Trillian mentioned
       at Zaphod's birthday last week?"_. We personally use the chat
       feature regularly to find links, names and addresses (especially
       on mobile) and to collate content across multiple, messy notes.
       It works online or offline: you can chat without internet using
       Llama 2 or with internet using GPT3.5+, depending on your
       requirements. Our search feature lets you quickly find relevant
       notes, documents or images using natural language. It does not
       use the internet. Example: searching for _"bought flowers at
       grocery store"_ will find notes about _"roses at wholefoods"_.
       Quickstart:
       
           pip install khoj-assistant && khoj
       
       See https://docs.khoj.dev/#/setup for detailed instructions. We
       also have desktop apps (in beta) at
       https://github.com/khoj-ai/khoj/releases/tag/0.10.0 if you want
       to try them out. Please do try out Khoj and let us know if it
       works for your use cases.
       _Looking forward to the feedback!_
        
       Author : 110
       Score  : 210 points
       Date   : 2023-07-30 17:14 UTC (5 hours ago)
        
        
       | mlajtos wrote:
       | Has anyone gotten something valuable from talking to their
       | second brain? What kind of conversations are you trying to
       | have?
        
         | bozhark wrote:
         | Traumatic Brain Injury. I can't remember yesterday.
         | 
         | Would be hella nice to connect all the scattered lines of
         | thoughts in various notes on a variety of platforms.
        
           | andai wrote:
           | If you're on Mac I would strongly recommend Notational
           | Velocity (or the Alt version), if they still run (I know
           | Apple likes to break compatibility).
           | 
           | I've tried dozens of notetaking apps and that's the only one
           | that truly felt like a second brain.
           | 
           | It's because of the speed. Infuriatingly, Obsidian for
           | example can search just as fast, but they intentionally
           | programmed in a lag after each keystroke... (I know because I
           | removed it.)
        
             | mandmandam wrote:
             | Dear Lord, why would they do such a thing. I think I've
             | experienced this, and decided I hated Obsidian because it
             | made my computer feel slow (it's not).
        
             | nmarinov wrote:
             | > they intentionally programmed in a lag after each
             | keystroke
             | 
             | Yeah, it seems they've added a debounce. I'd prefer to
             | set it to 0ms as well. Do you remember how you removed it?
        
           | mlajtos wrote:
           | I am sorry.
           | 
           | Would a summary of the previous day be helpful to you?
           | Is your memory problem only episodic, or does it extend to
           | factual and kinesthetic as well?
        
             | samstave wrote:
             | I want a body cam that I wear and it transcribes into
             | something searchable from things I did...
             | 
             | Basically like a gopro on steroids with searchable
             | context - or even the ability for me to say out loud
             | "KEEP A NOTE OF THIS" and it will keep a segment tagged
             | and can give me summaries of moments I wanted
             | particularly logged...
             | 
             | I applied to YC with an idea 'sorta' like this almost a
             | decade ago.
             | 
             | The idea was to have a timeline of communications between
             | all my contacts such that I could side-scroll a timeline
             | with dots of actions such as "sent email", "made call",
             | "sent text", "received text", and I could see all these
             | in filters by contacts/day whatever...
             | 
             | This was pre-Snowden, so I didn't have confirmation that
             | there were already people doing this for me, just not
             | letting me browse my own data ;-)
        
               | sabakhoj wrote:
               | I quite like this concept. It would be neat if you could
               | relay the data to a personal server for processing and
               | insight extraction. Seems feasible with a phone camera.
               | I think gopros would be limited based on battery life
               | (in my experience).
        
               | andai wrote:
               | Yeah, it bugs me that I don't know where I was a year ago
               | but my phone company does.
               | 
               | Can I get that via GDPR? Has anyone tried?
               | 
               | For Android users a more straightforward option is
               | location history, but you should probably turn that off.
        
           | 110 wrote:
           | Wow that's an intense use-case. I don't know how but we'd
           | love to be able to support this.
           | 
           | If you can collate your notes into markdown or some such,
           | then messy notes can be handled, at least using Khoj with
           | GPT3.5+.
           | 
           | Do let us know how we can help out and what your current
           | biggest pain-points are.
        
           | andai wrote:
           | If you're on Windows check out TimeSnapper. The classic
           | version is free and works fine.
           | 
           | It screencaps your desktop every 5 sec so you can watch a
           | timelapse of how you spent your day. (Assuming it was on the
           | computer!)
           | 
           | I did find it heavy on the disk usage, so I wrote an ffmpeg
           | script to convert it to video (much more efficient).
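           | 
           | Something like this gets you most of the way there (the
           | folder layout and frame rate are just examples, not
           | TimeSnapper's actual output format):
           | 
           |     import subprocess
           | 
           |     # Stitch a day's worth of PNG screenshots into one
           |     # H.264 timelapse; assumes captures live in ./snaps
           |     subprocess.run([
           |         "ffmpeg",
           |         "-framerate", "12",        # 12 captures per second
           |         "-pattern_type", "glob", "-i", "snaps/*.png",
           |         "-c:v", "libx264",
           |         "-pix_fmt", "yuv420p",     # broad player support
           |         "timelapse.mp4",
           |     ], check=True)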
        
       | spdustin wrote:
       | I see you're using gpt4all; do you have a supported way to change
       | the model being used for local inference?
       | 
       | A number of apps that are designed for OpenAI's completion/chat
       | APIs can simply point to the endpoints served by llama-cpp-python
       | [0], and function in (largely) the same way, while using the
       | various models and quants supported by llama.cpp. That would
       | allow folks to run larger models on the hardware of their choice
       | (including Apple Silicon with Metal acceleration or NVIDIA GPUs)
       | or using other proxies like openrouter.io. I enjoy openrouter.io
       | myself because it supports Anthropic's 100k models.
       | 
       | [0]: https://github.com/abetlen/llama-cpp-python
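       | 
       | As a rough sketch (model path and port are assumptions), after
       | starting the server with `python -m llama_cpp.server --model
       | <path-to-ggml-model>`, an OpenAI-style app only needs its base
       | URL swapped:
       | 
       |     import openai
       | 
       |     # Point the stock OpenAI client at the local
       |     # llama-cpp-python server instead of api.openai.com
       |     openai.api_base = "http://localhost:8000/v1"
       |     openai.api_key = "unused-locally"  # ignored by the server
       | 
       |     reply = openai.ChatCompletion.create(
       |         model="local",  # the model is fixed at server startup
       |         messages=[{"role": "user",
       |                    "content": "Summarize my notes on Llama 2."}],
       |     )
       |     print(reply["choices"][0]["message"]["content"])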
        
         | 110 wrote:
         | No, not yet. Lots of developer folks want to try different
         | models, but we want to provide simple-to-use yet deep
         | assistance. We're kind of unsure what to focus on given our
         | limited resources.
        
           | vunderba wrote:
           | I really like the idea of running a dedicated server that
           | serves up various large language models via a standardized
           | API, and then Khoj could just be pointed at one. Depending on
           | the notes and the type of conversation I want to have, that
           | would even allow for Khoj to swap models on the fly.
        
         | syntaxing wrote:
         | The point of gpt4all is that you can change the model with
         | minimal breakage. You should be able to change this line
         | https://github.com/khoj-ai/khoj/blob/master/src/khoj/process...
         | to the model you want. You'll need to build your own local
         | image with docker-compose, but it should be relatively
         | straightforward.
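         | 
         | If you want to sanity-check a different model outside Khoj
         | first, the gpt4all Python bindings make that a two-liner (the
         | model filename here is just an example):
         | 
         |     from gpt4all import GPT4All
         | 
         |     # Any chat model gpt4all can load should slot in here
         |     model = GPT4All("llama-2-7b-chat.ggmlv3.q4_0.bin")
         |     print(model.generate("Summarize Llama 2 in one line.",
         |                          max_tokens=128))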
        
       | LanternLight83 wrote:
       | It's funny that you mention `C-s`, because `isearch-forward` is
       | usually used for low-latency literal matches. In what workflow
       | can Khoj offer acceptable latency or superior utility as a drop-
       | in replacement for isearch? Is there an example of how you might
       | use it to navigate a document?
        
         | 110 wrote:
         | That's (almost) exactly what Khoj search provides: a search-
         | as-you-type experience, but with a natural language (instead
         | of keyword) search interface.
         | 
         | My workflow looks like:
         | 
         | 1. Search with Khoj search[1]: `C-c s s` <search-query> RET
         | 
         | 2. Use speed keys to jump to the relevant entry[2]: `n n o 2`
         | 
         | [1]: `C-c s` is bound to the `khoj` transient menu
         | 
         | [2]: https://orgmode.org/manual/Speed-Keys.html
        
       | umanwizard wrote:
       | Markdown doesn't work on HN...
        
       | asynchronous wrote:
       | This is very cool, the Obsidian integration is a neat feature.
       | 
       | Please, someone make a home-assistant Alexa clone for this.
        
         | 110 wrote:
         | Thanks!
         | 
         | We've just been testing voice and WhatsApp integrations over
         | the last few days[1][2] :)
         | 
         | [1]: https://github.com/khoj-ai/khoj/tree/khoj-chat-over-whatsapp...
         | 
         | [2]: https://github.com/khoj-ai/khoj/compare/master...features/wh...
        
       | jigneshdarji91 wrote:
       | This would be even better if it were available as a Spotlight
       | Search replacement (with some additional features that
       | Spotlight supports).
        
         | tough wrote:
         | Should be easy to plug it in with a Raycast.app or Alfred.app
         | plugin.
        
       | IshKebab wrote:
       | Interesting. The obvious question you haven't answered anywhere
       | (as far as I can see) is what are the hardware requirements to
       | run this locally?
        
         | 110 wrote:
         | Ah, you're right, forgot to mention that. We use the Llama 2
         | 7B 4-bit quantized model. The machine requirements are:
         | 
         | Ideal: 16 GB (GPU) RAM
         | 
         | Less ideal: 8 GB RAM and CPU-only
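         | 
         | The rough math, if it helps (back-of-envelope, not measured):
         | 
         |     # 7B parameters at 4 bits (0.5 bytes) per weight:
         |     weights_gb = 7e9 * 0.5 / 1e9  # ~3.5 GB for weights alone
         |     # The KV cache, activations, the search embedding model
         |     # and the OS roughly double that in practice, hence 8 GB
         |     # being workable and 16 GB comfortable.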
        
       | RomanHauksson wrote:
       | Awesome work, I've been looking for something like this. Any
       | plans to support Logseq in the future?
        
         | 110 wrote:
         | Yes, we hope to get to it soon! This has been an ask on our
         | GitHub for a while[1].
         | 
         | [1]: https://github.com/khoj-ai/khoj/issues/141
        
       | ramesh31 wrote:
       | Something I've noticed playing around with Llama 7b/13b on my
       | Macbook is that it clearly points out just how little RAM 16GB
       | really is these days. I've had a lot of trouble running both
       | inference and a web UI together locally when browser tabs take up
       | 5GB alone. Hopefully we will see a resurgence of lightweight
       | native UIs for these things that don't hog resources from the
       | model.
        
         | Kwpolska wrote:
         | Or hopefully we will see an end of the LLM hype.
         | 
         | Or at least models that don't hog so much RAM.
        
           | ramesh31 wrote:
           | >Or at least models that don't hog so much RAM
           | 
           | The RAM usage is kind of the point though; we're trading
           | space for time. It's not a problem that the model is using
           | it, it's just that with the default choice for UI being web
           | based now, the unnecessary memory usage of browsers is
           | actually starting to be a real pain point.
        
             | 110 wrote:
             | 1. I hear you on going back to lightweight native apps.
             | Unfortunately the Python ecosystem is not great for this.
             | We use pyinstaller to create the native desktop app but
             | it's a pain to manage.
             | 
             | 2. The web UI isn't required if you use Obsidian or Emacs.
             | That's just a convenient, generic interface that everyone
             | can use.
        
       | overnight5349 wrote:
       | Could this do something like take in the contents of my web
       | history for the day and summarize notes on what I've been
       | researching?
       | 
       | This is getting very close to my ideal of a personal AI. It's
       | only gonna be a few more years until I can have a digital brain
       | filled with everything I know. I can't wait
        
         | usehackernews wrote:
         | Interesting, this is the exact question that came to mind for
         | me. This would address a pain point for me.
         | 
         | Does anyone have recommendations for a tool that does it?
         | 
         | Or, anyone want to build it together?
        
       | agg23 wrote:
       | Just a heads up: the landing page on your website doesn't seem
       | to mention Llama/the offline use case at all, only online via
       | OpenAI.
       | 
       | ----
       | 
       | What model size/particular fine-tuning are you using, and how
       | have you observed it to perform for the use case? I've only
       | started playing with Llama 2 at 7B and 13B sizes, and I feel
       | they're awfully RAM-heavy for consumer machines, though I'm
       | really excited by this possibility.
       | 
       | How is the search implemented? Is it just an embedding and vector
       | DB, plus some additional metadata filtering (the date commands)?
        
         | 110 wrote:
         | Thanks for the pointer, yeah the website content has gone
         | stale. I'll try to update it by end of day.
         | 
         | Khoj is using the Llama 7B, 4-bit quantized, GGML by
         | TheBloke.
         | 
         | It's actually the first offline chat model that gives
         | coherent answers to user queries given notes as context.
         | 
         | And it's interestingly more conversational than GPT3.5+,
         | which is much more formal.
        
           | agg23 wrote:
           | Oh interesting, so you're not using Llama 2, you're using the
           | original. Have you begun to evaluate Llama 2 to determine the
           | differences in performance?
           | 
           | How are you determining what notes (or snippets of notes?)
           | to inject as context? Especially given the small 2048-token
           | context limit with Llama 1.
        
             | sabakhoj wrote:
             | Quick clarification, we _are_ using Llama V2 7B. We
             | didn't experiment with Llama 1 because we weren't sure of
             | the licensing limitations.
             | 
             | We determine note relevance by using cosine similarity
             | between the query and the knowledge base (your note
             | embeddings). We limit the context window for Llama2 to 3
             | notes (while OpenAI might comfortably take up to 9). The
             | notes are ranked based on most to least similar and
             | truncated based on the context window limit. For the model
             | we're using, we're still limited to 2048 tokens for Llama
             | v2.
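             | 
             | In illustrative Python (not our actual code), the ranking
             | step is essentially:
             | 
             |     import numpy as np
             | 
             |     def top_notes(query_emb, note_embs, k=3):
             |         # Cosine similarity between the query embedding
             |         # and every note embedding
             |         sims = note_embs @ query_emb / (
             |             np.linalg.norm(note_embs, axis=1)
             |             * np.linalg.norm(query_emb))
             |         # Most to least similar, truncated to the k notes
             |         # that fit the model's context window
             |         return np.argsort(-sims)[:k]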
        
           | jmorgan wrote:
           | This is a super cool project. Congrats! If you're looking
           | at trying different models with one API, check out an open-
           | source project a few folks and I have been working on in
           | July, in case it's helpful:
           | https://github.com/jmorganca/ollama
           | 
           | Llama 2 gives great answers, even the 7B model. There's an
           | "uncensored" 7B version as well that George Sung has fine-
           | tuned for topics that the default Llama 2 model won't
           | discuss - e.g. I had trouble having Llama 2 review
           | authentication/security code or related topics:
           | https://huggingface.co/TheBloke/llama2_7b_chat_uncensored-GG...
           | 
           | From just playing around with it, the uncensored model
           | still seems to know where to "draw the line" on sensitive
           | topics, but YMMV.
           | 
           | If you do end up checking out Ollama, you can try it with
           | this command (there's an API too; it's not in the docs
           | yet):
           | 
           |     ollama run llama2-uncensored
        
       | matmulbro wrote:
       | [flagged]
        
         | isoprophlex wrote:
         | Please don't post low effort, shallow dismissals; without
         | substantiation you're not posting anything useful, you're just
         | a loud asshole.
        
       | coder543 wrote:
       | This seems like a cool project.
       | 
       | It would be awesome if it could also index a directory of PDFs,
       | and if it could do OCR on those PDFs to support indexing scanned
       | documents. Probably outside the scope of the project for now,
       | but just the other day I was thinking how nice it would be to
       | have a tool like this.
        
         | 110 wrote:
         | Yeah being able to search and chat with PDF files is quite
         | useful.
         | 
         | Khoj can index a directory of PDFs for search and chat. But
         | it does not currently work with scanned PDF files (i.e. ones
         | without selectable text).
         | 
         | Being able to work with those would be awesome. We just need
         | to get to it. Hopefully soon!
        
           | adr1an wrote:
           | Check out pdftotext - it's a CLI tool (maybe a library too)
           | that makes PDF text selectable. Oh sorry, I meant to say
           | ocrmypdf. But hey, maybe it's worth checking both.
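           | 
           | If it helps, a minimal ocrmypdf sketch (file names made
           | up):
           | 
           |     import ocrmypdf
           | 
           |     # Adds a selectable text layer to a scanned PDF so a
           |     # tool like Khoj can index it; pages that already have
           |     # text are skipped.
           |     ocrmypdf.ocr("scanned.pdf", "searchable.pdf",
           |                  skip_text=True)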
        
         | samstave wrote:
         | I've wanted a crawler on my machine for auto-categorizing,
         | organizing, tagging and moving ALL my files around across all
         | my machines - so the ability to crawl PDFs, downloads,
         | screenshots, pictures, etc. and give me a logical tree of the
         | org of the files - and allow me to modify it by saying "add
         | all PDFs related to [subject] here and then organize by
         | source/author etc." ... and then move all my screenshots,
         | ordered by date, here
         | 
         | etc...
         | 
         | I've wanted a "COMPUTER.", uh... I say "COMPUTER!", ' _sir, you
         | have to use the keyboard_ ', ah a Keyboard, how quaint....
         | forever.
        
           | 110 wrote:
           | That.would.be.awesome! Khoj isn't there yet, but that
           | actually shouldn't be too far away if you give it a voice
           | interface and terminal access.
           | 
           | Of course, having it be stable enough to not `rm -rf /`
           | soon after is definitely not part of the warranty.
        
       | mmanfrin wrote:
       | As someone who's been getting into using Obsidian and messing
       | around with chat AIs, this is excellent, thank you!
        
         | tarwin wrote:
         | Really encourages me to move to Obsidian :D
        
         | 110 wrote:
         | Thanks! Do try it out and let us know if it works for your
         | use-case!
        
       | Ilnsk wrote:
       | [flagged]
        
         | tudorw wrote:
         | Hi, you seem keen to share something neat you took less than
         | 10 minutes to implement - I'd love to see that.
        
         | 110 wrote:
         | 2.5 years! We're kind of slow :P
        
       | calnayak wrote:
       | How does one access this from a web browser?
        
       | wg0 wrote:
       | I have not tried it, but something like this should exist. I
       | don't think it is going to be as usable on consumer hardware
       | just yet unless you have a good enough GPU, but within a couple
       | of years (or less) we'll be there, I'm sure.
       | 
       | Irrelevant opinion - the logo is beautiful, I like it, and so
       | are the colours used.
       | 
       | Lastly, Llama 2 for such use cases is, I think, capable enough
       | that paying for ChatGPT won't be as attractive, especially
       | when privacy is a concern.
       | 
       | Keep it up. Good craftsmanship. :)
        
         | sabakhoj wrote:
         | Thanks! I do think Llama V2 is going to be a good enough
         | replacement for ChatGPT (aka GPT3.5) for a lot of use cases.
        
       | bozhark wrote:
       | I'm not a software dev.
       | 
       | Is there a way to have this bot read from a Discord and Google
       | Drive?
        
         | syntaxing wrote:
         | gpt4all itself (the library on the backend for this) has a
         | similar program [1]. You just need to put everything into a
         | folder. This should be straightforward for Google Drive.
         | Harder for Discord, but I'm sure there's a bot online that
         | can do the extraction.
         | 
         | [1] https://gpt4all.io/index.html
        
       ___________________________________________________________________
       (page generated 2023-07-30 23:00 UTC)