[HN Gopher] Perplexica: Open-source Perplexity alternative
       ___________________________________________________________________
        
       Perplexica: Open-source Perplexity alternative
        
       Author : sean_pedersen
       Score  : 313 points
       Date   : 2024-05-24 02:49 UTC (20 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | hackernewds wrote:
       | How is this related to Perplexity?
        
         | asadm wrote:
         | It's an open source version of Perplexity.ai
        
         | viraptor wrote:
         | Does the first paragraph of the page answer the question?
        
       | allanrbo wrote:
       | First I'm hearing of the meta search engine SearxNG too. Neat.
       | Feel like we've come full circle, going back to meta search
       | engines again.
        
         | sorokod wrote:
         | Same here, here is a list of public instances[1]. Docs link
         | [2].
         | 
         | [1] https://searx.space/
         | 
         | [2] https://docs.searxng.org/
        
       | KRAKRISMOTT wrote:
       | Can you add support for Serp API? I prefer to pay for a managed
       | proxy farm instead of using SearxNG which requires too much
       | babysitting.
        
         | throwanem wrote:
         | Oh interesting, with the collapse of Google result quality
         | lately I've been thinking about trying out SearxNG in my
         | homelab. If you want to expand on the headaches you've run
         | into, I'd be interested to hear!
        
       | reckless wrote:
       | Looks similar to what I've been using for a few weeks
       | https://github.com/miurla/morphic
        
         | number6 wrote:
         | Is it worth the install or is it just a gimmick?
        
           | hackerlight wrote:
           | No need to install, go to www.morphic.sh
        
             | insane_dreamer wrote:
             | just tried it; sweet!
        
       | pants2 wrote:
       | I made my own version of this for personal use some time ago,
       | it's a fun project! I use Kagi for the search backend and
       | Colly/ScrapingFish (which has plans starting at $2) for getting
       | the content. Both work really well!
        
         | tremarley wrote:
         | Release it please
        
         | mdp2021 wrote:
         | An article would be nice...
        
       | yosito wrote:
       | It would be awesome if this could also search my Obsidian notes
       | at the same time, and if it worked seamlessly on all of my
       | devices.
        
         | tony-vlcek wrote:
         | Logseq user here with an upvote.
        
           | anewhnaccount2 wrote:
           | I guess it didn't eat your notes yet:
           | https://discuss.logseq.com/t/data-loss-happened-twice-i-
           | cant...
        
             | jddj wrote:
             | I've had Google photos eat ~12 months worth of pictures, I
             | don't really trust anyone to keep my data safe.
             | 
             | Logseq has been fine for me over several years, but it also
             | makes it extremely easy to auto-commit to git.
        
             | jhund wrote:
             | Logseq's git auto-commit is a great insurance policy and
             | should make recovery a breeze.
        
             | eichin wrote:
             | Ah, those appear to be entirely multi-device-sync problems.
             | (I use logseq with git-autocommit for storage and backup -
             | but since the multi-node sync stuff wasn't available for
             | self hosting anyway, I've never tried it, and thus dodged
             | the problem entirely. Obviously for a lot of people multi-
             | device use is the entire point, but for some of us, logseq
             | is "just an editor"...)
        
           | spiralk wrote:
           | I actually do something like this with my logseq notes. Since
           | all of the files are .md in a directory, one can load them
           | all in to a vectordb and use it for RAG. Logseq has an API
           | too but using the .md files is easy.
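           | 
           | A minimal sketch of the idea described above, not spiralk's
           | actual setup. It assumes the notes are plain .md files under
           | one directory, and it swaps the embedding model and vector
           | DB for toy term-count vectors with cosine similarity, so
           | it only shows the load/index/retrieve shape of RAG. All
           | function names here are hypothetical.

```python
import math
import re
from collections import Counter
from pathlib import Path


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: raw term counts.

    Assumption: a real setup would call an embedding model here.
    """
    return Counter(tokenize(text))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(notes_dir: Path) -> list[tuple[Path, Counter]]:
    """Read every .md file under notes_dir and store its vector."""
    return [
        (path, embed(path.read_text(encoding="utf-8")))
        for path in sorted(notes_dir.rglob("*.md"))
    ]


def search(index: list[tuple[Path, Counter]], query: str, k: int = 3) -> list[Path]:
    """Return the k notes most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [path for path, _ in ranked[:k]]
```

           | Usage would look like search(build_index(Path(notes_dir)),
           | "some question"), with the returned notes pasted into the
           | LLM prompt as context.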
        
         | nine_k wrote:
         | In this regard, maybe JetBrains should dust off Omea [1] and add
         | an appropriate LLM to it.
         | 
         | [1]: https://www.jetbrains.com/omea/
        
       | webprofusion wrote:
       | When making an alternative to something, don't reference the name
       | of the thing you're copying if that thing has (or can afford) a
       | legal team to protect their brand. If your product can reasonably
       | be confused with the original (it can) they will eat your soul.
        
         | qxfys wrote:
         | +1
        
         | michelsedgh wrote:
         | Actually I loved it. I don't think they have any grounds to
         | sue. It's different and close enough. Also they wouldn't sue a
         | project on GitHub; if they do, they show their faces and it's
         | worse for them. Also many forks will happen and they have to
         | sue many. Worst case you change the name of the repo. That's
         | the power of open source ;)
        
           | tartrate wrote:
           | Isn't Yuzu a good counter example?
        
             | freehorse wrote:
             | It does not sound relevant to me, because that was a case
             | of "video game piracy". It was not about the name per se.
        
             | theturtletalks wrote:
             | Yuzu's downfall was not the repo, it was their Discord.
             | They were sharing DRM cracking keys on there and getting
             | paid $30K/month on Patreon. It's the same reason most
             | emulators require you to bring your own BIOS.
        
         | iforgotpassword wrote:
         | Huh? So reactos shouldn't say they build an alternative to
         | Windows? As long as you build it yourself and don't steal any
         | resources or secrets, there is no problem mentioning that it's
         | an alternative or replacement for another product. What's much
         | more dangerous is picking a name for your own product that
         | resembles the original.
        
           | TeMPOraL wrote:
           | Wonder how Gitlab survived next to Github then. To first
           | approximation, the names are the same, and so are the
           | products...
        
             | mihaic wrote:
             | Both of these are built on top of git, an open source
             | project, so Gitlab is not a riff on Github. Perplexica on
             | the other hand seems like a direct reference to Perplexity,
             | not on the concept of being perplexed by something.
        
               | TeMPOraL wrote:
               | Isn't "Perplexity" itself a direct reference to a machine
               | learning term that, among other things, is very relevant
               | to large language models, on top of which Perplexity is
               | built?
        
               | IanCal wrote:
               | That's a far more tenuous link than "gitlab hosts git
               | repos".
        
               | wrasee wrote:
               | Yet the way git is used is still similar. Both lead with
               | 'git' in their name, both append a pithy three letter
               | suffix to 'git' that both describe some kind of space
               | where people meet to do stuff. Surely that's more than
               | just coincidence.
        
             | jakozaur wrote:
             | The dispute happens only if one party owns the trademark
             | and sends a Cease & Desist letter. Different companies have
             | different approaches to aggression here.
             | 
             | Second, it has to prove that it confuses customers (e.g. if
             | you pick ten end users and do tests if they find that
             | confusing). Maybe a sophisticated tech audience is better
             | at finding differences than the general public.
        
             | digi59404 wrote:
             | Git is a registered trademark of neither GitLab nor GitHub.
             | Both GitLab and GitHub have negotiated the usage of the Git
             | trademark. Provided they follow the rules set out for them,
             | they can continue to use it.
             | 
             | As an employee of one of them I personally bought the
             | git.new domain. I paid a good chunk for it and was going to
             | build a new project template builder on it. I got... talked
             | to by legal about this. Because as an employee it actually
             | violated one of those rules.
             | 
             | So that's the how, and why I know.
        
           | twobitshifter wrote:
           | More that they should not have called it Windowz
        
           | rcxdude wrote:
           | You can reference the competitor, but you don't want there to
           | be any risk that a moron in a hurry might confuse your
           | product with theirs, else you're in for a trademark
           | violation.
        
       | michelsedgh wrote:
       | Thank you so much for posting this and ofc the creators. My
       | brother and I were in a debate and this just proved my point.
       | Feels real good to see it. Can't wait to try it ;)
        
       | Cilvic wrote:
       | This is cool. My biggest question was "does it work?" then I had
       | another look at the repo and saw the "Repocloud" one click
       | deployment. And it's quite well done. Apart from signing up for
       | the repocloud account (3$ free credit) and waiting for the
       | deployment (5mins) ... I'm now waiting for my first answer which
       | doesn't seem to come through and there are not many ways to
       | troubleshoot as far as I can see... I've asked on discord
        
       | sanjayk0508 wrote:
       | There have been many other good alternatives to Perplexica before
        
         | michelsedgh wrote:
         | Care to share which ones?
        
           | telepathy wrote:
           | https://repocloud.io/results/?category=saas_Perplexity
        
             | bravura wrote:
             | Why doesn't this site have any way to contact the
             | maintainer?
             | 
             | Even their TOS makes it seem like they aren't an actual
             | company (the counterparty is "RepoCloud.io")
        
       | jakozaur wrote:
       | Sorry to say, but this looks like a trademark violation. Though
       | the project may be cool, it immediately put me off:
       | 
       | https://www.trademarkia.com/perplexityai-98400215
       | 
       | I'm not a lawyer, but trademarks are well protected. You can't
       | provide similar services and confuse customers by using almost
       | identical names. Don't do Gooogle search engine, Macrosoft OS,
       | etc.
       | 
       | If they get traction, Perplexity could force them to
       | rebrand.
        
         | Terretta wrote:
         | Perplexity is an information theory term, not a brand:
         | 
         |  _Perplexity of a probability model -- A model of an unknown
         | probability distribution p, may be proposed based on a training
         | sample that was drawn from p. Given a proposed probability
         | model q, one may evaluate q by asking how well it predicts a
         | separate test sample x1, x2, ..., xN also drawn from p._
         | 
         | https://en.wikipedia.org/wiki/Perplexity
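         | 
         | For concreteness, the standard formula behind that definition
         | is PP(q) = exp(-(1/N) * sum_i ln q(x_i)). A minimal sketch,
         | assuming the model q is a plain dict of outcome
         | probabilities (the function name is hypothetical):

```python
import math


def perplexity(q: dict[str, float], sample: list[str]) -> float:
    """Perplexity of model q on a held-out test sample x1..xN.

    Computes exp(-(1/N) * sum(ln q(x_i))). Lower is better: the model
    is less "surprised" by the test sample.
    """
    if not sample:
        raise ValueError("sample must not be empty")
    # An outcome the model considers impossible gives infinite perplexity.
    if any(q.get(x, 0.0) <= 0.0 for x in sample):
        return math.inf
    log_likelihood = sum(math.log(q[x]) for x in sample)
    return math.exp(-log_likelihood / len(sample))
```

         | A uniform model over k outcomes has perplexity k, which is why
         | it is often read as an effective number of choices.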
        
           | marcinzm wrote:
           | That doesn't mean in any way that it can't be a legal
           | trademark.
        
             | calny wrote:
             | I'm an IP lawyer & AI dev: my first reaction was, "hmm
             | there are trademark issues here." From a US perspective:
             | "Perplexity" certainly CAN be a trademark, and the company
             | has applied for one--to my knowledge it's still pending. If
             | the term was merely "descriptive" of the service provided,
             | like "American Airlines", then the company would need to
             | show that the term has acquired distinctiveness: ie, that
             | purchasers associate the term with that specific company.
             | But perplexity is probably more than merely descriptive
             | here.
             | 
             | Assuming that they have a valid trademark, the issue
             | becomes whether there is a likelihood of confusion between
             | Perplexity and Perplexica. That is a fact-specific,
             | multifactor test, which I'll spare you. But there could be
             | arguments both ways IMO
             | 
             | EDIT: trademark issues aside, cool project!
        
           | llamaimperative wrote:
           | Not how the law works. I'm not certain Perplexity has
           | trademarked their name but the question of whether it's an
           | information theory term or not wouldn't prevent them from
           | doing so, nor would it prevent them from defending that
           | trademark.
           | 
           | Engineer-y people trying to interpret law has to be one of
           | the most reliably silly things on HN.
        
             | michael-ax wrote:
             | Have you ever tried to trademark a random noun?
        
               | marcinzm wrote:
               | No but lots of other people have:
               | https://tmsearch.uspto.gov/search/search-results
               | 
               | Feel free to release a computer named Apple to prove me
               | wrong.
        
               | michael-ax wrote:
               | Alright, read up on domains, then try arguing that
               | 'perplexity' as company and noun are in different spaces!
               | I grant you that if they were, the company could
               | trademark that noun. But it seems clear that Perplexity
               | named itself after the noun and by so doing gave up the
               | option of trademarking its company name.
        
             | mdp2021 wrote:
             | > _Engineer-y people trying to interpret law_
             | 
             | It must be out of how perplexing the apparent hiatus between
             | legitimacy and positive law can be.
        
           | michael-ax wrote:
           | Which is why Trademarks are a non-issue here. My bet is that
           | the Devs understood that.
        
       | chakintosh wrote:
       | I've been using Perplexity for months now on the Free tier (with
       | the 5 Pro searches/4 hours) and it's been plenty for me; it has
       | completely replaced Google for me. So I'm not sure where
       | Perplexica fits in my use case, especially since I'll have to
       | install and maintain it and use lesser models than Perplexity.
        
         | hosh wrote:
         | Some people want to self-host this technology. AI is very
         | powerful, and not everyone wants that to be controlled by large
         | corporations or institutions.
        
       | rvz wrote:
       | Both Perplexica and Perplexity are bad names for a search engine.
       | 
       | Very perplexed as to who was the smart person that chose this
       | dreadful name for the company.
       | 
       | Yes, it has another definition in the context of information
       | theory, but my point is that I used the first definition, like a
       | normal person would, which is commonly associated with...
       | 
       |  _'...a state of confusion or a complicated and difficult
       | situation or thing. ' - Cambridge English Dictionary_ [0]
       | 
       | None of them can ever become a verb that makes sense like 'google
       | it'.
       | 
       | [0]
       | https://dictionary.cambridge.org/dictionary/english/perplexi...
        
         | robertlagrant wrote:
         | I Encartered the concept of verbs and I agree.
        
         | rors wrote:
         | Perplexity is a term from information theory. It's one measure of
         | the quality of an LM. I.e. how perplexed is my model? To an
         | experienced researcher it's a unit of measurement like metres
         | or kg. https://en.wikipedia.org/wiki/Perplexity
         | 
         | I agree that it doesn't transfer out of that specialised
         | domain.
        
           | Aloisius wrote:
           | Eh. Still a weird name given one generally wants to reduce
           | perplexity.
           | 
           | Might as well call it Uncertainty.
        
         | mdp2021 wrote:
         | 'Perplexity' is "through the complexity".
        
       | tcsenpai wrote:
       | I have been waiting for this moment for months. Sir, you are the
       | GOAT
        
       | dcreater wrote:
       | Anyone used it yet? Was posted here a while back. I'm interested
       | to hear whether it works and how good it is rather than many
       | "this looks great" comments. Perplexity.ai itself has been pretty
       | poor for me after I got past the honeymoon phase
        
       | rashadphil wrote:
       | Here is another open-source alternative:
       | https://github.com/rashadphz/farfalle (Disclaimer: I made it)
        
         | asadalt wrote:
         | which one is better
        
       | octobus2021 wrote:
       | I absolutely love this and will try as many as possible very
       | soon. I think "intelligent search" (asking LLM questions to
       | search on the Web by communicating, preferably by voice) is one
       | of the few solid use cases for LLM. I hate the idea of having
       | this happen in the cloud with someone having my data, so doing
       | this locally with my local LLM would be ideal.
        
         | jahewson wrote:
         | The web search itself is still happening in the cloud though? And
         | instead of searching one provider it now searches multiple...
         | not sure how much better this is really.
        
         | abdullahkhalids wrote:
         | What would be even better is if it could also search my local
         | repository of ebooks and PDFs. Most of the stuff I do needs
         | serious answers from books or papers I have already selected.
         | Random webpages on the web don't cut it.
         | 
         | Citing the book section/page/paragraph would be magic.
        
         | nilkn wrote:
         | Even after the release of GPT4o, Perplexity Pro with Claude 3
         | Opus is by far my most used LLM application. For me, the
         | writing quality of Claude 3 combined with a wider variety of
         | information sources makes it far surpass raw ChatGPT for most
         | non-creative/non-interactive tasks.
        
           | bufo wrote:
           | I recommend Phind.com, it's been much better and faster for
           | me than Perplexity Pro. I typically use their custom 70B
           | model but you can also use GPT4 o or Turbo, or Claude 3 Opus.
        
       | fudged71 wrote:
       | Are there any benchmarks to compare these online research agents?
       | There are so many to choose from now but it's hard to compare them
        
       | behnamoh wrote:
       | It was about time someone made an alternative to Perplexity.
        
       | jon309 wrote:
       | Super cool! I would love if we could make this serverless and
       | easily deployable with CDK or Terraform. Maybe I'll take that up
       | as a side project, who knows!
        
       ___________________________________________________________________
       (page generated 2024-05-24 23:00 UTC)