hngopher.com

       [HN Gopher] Show HN: PreCog AI - Automatic AI Model Selection fo...
       ___________________________________________________________________
        
       Show HN: PreCog AI - Automatic AI Model Selection for Any Task
        
       Hi HN,

       I'm one of the co-founders of PreCog AI, a project my friend and I
       started to make the best AI models more accessible. PreCog AI is a
       chatbot that automatically picks the best AI model for whatever
       task you throw at it and answers with that model.

       We made PreCog public on Monday and are getting great feedback.
       We originally built it as an internal tool to help our small team
       reduce costs (paying for various chatbots) and get better AI
       output. It has helped us so much with our workflow and ideation
       that we just had to share it.

       Key Features of PreCog:

       - AI Model Matchmaking: With access to 18 models, PreCog
         automatically matches your questions with the most fitting AI
         model based on the task.
       - Versatile Adaptation: Works with any task, from coding to
         creative writing, giving you the right tool for the job.
       - Ongoing Updates: Stay current with AI advancements using the
         latest LLM leaderboard data (we are constantly adding to and
         changing our leaderboard). See the leaderboard here:
         https://precog.ubik.studio/leaderboard
       - Preferred Model Selection: If you have a preferred model, choose
         it, and PreCog will use that model exclusively to respond.

       How PreCog Works:

       PreCog analyzes your query, references the model leaderboard, and
       then matches your query with the highest-ranked AI for that niche
       task, delivering high-quality, task-specific output. PreCog's
       Model Leaderboard ranks AI models using over a million human
       comparisons, evaluated and presented on an Elo scale. The dataset
       used to build the PreCog Leaderboard is from Chatbot Arena
       (https://lmarena.ai/). Researchers from UC Berkeley SkyLab and
       LMSYS developed the battle framework that produced the dataset.

       I love feedback, questions, and critiques; they help my friend
       and me develop with the user in mind. You can reach me anytime
       at: info@ubik.studio
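
The flow described under "How PreCog Works" (classify the query, look up the leaderboard, pick the top-rated model for that task) might look roughly like the sketch below. This is only an illustration, not PreCog's actual implementation: the model names, ratings, categories, keyword classifier, and the `route` and `win_probability` helpers are all hypothetical. `win_probability` uses the standard Elo expected-score formula to show what a rating gap means on an Elo scale.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rating:
    model: str
    category: str
    elo: float


# Hypothetical leaderboard rows; names and scores are invented for illustration.
LEADERBOARD = [
    Rating("model-a", "coding", 1280.0),
    Rating("model-b", "coding", 1245.0),
    Rating("model-a", "creative_writing", 1210.0),
    Rating("model-b", "creative_writing", 1262.0),
    Rating("model-c", "general", 1230.0),
    Rating("model-a", "general", 1225.0),
]

# Toy keyword classifier (an assumption; a real router would likely use a model).
CATEGORY_KEYWORDS = {
    "coding": ("code", "function", "bug", "python", "compile"),
    "creative_writing": ("poem", "story", "lyrics", "essay"),
}


def classify(query: str) -> str:
    """Map a query to a task category, falling back to 'general'."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    for category, keywords in CATEGORY_KEYWORDS.items():
        if words & set(keywords):
            return category
    return "general"


def win_probability(elo_a: float, elo_b: float) -> float:
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / 400.0))


def route(query: str, preferred_model: Optional[str] = None) -> Tuple[str, str]:
    """Return (model, reason). A user-preferred model bypasses the leaderboard."""
    if preferred_model is not None:
        return preferred_model, "user-selected model"
    category = classify(query)
    candidates = [r for r in LEADERBOARD if r.category == category]
    best = max(candidates, key=lambda r: r.elo)
    return best.model, f"highest Elo ({best.elo:.0f}) in category '{category}'"


if __name__ == "__main__":
    for q in ("Fix this bug in my function", "Write a poem about autumn"):
        print(q, "->", route(q))
```

Returning a reason string alongside the model mirrors the post's point that a matched response can explain why a model was chosen.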
        
       Author : ieuanking
       Score  : 40 points
       Date   : 2024-10-24 17:25 UTC (5 hours ago)
        
 (HTM) web link (precog.ubik.studio)
        
       | dvfjsdhgfv wrote:
       | I entered the query but was redirected to a login form. If you
       | are honestly looking for feedback and not leads, unblock the app
       | temporarily for HN. If you are after leads, you will surely get
       | some if this submission receives enough upvotes, but I wouldn't
       | count on many. These days people are not so keen on leaving their
       | data on random websites anymore.
        
         | ieuanking wrote:
         | Thanks so much for the feedback (new here for sure) - we are
         | working on that rn, updating as fast as possible, gotta
         | rebuild some stuff lmao
        
         | ieuanking wrote:
         | Just took down the sign-up for anyone who wants to try it out!
         | Thanks again for pointing that out - new to posting on here,
         | super helpful. Hope you get some use out of our project.
        
       | swyx wrote:
       | you're doing model routing - any thoughts on
       | https://github.com/lm-sys/RouteLLM and Martian?
       | 
       | hope you dont raise funding before you figure out what they
       | haven't
        
         | ieuanking wrote:
         | We have looked at RouteLLM and played with it, but we felt
         | they were focusing on cost minimization. We think there's more
         | work to be done on routing for the best possible output
         | without the cost constraint. We're also just trying to provide
         | easy ways for people to use tools like RouteLLM that don't
         | require coding knowledge. That being said, we def wanna release
         | a benchmark of our routing in comparison to some of the
         | pre-trained defaults in RouteLLM.
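
The distinction drawn in this exchange, routing to minimize cost versus routing for the best possible output, can be sketched as two selection rules over the same candidate list. This is a hypothetical illustration only: it is neither RouteLLM's API nor PreCog's code, and the model names, quality scores, and costs are invented.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    model: str
    quality: float  # e.g. an Elo-style score for the task (invented values)
    cost: float     # hypothetical price per request, arbitrary units


CANDIDATES = [
    Candidate("small", 1150.0, 1.0),
    Candidate("medium", 1220.0, 4.0),
    Candidate("large", 1275.0, 15.0),
]


def cheapest_good_enough(candidates: List[Candidate], min_quality: float) -> Candidate:
    """Cost-minimizing rule: cheapest model that clears a quality bar."""
    eligible = [c for c in candidates if c.quality >= min_quality]
    if not eligible:
        # Nothing clears the bar, so fall back to the best available model.
        return max(candidates, key=lambda c: c.quality)
    return min(eligible, key=lambda c: c.cost)


def best_within_budget(candidates: List[Candidate],
                       max_cost: Optional[float] = None) -> Candidate:
    """Quality-maximizing rule: best model, optionally under a cost cap."""
    eligible = [c for c in candidates if max_cost is None or c.cost <= max_cost]
    if not eligible:
        # Nothing fits the budget, so fall back to the cheapest model.
        return min(candidates, key=lambda c: c.cost)
    return max(eligible, key=lambda c: c.quality)


if __name__ == "__main__":
    print(cheapest_good_enough(CANDIDATES, min_quality=1200.0).model)
    print(best_within_budget(CANDIDATES).model)
```

Under these assumptions the two rules differ only in which quantity is the constraint and which is the objective.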
        
           | swyx wrote:
           | cool. its a very small distinction in my mind to flip from
           | one to the other. all the best.
        
       | okintheory wrote:
       | Ah yes, Minority Report, that story we're so eager to repeat.
        
         | ieuanking wrote:
         | fortunately we aren't policing anybody :* (we do love PKD tho)
         | we thought precog perfectly described the function
        
           | JasonSage wrote:
           | I think it's a great name, it really does describe the
           | function perfectly.
           | 
           | I got a huge smile when I saw it.
        
             | ieuanking wrote:
             | tysm - I got a huge smile reading this comment.
        
           | frmersdog wrote:
           | That doesn't really change the fact that it's exhausting (and
           | worse, "commercially offputting") to be reminded that we're
           | careening towards the worst futures literally imagined. I
           | stayed away from Soylent and I'll probably stay away from
           | this, but thanks for the heads-up. _rimshot_
        
           | KingFelix wrote:
           | PKD is rad, my username is also a pkd reference. Love the
           | Ubik studio as well!
        
       | KaoruAoiShiho wrote:
       | Do you find that there are a lot of variations and intricacies in
       | deciding which LLM to use? I find it pretty simple to just assume
       | sonnet is best at most coding jobs, o1 best at complicated non-
       | coding tasks, 4o for simple questions that don't require
       | planning. and well, tbh that's it, no other LLMs are that
       | interesting.
        
         | ieuanking wrote:
         | We found that there is more nuance when it comes to different
         | languages. We also update our leaderboard as new models drop, so
         | the best models are always available through PreCog. It's also
         | nice to not have to tab-switch constantly between chatbots and
         | instead just use them in one central place. We added a feature
         | so you can just pick the model you'd like to chat with if you
         | don't want to be automatically matched using the leaderboard
         | rankings (all matched responses provide an explanation of the
         | reasoning behind the match so you can see why the model was
         | chosen).
        
         | cma wrote:
         | I think o1 is best at algorithm-heavy coding tasks, opus at
         | api/language-translation knowledge-breadth-heavy tasks (and also
         | has better recency via later knowledge cutoff). But with latest
         | update opus is pretty close at algorithm-heavy coding.
        
           | ieuanking wrote:
           | We have plans to put o1 on the leaderboard - but Opus is
           | there RN! We should have a new Elo ranking soon, and the
           | second the o1 rankings are done, it will be on our leaderboard.
        
       ___________________________________________________________________
       (page generated 2024-10-24 23:00 UTC)