[HN Gopher] Show HN: Nous - Open-Source Agent Framework with Aut...
___________________________________________________________________
Show HN: Nous - Open-Source Agent Framework with Autonomous, SWE
Agents, WebUI
Hello HN! The day has finally come to stop adding features and
start sharing what I've been building over the last 5-6 months.
It's a bit of CrewAI, OpenDevin and LangFuse/Cloud all in one:
an integrated framework for devs who prefer TypeScript,
providing a lot out of the box to start experimenting and
building agents with. It started after peeking at the LangChain
docs a few times and never liking the example code. I began
experimenting with automating a simple Jira request from the
engineering team to add an index to one of our Google Spanner
databases (for context, I'm the DevOps/SRE lead for an AdTech
company). It includes the tooling we're building out to
automate processes from a DevOps/SRE perspective, which
initially includes a configurable GitLab merge request AI
reviewer. The initial layer above Aider (https://aider.chat/)
grew into a coding agent and an autonomous agent with
LLM-independent function calling and auto-generated function
schemas. And as testing via the CLI became unwieldy, it soon
grew database persistence, tracing, a Web UI and
human-in-the-loop functionality. One of the more interesting
additions is the new autonomous agent, which generates Python
code that can call the available functions. Using the pyodide
library, the tool objects are proxied into the Python scope and
executed in a WebAssembly sandbox. As it's able to perform
multiple calls and validation logic in a single control loop,
it can reduce cost and latency, getting the most out of
frontier LLM calls with better reasoning. Benchmark runners for
the autonomous agent and coding benchmarks are in the works to
get some numbers on the capabilities so far. I'm looking
forward to getting back to implementing all the ideas around
improving the code and autonomous agents from a metacognitive
perspective after spending time on docs, refactorings and
tidying up recently. Check it out at
https://github.com/trafficguard/nous
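The LLM-independent function calling with auto-generated schemas could look
something like this minimal TypeScript sketch. This is an illustration of the
idea only; the registry API and the example tool are hypothetical, not Nous's
actual interfaces:

```typescript
// Sketch of LLM-independent function calling: tools register a name,
// description and parameter schema, the schemas are rendered as plain text so
// any LLM (with or without native tool-call support) can be prompted with
// them, and a call the LLM requests is dispatched against the registry.

type ParamSchema = { name: string; type: string; description: string };

interface ToolSchema {
  name: string;
  description: string;
  parameters: ParamSchema[];
}

class ToolRegistry {
  private tools = new Map<string, { schema: ToolSchema; fn: (...args: any[]) => any }>();

  register(schema: ToolSchema, fn: (...args: any[]) => any): void {
    this.tools.set(schema.name, { schema, fn });
  }

  // Render the available functions as text for inclusion in a prompt.
  describeForPrompt(): string {
    return [...this.tools.values()]
      .map(
        ({ schema }) =>
          `${schema.name}(${schema.parameters
            .map((p) => `${p.name}: ${p.type}`)
            .join(", ")}) - ${schema.description}`,
      )
      .join("\n");
  }

  // Dispatch a function call parsed from the LLM's response.
  invoke(name: string, args: any[]): any {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown function: ${name}`);
    return tool.fn(...args);
  }
}

const registry = new ToolRegistry();
registry.register(
  {
    name: "addSpannerIndex",
    description: "Add an index to a Spanner table (hypothetical example tool)",
    parameters: [
      { name: "table", type: "string", description: "Table name" },
      { name: "column", type: "string", description: "Column to index" },
    ],
  },
  (table: string, column: string) =>
    `CREATE INDEX idx_${column} ON ${table}(${column})`,
);
```

In the real project the schemas are auto-generated rather than written by hand,
and the autonomous agent calls the proxied tools from pyodide-sandboxed Python
instead of invoking them directly.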
Author : campers
Score : 144 points
Date   : 2024-08-09 14:16 UTC (1 day ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| siamese_puff wrote:
| That trace UI is nice
| campers wrote:
| I can't take credit for that particular screen, it's the Trace
| UI in Google Cloud. I did look at LangSmith for tracing, but
| for now I wanted to stick with standard OpenTelemetry tracing,
| so you could export the spans to Honeycomb etc
| tikkun wrote:
| If this isn't by Nous Research, may want to consider renaming
| (https://x.com/NousResearch, https://nousresearch.com/)
| crystal_revenge wrote:
| And if it is by Nous Research, then there definitely needs to
| be clearer branding, as this is very confusing.
|
| If OP is not Nous Research (which I suspect to be the case)
| then a name change is a must as they're already a fairly well
| established company in the LLM space (surprised OP isn't aware
| of the name collision already). It's a bit similar to creating
| a new library with the "Smiling Face with Open Hands emoji"[0]
| as your logo
|
| 0. https://emojipedia.org/hugging-face
| downrightmike wrote:
| Never heard of you
| crystal_revenge wrote:
| I'm not affiliated with Nous Research in any way, but I do
| work in the LLM space, and at least in this community it's a
| fairly well known org. Since this project _also_ is in that
| space I was just adding support for the parent's observation.
| campers wrote:
| When I first picked the name, after a chat with Claude, I
| hadn't come across Nous Research back then, and they didn't
| show up Googling for just nous.
|
| I see a bit of reuse of words in other various llm related
| projects.
|
| Langchain/langfuse/langflow
|
| Llama/ollama/llamaindex
|
| so I hadn't been too worried about it when I became aware of
| them.
|
| That's what Show HN is for, getting feedback, and a name
| change now would be easy before I post it around more.
| taskforcegemini wrote:
| Nous is the French word for "us". Haven't heard of NousResearch.
| taskforcegemini wrote:
| but they explain it is from the Greek "nous", which fits
| better for AI
| flir wrote:
| I was surprised when I learned about the Greek derivation.
| In the UK it's slang for common sense ("use your nous,
| mate"). I have to wonder how a bit of Ancient Greek ended
| up as UK slang.
| have_faith wrote:
| Where in the UK is this a thing?
| Citizen_Lame wrote:
| Eton probably.
| arilotter wrote:
| I can confirm this is not a project related to Nous Research
| in any way, just an unfortunate naming collision.
| MeteorMarc wrote:
| And there is https://nous.technology/, known for their smart
| plugs.
| campers wrote:
| Actually, when I was bouncing ideas off Claude, it suggested
| the alternative spelling "noos". Then I can keep the concept
| and only have one letter to change.
| easygenes wrote:
| Cool project!
|
| Just FYI your chosen name collides with Nous Research, which has
| been a prominent player in open weights AI the past year.
| campers wrote:
| Thanks! I posted a reply to another comment about the name
| clash. I thought I could add another word to differentiate,
| but "Nous Agents" doesn't really roll off the tongue. New name
| ideas welcome!
| dr_dshiv wrote:
| Noosphere might be cool
| KolmogorovComp wrote:
| Pick a proper noun; you will soar above the plethora of
| startups re-using common nouns for no good reason (e.g.
| "plane" having nothing to do with aeronautics).
| namanyayg wrote:
| This looks too good. I have a B2B AI product, the features that
| exist in Nous easily outclass anything I could make in a
| reasonable timeline.
|
| Maybe I should rewrite my app using Nous...
| campers wrote:
| Thanks! I've spent a lot more time on the computer than I would
| like over the last few months building it.
|
| If you think you might want to, feel free to get in touch.
| simonw wrote:
| Which definition of "agent" are you using for this project?
| campers wrote:
| Good question. At first I only called the fully autonomous
| agents "agents", as to me that's what having agency is. I
| didn't like it when other projects said "multi-agent" when
| it's just a bunch of LLM calls.
|
| Initially the coding and software dev agents were called
| workflows, but to make it more agenty I was ok with it being
| called an agent if the result of an LLM call affected the
| control flow.
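A minimal sketch of that criterion, an LLM call whose result decides the
control flow, with the model call stubbed out (names here are illustrative,
not Nous's API):

```typescript
// `Llm` stands in for any real model call. The point is that which branch is
// taken depends on the model's output, which makes this more than a fixed
// pipeline of LLM calls.
type Llm = (prompt: string) => Promise<string>;

async function reviewOrMerge(
  llm: Llm,
  diff: string,
): Promise<"merge" | "request-changes"> {
  const verdict = await llm(
    `Does this diff look safe to merge? Answer YES or NO.\n${diff}`,
  );
  // The LLM's answer determines the next step of the workflow.
  return verdict.trim().toUpperCase().startsWith("YES")
    ? "merge"
    : "request-changes";
}
```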
| simonw wrote:
| So an agent here is the combination of a system prompt and a
| configured set of tools, kind of like an OpenAI "GPT"?
| MacsHeadroom wrote:
| No, a chat bot using tools (e.g. GPTs) is an "assistant."
|
| An LLM agent is not a chat bot, unlike an assistant. It is
| a primarily or fully autonomous LLM driven application
| which "chats" primarily with itself and/or other agents.
|
| In other words, assistants primarily interact with humans
| while agents primarily interact with themselves and other
| agents.
| SparkyMcUnicorn wrote:
| This looks fantastic! I've been using aider and had my own
| scripts to automate some things with it, but this looks next
| level and beyond.
|
| I wanted to try this out (specifically the web UI), so I
| configured the env file, adjusted the docker compose file, ran
| `docker compose up` and it "just works".
|
| It would be great if there was a basic agent example or two pre-
| configured, so you can set this up and instantly get a better
| sense of how everything works from a more hands-on perspective.
| campers wrote:
| Updating the Dockerfile and docker-compose.yml was the last
| change I made so glad to hear that worked for you! What change
| did you make to the docker compose file?
|
| The CLI scripts under src/cli are currently the best examples
| to look at for running an autonomous agent, along with the
| fixed workflows (e.g. code.ts).
| SparkyMcUnicorn wrote:
| The environment key cannot be an empty object, which it
| currently is by default. And I commented out the Google Cloud
| line (thanks for that code comment).
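For anyone hitting the same issue, the fix is either to remove the empty
`environment:` key or to give it at least one entry; a sketch (the service
name and variable are illustrative, not the repo's actual compose file):

```yaml
services:
  nous:
    # environment: {}        # an empty mapping here is rejected
    environment:
      - NODE_ENV=production  # illustrative; any real variable works
```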
| viraptor wrote:
| I'm having a hard time figuring out how much logic lives in Nous
| and how much in Aider for code changes - could you say some more
| about it?
|
| Playing with the code agents so far, I've found Aider to make
| many silly mistakes and revert its own changes in the next
| commit of the same task. On the other hand, Plandex is more
| consistent but can get in a loop of splitting the task into
| way too small pieces and burning money. I'm interested to see
| other approaches coming up.
| campers wrote:
| I have a few steps so far in the code editing at
| https://github.com/TrafficGuard/nous/blob/main/src/swe/codeE...
| There is a first pass I initially created when I was re-running
| a partially completed task and it would sometimes duplicate
| what already had been done. This helps Aider focus on what to
| do. <files>${fileContents}</files>
| <requirements>${requirements}</requirements> You are a
| senior software engineer. Your task is to review the provided
| user requirements against the code provided and produce an
| implementation design specification to give to a developer to
| implement the changes in the files. Do not provide any
| details of verification commands etc as the CI/CD build will
| run integration tests. Only detail the changes required in the
| files for the pull request. Check if any of the
| requirements have already been correctly implemented in the
| code as to not duplicate work. Look at the existing style
| of the code when producing the requirements.
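Assembling that first-pass prompt is plain string templating; a hypothetical
helper (the function name is made up, and the instruction text is abbreviated
here) might look like:

```typescript
// Hypothetical helper mirroring the ${fileContents}/${requirements}
// placeholders in the prompt above. Not the project's actual code.
function buildDesignPrompt(fileContents: string, requirements: string): string {
  return [
    `<files>${fileContents}</files>`,
    `<requirements>${requirements}</requirements>`,
    "You are a senior software engineer. Review the provided user",
    "requirements against the code provided and produce an",
    "implementation design specification.",
  ].join("\n");
}
```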
|
| Then there are compile/lint/test loops which feed back the
| error messages, and in the case of compile errors, the diff
| since the last compiling commit. Aider added some similar
| functionality recently.
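That compile/lint/test feedback loop can be sketched as follows (a minimal
illustration; `editWithLlm` and `runChecks` are stand-ins, not functions from
the project):

```typescript
// Run an LLM edit, then checks (compile/lint/test); on failure, feed the
// error output back into the next edit attempt, up to a retry limit.
type CheckResult = { ok: boolean; errors: string };

async function fixUntilGreen(
  editWithLlm: (feedback: string) => Promise<void>,
  runChecks: () => Promise<CheckResult>,
  maxAttempts = 3,
): Promise<boolean> {
  let feedback = "";
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    await editWithLlm(feedback);
    const result = await runChecks();
    if (result.ok) return true;
    // The error messages become context for the next attempt.
    feedback = result.errors;
  }
  return false;
}
```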
|
| Then finally there's a review step which asks:
|
|       Do the changes in the diff satisfy the requirements, and
|       explain why? Are there any redundant changes in the
|       diff? Was any code removed in the changes which should
|       not have been? Review the style of the code changes in
|       the diff carefully against the original code. Do the
|       changes follow all the style conventions of the original
|       code?
|
| This helps catch issues that Aider inadvertently introduced, or
| missed.
|
| I have some ideas around implementing workflows that mimic
| what we do. For example, if you have a tricky bug, add a .only
| to the relevant describe/it tests (or create tests if they
| don't exist), add lots of logging and assertions to pinpoint
| the fix required, then undo the .only and extra logging.
| That's what's going to enable higher overall success rates:
| you can see the progress on the SWE-bench Lite leaderboard,
| where simple RAG implementations had up to ~4% success rate
| with Opus, while agentic solutions are reaching a 43% pass
| rate on the full suite.
| hdlothia wrote:
| How much does it cost to run?
| campers wrote:
| Deployed, it costs nothing to run on the Cloud Run and
| Firestore free tiers.
|
| As for LLM costs, that really depends what you're trying to do
| with it. Fortunately that cost is always coming down. When I
| was first building it with Claude Opus the costs did add up,
| but 100 days later we have 3.5 Sonnet at a fraction of the
| cost.
|
| The Aider benchmarks are good to see how different LLMs perform
| for coding/patch generation. Sonnet 3.5 is best if it's in the
| budget. DeepSeek coder v2 gives the best bang for buck
| https://aider.chat/2024/07/25/new-models.html
| stavros wrote:
| I'm not entirely sure what this does? The initial paragraph goes
| into history and what other platforms do, but it doesn't say what
| problem this will solve for me. Then it continues with some
| features and screenshots, but I still don't know how to use this
| or why.
___________________________________________________________________
(page generated 2024-08-10 23:01 UTC)