https://e2b.dev/blog/crewai-vs-autogen-for-code-execution-ai-agents

Feb 16, 2024
Tereza Tizkova

CrewAI vs AutoGen for Code Execution AI Agents

A new paper, More Agents Is All You Need, finds that, simply via a sampling-and-voting method, the performance of LLMs scales with the number of agents instantiated. This suggests that the popularity of multi-agent frameworks may be justified.

CrewAI, sometimes called "AutoGen 2.0", is a recently popular multi-agent framework. I tested CrewAI and compared it to AutoGen, mainly with regard to executing LLM-generated code.

Figure: GitHub star history of AutoGen and CrewAI

CrewAI is built on top of LangChain and lets you orchestrate multiple agents working on a user-defined task. Like AutoGen, CrewAI is open-source and uses the concept of agents with different roles; on top of that, CrewAI allows agents to delegate work to each other.

Figure: Working model of CrewAI agents

Why the hype?

There are several explanations for CrewAI's popularity. It is quick to set up and works well for a variety of interesting use cases, with clear guides and demos, e.g.:

* Stock Analysis
* Creating Instagram posts
* Trip Planner
* Landing Page Generator

Code execution comparison

AutoGen

What I like about AutoGen is that it can execute the code it generates. For example, when I wanted to analyze and visualize a dataset, AutoGen agents generated code for it, executed the code via Docker, and saved the resulting chart as a PDF file on my computer.

Figure: AutoGen code execution feature used for generating a chart for stock prices

By default, AutoGen currently uses Docker containers to execute Python code.
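
To make the "generate code, then execute it" step more concrete, here is a minimal sketch using only the Python standard library. It is not AutoGen's implementation: it runs a code string in a separate Python process with a timeout, which is far weaker isolation than a Docker container or a cloud sandbox. The function name `run_generated_code` and its return format are hypothetical, chosen only for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generated_code(code: str, timeout_s: float = 10.0) -> dict:
    """Run a string of Python code in a child process and capture its output.

    Hypothetical helper for illustration only. A child process does NOT
    sandbox the code; it only keeps it out of the caller's interpreter.
    """
    # Write the code to a throwaway directory so any files it creates
    # are removed when the directory is cleaned up.
    with tempfile.TemporaryDirectory() as work_dir:
        script = Path(work_dir) / "generated.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                cwd=work_dir,
                capture_output=True,
                text=True,
                timeout=timeout_s,  # stop runaway code
            )
        except subprocess.TimeoutExpired:
            return {"exit_code": None, "stdout": "", "stderr": "timed out"}
    return {
        "exit_code": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


if __name__ == "__main__":
    result = run_generated_code("print(2 + 2)")
    print(result["stdout"].strip())  # prints 4
```

An agent framework would typically feed `stdout` and `stderr` back to the LLM so it can correct failing code. For anything untrusted, a container or remote sandbox is the safer choice than this sketch.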
AutoGen also added a Code Interpreter example built with a new (experimental) agent called GPTAssistantAgent, which lets you add the new OpenAI assistants to AutoGen-based workflows.

Executing LLM-generated code locally via Docker may be limiting for some use cases and poses some risks, but a cloud alternative exists. In one open-source code interpreter example, the code produced by AutoGen agents runs in an isolated cloud environment.

CrewAI

When asked to perform similar data analysis tasks, CrewAI by default generates a text report. It works well with search tools like LangChain DuckDuckGo Search, but to perform more complex data analysis tasks, it would need tools that can execute the LLM-generated code.

Figure: Example of a stock analysis task performed by CrewAI

Figure: Another example of CrewAI performing a stock analysis task

I haven't found a quick way to add such tools, but it should still be possible to integrate them. In some examples, like generating a landing page, CrewAI uses other (custom) tools, such as one that writes a new file with given content.

LangChain tools for code execution

LangChain offers several tools where LLM-generated code gets executed automatically. One example is Pandas Dataframe, where a Python agent is used to execute the LLM-generated Python code. Another example is Python REPL, which can execute Python commands.

There is even a LangChain tool for remote code execution: Bearly Code Interpreter allows safe LLM code execution by evaluating Python code in a sandbox environment. This environment resets on every execution.

Apart from these, users can build their own custom LangChain tools for code execution and add them to CrewAI.

In conclusion, LangChain tools can execute code snippets, for example via the Python runtime environment.

Limitations

Running LLM-generated code can pose a security risk in general.
This can happen either because a user asks the LLM to generate malicious code or because the LLM generates harmful code by accident. Even the official LangChain tool Pandas Dataframe explicitly mentions: "This can be bad if the LLM generated Python code is harmful. Use cautiously."

LangChain recently received feedback asking for a more secure way of running LLM-generated code, e.g., the same way AutoGen does.

Conclusion

I can understand the popularity of both AutoGen and CrewAI, as they have proven able to deliver interesting and useful examples quickly. Although CrewAI is younger than AutoGen, it would be cool to see benchmarks and evals for both frameworks to make it easier for developers to choose between them.

I heard from some developers that they chose CrewAI because they were already familiar with LangChain, while others argued that AutoGen is more customizable. However, most developers I talked to said that they don't see a big difference between CrewAI and AutoGen, as they accomplish similar tasks.

E2B
E2B is building the cloud for AI agents. A platform and infrastructure where AI agents can act autonomously and as the first class citizen.
By E2B Map of AI Agents Map of Agents' SDKs ChatGPT Plugin Smol Developer in Cloud
Links GitHub Twitter Discord LinkedIn
Company Contact Blog Changelog
(c)2023 FoundryLabs, Inc. All rights reserved.