https://e2b.dev/blog/crewai-vs-autogen-for-code-execution-ai-agents

 
 


 
 


Insights


Feb 16, 2024

Tereza Tizkova



CrewAI vs AutoGen for Code Execution AI Agents



 

A new paper, "More Agents Is All You Need", finds that, simply via a
sampling-and-voting method, the performance of LLMs scales with the
number of agents instantiated. This suggests that the popularity of
multi-agent frameworks is justified.
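To make the sampling-and-voting idea concrete, here is a minimal
sketch. It is not the paper's code: `noisy_agent` is a hypothetical
stand-in for a single LLM call that answers correctly 60% of the
time, and the question, the answers, and the trial count are all
made up for illustration.

```python
import random
from collections import Counter
from typing import Callable


def noisy_agent(question: str, rng: random.Random) -> str:
    """Hypothetical stand-in for one LLM call: correct 60% of the time."""
    if rng.random() < 0.6:
        return "4"
    return rng.choice(["3", "5"])


def sample_and_vote(
    question: str,
    agent: Callable[[str, random.Random], str],
    n_agents: int,
    rng: random.Random,
) -> str:
    """Ask n_agents independent agents and return the most common answer."""
    answers = [agent(question, rng) for _ in range(n_agents)]
    return Counter(answers).most_common(1)[0][0]


def accuracy(n_agents: int, trials: int = 2000, seed: int = 0) -> float:
    """Fraction of trials in which the voted answer is the correct one."""
    rng = random.Random(seed)
    hits = sum(
        sample_and_vote("What is 2 + 2?", noisy_agent, n_agents, rng) == "4"
        for _ in range(trials)
    )
    return hits / trials


if __name__ == "__main__":
    for n in (1, 5, 25):
        print(n, accuracy(n))
```

In this toy setup the voted answer gets more reliable as the number
of agents grows, which is the effect the paper reports, although a
real LLM's errors are not independent coin flips like these.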

CrewAI, sometimes nicknamed "AutoGen 2.0", is a recently popular
multi-agent framework. I tested CrewAI and compared it to AutoGen,
focusing on how each framework executes LLM-generated code.

GitHub star history of AutoGen and CrewAI


CrewAI is built on top of LangChain and allows one to orchestrate
multiple agents working on a user-defined task.

Like AutoGen, CrewAI is open-source and uses the concept of agents
with different roles. On top of that, CrewAI allows agents to
delegate work to each other.


Working model of CrewAI agents. Source
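The role-and-delegation idea can be sketched in a few lines of plain
Python. This is a toy model, not CrewAI's API: the `Agent` and `Crew`
classes, the `skills` field, and the handler functions below are all
hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass
class Agent:
    """Toy agent: a role, the skills it covers, and a handler function."""
    role: str
    skills: FrozenSet[str]
    handle: Callable[[str], str]
    allow_delegation: bool = True


class Crew:
    """Toy crew that routes a task to whichever agent has the skill."""

    def __init__(self, agents: List[Agent]) -> None:
        self.agents: Dict[str, Agent] = {a.role: a for a in agents}

    def run(self, role: str, skill: str, task: str) -> str:
        agent = self.agents[role]
        # The agent does the work itself when it has the required skill.
        if skill in agent.skills:
            return agent.handle(task)
        # Otherwise it may hand the task to a coworker that has the skill.
        if agent.allow_delegation:
            for coworker in self.agents.values():
                if coworker is not agent and skill in coworker.skills:
                    return coworker.handle(task)
        raise ValueError(f"{role} cannot handle or delegate '{skill}'")


researcher = Agent(
    "researcher", frozenset({"search"}), lambda t: f"[researcher] notes on: {t}"
)
writer = Agent(
    "writer", frozenset({"write"}), lambda t: f"[writer] draft of: {t}"
)
crew = Crew([researcher, writer])

if __name__ == "__main__":
    # The writer lacks the "search" skill, so the task goes to the researcher.
    print(crew.run("writer", "search", "AAPL earnings"))
```

In a real framework the handlers would be LLM calls and the decision
to delegate would be made by the model itself rather than by a skill
lookup.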

Why the hype?

There are several explanations for CrewAI's popularity. It is quick
to set up, and works well for a variety of interesting use cases with
clear guides and demos, e.g.:

  * Stock Analysis
  * Creating Instagram posts
  * Trip Planner
  * Landing Page Generator

Code execution comparison

AutoGen

What I like about AutoGen is that it can execute the code it
generates. When I wanted to analyze and visualize a dataset, AutoGen
agents generated the code for it, executed that code via Docker, and
saved the resulting chart as a PDF file on my computer.


AutoGen code execution feature used for generating a chart for stock
prices


By default, AutoGen currently uses Docker containers to execute
Python code. The AutoGen team also added a Code Interpreter example
built with a new (experimental) agent called GPTAssistantAgent, which
lets you add the new OpenAI assistants to AutoGen-based workflows.

Executing LLM-generated code locally via Docker may be limiting for
some use cases and poses some risks, but a cloud alternative exists.
In this open-source code interpreter example, the code produced by
AutoGen agents runs in an isolated cloud environment.
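The general pattern behind this kind of feature (pull the code block
out of the model's reply, then run it outside the agent's own
process) can be sketched as follows. This is not AutoGen's
implementation; `extract_code` and `run_isolated` are hypothetical
helpers, and a plain subprocess is used only as a stand-in for the
Docker container or cloud sandbox that provides the real isolation.

```python
import re
import subprocess
import sys
import tempfile
from pathlib import Path

# Built from pieces so this example does not contain a literal fence.
FENCE = "`" * 3
CODE_BLOCK = re.compile(FENCE + r"(?:python)?\n(.*?)" + FENCE, re.DOTALL)


def extract_code(message: str) -> str:
    """Return the body of the first fenced code block in an LLM reply."""
    match = CODE_BLOCK.search(message)
    if match is None:
        raise ValueError("no fenced code block found")
    return match.group(1)


def run_isolated(code: str, timeout: float = 10.0) -> subprocess.CompletedProcess:
    """Run code in a separate interpreter inside a throwaway directory.

    A subprocess is NOT a security sandbox. It only keeps the generated
    code out of the calling process and enforces a timeout.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "generated.py"
        script.write_text(code, encoding="utf-8")
        return subprocess.run(
            [sys.executable, "-I", str(script)],  # -I: isolated mode
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout,
        )


if __name__ == "__main__":
    reply = "Here you go:\n" + FENCE + "python\nprint(6 * 7)\n" + FENCE
    result = run_isolated(extract_code(reply))
    print(result.returncode, result.stdout, result.stderr)
```

Swapping the subprocess call for a container or a remote sandbox
changes where the code runs, but the extract-then-execute loop stays
the same.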


CrewAI

When asked to perform similar data analysis tasks, CrewAI generates
a text report by default. It works well with search tools like
LangChain DuckDuckGo Search, but to perform more complex data
analysis tasks, it would need tools that can execute the
LLM-generated code.

Example of a stock analysis task performed by CrewAI. Source



Another example of CrewAI performing a stock analysis task. Source


I haven't found a quick way to add such tools, but it should still
be possible to integrate them. In some examples, like generating a
landing page, CrewAI uses other (custom) tools, such as one that
writes content to a new file.


LangChain tools for code execution

LangChain offers several tools in which LLM-generated code is
executed automatically.

One example is the Pandas Dataframe tool, where a Python agent is
used to execute the LLM-generated Python code.

Another example is Python REPL, which can execute Python commands.

There is even one LangChain tool for remote code execution: Bearly
Code Interpreter allows safe LLM code execution by evaluating Python
code in a sandbox environment. This environment resets on every
execution.

Apart from these, users can build their own custom LangChain tools
for code execution and add them to CrewAI.
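As a rough idea of what sits at the core of such a custom tool, here
is a minimal REPL-style function that takes a string of Python code
and returns whatever it printed. It is a sketch, not LangChain's
code: `python_repl_tool` is a hypothetical name, and wrapping it in
LangChain's tool interface and registering it with a CrewAI agent is
left out because that API depends on the library version.

```python
import contextlib
import io


def python_repl_tool(code: str) -> str:
    """Execute a snippet and return its captured stdout, or the error.

    Caution: exec runs the snippet inside the current process with full
    access to the host, so this offers no isolation at all.
    """
    buffer = io.StringIO()
    namespace: dict = {}  # fresh globals, so snippets do not share state
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
    except Exception as exc:
        # Return the error as text so an agent could read it and retry.
        return buffer.getvalue() + f"{type(exc).__name__}: {exc}"
    return buffer.getvalue()


if __name__ == "__main__":
    print(python_repl_tool("print(sum(range(10)))"))
    print(python_repl_tool("1 / 0"))
```

Returning errors as plain text instead of raising them is a design
choice that suits agents, since the LLM only sees the tool's string
output.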

In short, LangChain tools can execute code snippets, for example
through a Python runtime environment.

Limitations

Running LLM-generated code can pose a security risk in general,
either because a user asks the LLM to generate malicious code or
because the LLM generates harmful code accidentally.

Even the official LangChain tool Pandas Dataframe explicitly warns:
"This can be bad if the LLM generated Python code is harmful. Use
cautiously."



LangChain recently received feedback asking for a more secure way of
running LLM-generated code, for example the way AutoGen does.

Conclusion

I can understand the popularity of both AutoGen and CrewAI, as they
have proven able to deliver interesting and useful examples quickly.
Although CrewAI is younger than AutoGen, it would be cool to see
benchmarks and evals for both frameworks, which would make it easier
for developers to choose between them.

I heard from some developers that they chose CrewAI because they were
already familiar with LangChain, while others argued that AutoGen is
more customizable. However, most developers I talked with said that
they don't see a big difference between CrewAI and AutoGen, as the
two accomplish similar tasks.

E2B

E2B is building the cloud for AI agents.

A platform and infrastructure where AI agents can act autonomously
and as the first class citizen.

By E2B

Map of AI Agents

Map of Agents' SDKs

ChatGPT Plugin

Smol Developer in Cloud

Links

GitHub

Twitter

Discord

LinkedIn

Company

Contact

Blog

Changelog

(c)2023 FoundryLabs, Inc. All rights reserved.
