OpenAI Releases Swarm, an Experimental Open-Source Framework for Multi-Agent Orchestration
MMS • Sergio De Simone
Article originally posted on InfoQ. Visit InfoQ
Recently released as an experimental tool, Swarm aims to allow developers to investigate how they can have multiple agents coordinate with one another to execute tasks using routines and handoffs.
Multi-agent systems are an approach to building more complex AI systems where a task is broken into subtasks. Each task is then assigned to a specialized agent that is able to choose the most appropriate strategy to solve it. For example, you could build a shopper agent with two sub-agents, one managing refunds and the other managing sales, with a third agent, a triage agent, determining which sub-agent should handle a new request.
Swarm explores patterns that are lightweight, scalable, and highly customizable by design. Approaches similar to Swarm are best suited for situations dealing with a large number of independent capabilities and instructions that are difficult to encode into a single prompt.
As mentioned, Swarm is based on the concepts of routines and handoffs. In this context, a routine is a set of steps and tools to execute them, while a handoff represents the action of an agent handing off a conversation to another agent. This implies loading the corresponding routine and provide it with all the context accumulated during the previous conversation. For example, the following snippet shows how you could define a sale and a refund agent:
def execute_refund(item_name):
return "success"
refund_agent = Agent(
name="Refund Agent",
instructions="You are a refund agent. Help the user with refunds.",
tools=[execute_refund],
)
def place_order(item_name):
return "success"
sales_assistant = Agent(
name="Sales Assistant",
instructions="You are a sales assistant. Sell the user a product.",
tools=[place_order],
)
To manage handoffs, you can define a triage agent like in the following snippet which includes two functions, transfer_to_sales_agent
, transfer_to_refund_agent
that return their corresponding agent. You also need to add a transfer_to_triage_agent
tool to our refund_agent
and sales_assistant
definitions.
triage_agent = Agent(
name="Triage Agent",
instructions=(
"Gather information to direct the customer to the right department."
),
tools=[transfer_to_sales_agent, transfer_to_refund_agent],
)
...
refund_agent = Agent(
...
tools=[execute_refund, transfer_to_triage_agent],
)
...
sales_assistant = Agent(
...
tools=[place_order, transfer_to_triage_agent],
)
The pattern described above, where you use a triage agent, is just one way to manage handoffs and Swarm supports the use of distinct solutions.
Examples of alternative frameworks to create multi-agent systems are Microsoft’s AutoGen, CrewAI, and AgentKit. Each of them takes a different stance about how to orchestrate agents and which aspects are essential to it.
Multi-agent systems aim to enable the creation of more complex systems by working around some limitations of LLMs, like single-turn responses, lack of long-term memory, and reasoning depth.
It is important to understand, though, that decomposing a complex agent into a multi-agent system is not necessarily an easy task. As Hacker News commenter ValentinA23 points out, the process “is very time consuming though, as it requires experimenting to determine how best to divide one task into subtasks, including writing code to parse and sanitize each task output and plug it back into the rest of the agent graph though, as it requires experimenting to determine how best to divide one task into subtasks, including writing code to parse and sanitize each task output and plug it back into the rest of the agent graph”.
Another Hacker News commenter, LASR, raises a concern that the distinct agents will diverge in time:
The problem with agents is divergence. Very quickly, an ensemble of agents will start doing their own things and it’s impossible to get something that consistently gets to your desired state.
Finally, Hacker News user dimitri-vs mentions that the fast evolution of current LLMs, e.g., GPT o1 and Sonnet 3.5, makes it so that “it is much easier to swap in a single API call and modify one or two prompts than to rework a convoluted agentic approach. Especially when it’s very clear that the same prompts can’t be reused reliably between different models”.