Orchestrating Complex Agent Workflows: Beyond Sequential ReAct Chains
Your LLM agent is tasked with a simple request: ‘Summarize the market sentiment for AAPL and compare its Q4 earnings against GOOG.’ A standard ReAct agent chokes. It serially fetches AAPL sentiment, then its earnings, then starts on GOOG, losing the context of the first half of the query and taking more than twice as long as necessary. This isn’t a reasoning failure; it’s a workflow failure, and it’s the default behavior for most agentic frameworks.
The ReAct Ceiling: Why Sequential Tool Use Fails at Scale
The ubiquitous ReAct pattern—Thought, Action, Observation, repeat—is a solid baseline for basic agentic behavior. It works well when tasks are simple, involve a single tool, or require strictly sequential steps. Need to look up a stock price? ReAct nails it. Need to search for a document and then summarize it? Still fine.
The problem starts when you hit tasks that demand simultaneous information gathering, conditional logic based on intermediate results, or recovery from transient failures. ReAct’s inherent linearity becomes a bottleneck.
Consider the financial analysis query. A typical ReAct agent, when presented with a complex prompt, might break it down like this:
- Thought: Need AAPL sentiment.
- Action: Call get_stock_sentiment("AAPL").
- Observation: AAPL sentiment data.
- Thought: Now need AAPL earnings.
- Action: Call get_quarterly_earnings("AAPL").
- Observation: AAPL earnings data.
- Thought: Okay, now GOOG sentiment.
- Action: Call get_stock_sentiment("GOOG").
- Observation: GOOG sentiment data.
- Thought: Finally, GOOG earnings.
- Action: Call get_quarterly_earnings("GOOG").
- Observation: GOOG earnings data.
- Thought: Compare and summarize.
This serial execution is not only slow but also brittle. Each step requires the LLM to recall context from previous steps. With a large context window this might seem acceptable, but it burns tokens and increases the chance of the LLM losing the thread or making incorrect comparisons due to attention drift. The agent might compare AAPL’s sentiment to GOOG’s earnings, or produce a summary that focuses heavily on the last piece of information it processed, neglecting the initial context. This isn’t just about speed; the sequential workflow also degrades the quality of the final answer.
Here’s a simplified Python sketch of how a ReAct agent might approach this, highlighting the sequential calls. We’ll use time.sleep() to simulate network latency, which is a real factor when hitting external APIs.
import time
from typing import Dict, Any

# Mock tool functions
def get_stock_sentiment(ticker: str) -> str:
    print(f"[{time.time():.2f}] Calling sentiment API for {ticker}...")
    time.sleep(2)  # Simulate network latency
    if ticker == "AAPL":
        return "AAPL sentiment: Generally positive with strong holiday sales expectations."
    elif ticker == "GOOG":
        return "GOOG sentiment: Mixed, concerns over advertising spend slowdown."
    return "No sentiment found."

def get_quarterly_earnings(ticker: str) -> Dict[str, Any]:
    print(f"[{time.time():.2f}] Calling earnings API for {ticker}...")
    time.sleep(3)  # Simulate network latency
    if ticker == "AAPL":
        return {"ticker": "AAPL", "Q4_revenue": "119.5B", "Q4_profit": "33.9B"}
    elif ticker == "GOOG":
        return {"ticker": "GOOG", "Q4_revenue": "86.3B", "Q4_profit": "20.7B"}
    return {"ticker": ticker, "Q4_revenue": "N/A", "Q4_profit": "N/A"}

# Simplified ReAct agent loop
def run_sequential_agent(query: str):
    print(f"[{time.time():.2f}] Agent received query: '{query}'")
    context = []

    # Simulate LLM deciding to get AAPL sentiment
    aapl_sentiment = get_stock_sentiment("AAPL")
    context.append(aapl_sentiment)

    # Simulate LLM deciding to get AAPL earnings
    aapl_earnings = get_quarterly_earnings("AAPL")
    context.append(str(aapl_earnings))

    # Simulate LLM deciding to get GOOG sentiment
    goog_sentiment = get_stock_sentiment("GOOG")
    context.append(goog_sentiment)

    # Simulate LLM deciding to get GOOG earnings
    goog_earnings = get_quarterly_earnings("GOOG")
    context.append(str(goog_earnings))

    # Simulate LLM synthesizing all information
    print(f"[{time.time():.2f}] Agent synthesizing report...")
    time.sleep(1)  # Simulate LLM thinking time
    final_report = (
        f"AAPL Sentiment: {aapl_sentiment}\n"
        f"AAPL Q4 Earnings: {aapl_earnings['Q4_revenue']} revenue, {aapl_earnings['Q4_profit']} profit.\n"
        f"GOOG Sentiment: {goog_sentiment}\n"
        f"GOOG Q4 Earnings: {goog_earnings['Q4_revenue']} revenue, {goog_earnings['Q4_profit']} profit.\n\n"
        f"Comparison: AAPL shows stronger Q4 performance and positive sentiment, while GOOG faces mixed sentiment and lower Q4 figures."
    )
    print(f"[{time.time():.2f}] Report:\n{final_report}")

# run_sequential_agent("Summarize the market sentiment for AAPL and compare its Q4 earnings against GOOG.")
The total execution time for the above example would be approximately 2+3+2+3+1 = 11 seconds. More critically, the LLM has to hold all four pieces of data in its context window before it can even start the comparison. If any of those tool calls fail, the entire chain breaks.
From Chains to Graphs: Modeling Workflows as Directed Acyclic Graphs (DAGs)
The solution isn’t to make the LLM “smarter” at managing its context; it’s to provide a workflow primitive that orchestrates the information gathering. The mental model shift required is from a linear chain to a Directed Acyclic Graph (DAG).
In a DAG, each step of your agent’s workflow is a node. These nodes can represent anything: an LLM call, a tool invocation, a data processing step, or even a human review. Edges define the transitions between these nodes. Crucially, multiple nodes can execute in parallel if their inputs are available, and conditional edges allow for dynamic routing based on the output of a node or the current state of the graph.
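To make the node-and-edge model concrete, here is a minimal sketch of a DAG runner built only on Python's standard library. It is an illustration, not the API of any particular agent framework: the run_dag function, the Node tuple shape, and the four example nodes are all hypothetical names invented for this example. Each node lists the upstream nodes it depends on and is submitted to a thread pool as soon as those have produced results, so independent nodes overlap. Conditional edges are left out to keep the sketch short.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical node shape: (function of the shared state, names of upstream nodes)
Node = Tuple[Callable[[Dict[str, Any]], Any], List[str]]

def run_dag(nodes: Dict[str, Node]) -> Dict[str, Any]:
    """Run each node once all of its dependencies have produced a result."""
    state: Dict[str, Any] = {}   # node name -> output, written only by this thread
    pending = dict(nodes)        # nodes not yet submitted
    running = {}                 # future -> node name
    with ThreadPoolExecutor() as pool:
        while pending or running:
            # Submit every pending node whose dependencies are all in the state
            ready = [name for name, (_, deps) in pending.items()
                     if all(dep in state for dep in deps)]
            for name in ready:
                fn, _ = pending.pop(name)
                # Pass a snapshot so worker threads never see a dict being mutated
                running[pool.submit(fn, dict(state))] = name
            if not running:
                raise ValueError(f"Unsatisfiable dependencies (cycle or missing node): {sorted(pending)}")
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for fut in done:
                state[running.pop(fut)] = fut.result()  # re-raises node errors
    return state

# Example graph: "sentiment" and "earnings" both depend only on "parse",
# so they are eligible to run at the same time.
dag: Dict[str, Node] = {
    "parse": (lambda s: ["AAPL", "GOOG"], []),
    "sentiment": (lambda s: {t: f"{t} sentiment" for t in s["parse"]}, ["parse"]),
    "earnings": (lambda s: {t: f"{t} earnings" for t in s["parse"]}, ["parse"]),
    "compare": (
        lambda s: {t: (s["sentiment"][t], s["earnings"][t]) for t in s["parse"]},
        ["parse", "sentiment", "earnings"],
    ),
}
result = run_dag(dag)
```

The scheduling rule is the whole idea: a node is runnable exactly when its inputs exist in the state, and nothing else dictates ordering. A graph with a cycle never reaches that condition for the nodes involved, which is why the sketch raises instead of waiting forever.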
For our financial analysis task, a DAG offers immediate advantages:
- Parallelization: get_stock_sentiment("AAPL") and get_quarterly_earnings("AAPL") can run concurrently. Even better, all four data-fetching calls (AAPL sentiment/earnings, GOOG sentiment/earnings) can run in parallel.
- State Management: The graph maintains a persistent, evolving state that all nodes can read from and write to. This eliminates the context-loss problem inherent in ReAct.
- Robustness: Error handling and retries become explicit nodes or conditional transitions in the graph, rather than implicit logic the LLM has to “reason” about.
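The three advantages above can be sketched together in a few lines. The example below is an assumption-laden illustration rather than a framework implementation: fetch_sentiment, fetch_earnings, with_retry, and gather are hypothetical names, the delays are shortened so it runs quickly, and the earnings stub deliberately fails on its first call per ticker to simulate a transient error. The four fetches fan out onto a thread pool, each wrapped in an explicit retry, and the results fan in to a single state dict keyed by task name.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

def with_retry(fn: Callable[[], Any], attempts: int = 3, backoff_s: float = 0.01) -> Any:
    """Call fn, retrying on any exception with a linearly growing pause."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the error to the caller
            time.sleep(backoff_s * attempt)

# Hypothetical stand-ins for the article's tools, with shortened delays.
def fetch_sentiment(ticker: str) -> str:
    time.sleep(0.2)
    return f"{ticker} sentiment"

_failed_once = set()

def fetch_earnings(ticker: str) -> Dict[str, str]:
    time.sleep(0.3)
    if ticker not in _failed_once:  # simulated transient failure on the first call
        _failed_once.add(ticker)
        raise ConnectionError(f"transient failure for {ticker}")
    return {"ticker": ticker, "Q4_revenue": "N/A"}

def gather(tickers: List[str]) -> Dict[str, Any]:
    # Fan out: one independent task per (ticker, data type) pair
    tasks: Dict[str, Callable[[], Any]] = {}
    for t in tickers:
        tasks[f"{t}_sentiment"] = lambda t=t: fetch_sentiment(t)
        tasks[f"{t}_earnings"] = lambda t=t: fetch_earnings(t)

    state: Dict[str, Any] = {}
    with ThreadPoolExecutor() as pool:
        futures = {key: pool.submit(with_retry, fn) for key, fn in tasks.items()}
        # Fan in: block until every branch has finished, then record its result
        for key, fut in futures.items():
            state[key] = fut.result()
    return state

start = time.perf_counter()
state = gather(["AAPL", "GOOG"])
elapsed = time.perf_counter() - start
print(f"Fetched {sorted(state)} in {elapsed:.2f}s")
```

Because the branches overlap, wall-clock time is bounded by the slowest branch rather than the sum of all of them. Applied to the simulated latencies used earlier (2 seconds for sentiment, 3 for earnings, 1 for synthesis) and assuming no failures, that works out to roughly max(2, 3) + 1 = 4 seconds instead of 11.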
Here’s how our financial analysis task would look as a DAG:
       ┌───────────────────────────────────────┐
       │              Entry Point              │
       └───────────────────────────────────────┘
                           │
                           ▼
       ┌───────────────────────────────────────┐
       │  LLM: Parse Query & Identify Tickers  │
       └───────────────────────────────────────┘
                           │
                           ▼
       ┌───────────────────────────────────────┐
       │ Fan Out: Trigger Parallel Data Calls  │
       └───────────────────────────────────────┘
                           │
      ┌─────────────┬──────┴──────┬─────────────┐
      ▼             ▼             ▼             ▼
┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐
│ Get AAPL  │ │ Get AAPL  │ │ Get GOOG  │ │ Get GOOG  │
│ Sentiment │ │ Earnings  │ │ Sentiment │ │ Earnings  │
└───────────┘ └───────────┘ └───────────┘ └───────────┘
      │             │             │             │
      └─────────────┴──────┬──────┴─────────────┘
                           ▼
       ┌───────────────────────────────────────┐
       │       Fan In: Wait for All Data       │
       └───────────────────────────────────────┘
                           │
                           ▼
       ┌───────────────────────────────────────┐
       │      LLM: Analyze & Compare Data      │
       └───────────────────────────────────────┘
                           │
                           ▼