Tool Planning Guide¶
Last updated: 2026-02-23
Batch multiple tool calls into a single execution plan, saving tokens and reducing latency by eliminating unnecessary LLM round-trips.
The Problem: Unnecessary Round-Trips¶
Without tool planning, every tool call requires a full LLM round-trip. If the LLM needs to call 5 tools (some depending on others), that means 5+ separate API calls — each re-sending the entire conversation history:
sequenceDiagram
participant Agent
participant LLM
participant Tool1
participant Tool2
participant Tool3
Agent->>LLM: Request + full context
LLM->>Agent: Call tool_1
Agent->>Tool1: Execute
Tool1->>Agent: Result
Agent->>LLM: Result + full context (again)
LLM->>Agent: Call tool_2
Agent->>Tool2: Execute
Tool2->>Agent: Result
Agent->>LLM: Result + full context (again)
LLM->>Agent: Call tool_3
Agent->>Tool3: Execute
Tool3->>Agent: Result
Agent->>LLM: Result + full context (again)
LLM->>Agent: Final answer
Each round-trip adds latency and re-sends the full context, wasting tokens.
The Solution: Declarative Tool Plans¶
With tool planning enabled, the LLM produces a single execution plan describing all the tool calls and their data dependencies. The framework executes the plan locally and returns only the final results:
sequenceDiagram
participant Agent
participant LLM
participant Framework
participant Tools
Agent->>LLM: Request + full context
LLM->>Agent: execute_tool_plan({steps: [...]})
Agent->>Framework: Execute plan locally
Framework->>Tools: tool_1 (parallel)
Framework->>Tools: tool_2 (parallel)
Tools->>Framework: Results
Framework->>Tools: tool_3 (uses results from 1 & 2)
Tools->>Framework: Result
Framework->>Agent: Summary of output steps only
Agent->>LLM: Single summary + context
LLM->>Agent: Final answer
One LLM round-trip instead of five. Intermediate results never enter the context window.
Enabling Tool Planning¶
Add .enableToolPlanning() to your agent builder:
Agent agent = Agent.builder()
.name("ResearchAssistant")
.model("openai/gpt-4o")
.instructions("You are a research assistant that gathers and compares data.")
.responder(responder)
.addTool(new GetWeatherTool())
.addTool(new GetNewsTool())
.addTool(new CompareDataTool())
.enableToolPlanning() // Registers the execute_tool_plan meta-tool
.build();
// The LLM can now either:
// 1. Call tools individually (existing behavior, unchanged)
// 2. Call execute_tool_plan with a batched plan
AgentResult result = agent.interact("Compare the weather in Tokyo and London");
Opt-in Only
Tool planning is disabled by default. Existing agents work exactly as before. The LLM decides whether to use individual tool calls or a plan based on the task.
The Plan Format¶
When the LLM uses tool planning, it produces a JSON plan as the argument to execute_tool_plan:
{
"steps": [
{
"id": "weather_tokyo",
"tool": "get_weather",
"arguments": "{\"location\": \"Tokyo\"}"
},
{
"id": "weather_london",
"tool": "get_weather",
"arguments": "{\"location\": \"London\"}"
},
{
"id": "comparison",
"tool": "compare_data",
"arguments": "{\"data_a\": \"$ref:weather_tokyo\", \"data_b\": \"$ref:weather_london\"}"
}
],
"output_steps": ["comparison"]
}
Each step has:
| Field | Type | Description |
|---|---|---|
| `id` | String | Unique identifier, used by `$ref` references from other steps |
| `tool` | String | Name of the `FunctionTool` to call |
| `arguments` | String | JSON string of arguments. May contain `$ref` references |
The output_steps field lists which step results should be returned to the LLM. If omitted, all results are returned.
Reference Syntax¶
Steps can reference outputs from previous steps using $ref syntax inside their arguments:
| Syntax | Description | Example |
|---|---|---|
| `"$ref:step_id"` | Full output of a previous step | `"$ref:weather_tokyo"` |
| `"$ref:step_id.field"` | Extract a JSON field from a step's output | `"$ref:weather_tokyo.temp"` |
| `"$ref:step_id.field.nested"` | Nested JSON field extraction | `"$ref:user_data.address.city"` |
How values are inserted:

- If the referenced output is a JSON object or array, it is inserted unquoted (as structured JSON)
- If the output is plain text, it is inserted as a quoted string
- If a field path points to a missing field, `null` is inserted
- Numbers and booleans are inserted as their JSON primitives
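These insertion rules can be sketched in a few lines. This is a simplified illustration, not the framework's actual resolver: `RefResolver` is a hypothetical class name, it handles only whole-step references (no `.field` extraction), and it distinguishes structured from plain output by the leading character rather than real JSON parsing.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RefResolver {
    // Matches "$ref:step_id" tokens appearing as quoted JSON string values.
    private static final Pattern REF = Pattern.compile("\"\\$ref:([A-Za-z0-9_]+)\"");

    // Hypothetical helper: substitutes step outputs into an arguments string.
    // JSON objects/arrays are spliced in unquoted; plain text stays quoted;
    // a missing step resolves to null.
    static String resolve(String arguments, Map<String, String> outputs) {
        Matcher m = REF.matcher(arguments);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            String out = outputs.get(m.group(1));
            String replacement;
            if (out == null) {
                replacement = "null";                 // missing step or field
            } else if (out.startsWith("{") || out.startsWith("[")) {
                replacement = out;                    // structured JSON, unquoted
            } else {
                replacement = "\"" + out + "\"";      // plain text, quoted
            }
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```

Resolving `{"data_a": "$ref:weather_tokyo"}` against an output map containing `weather_tokyo -> {"temp": 25}` would splice the object in unquoted, exactly as the rules above describe.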
How Execution Works¶
flowchart TB
V[Validate Plan] --> D[Build Dependency Graph]
D --> T[Topological Sort into Waves]
T --> W0[Wave 0: No dependencies]
T --> W1[Wave 1: Depends on Wave 0]
T --> WN[Wave N: Depends on 0..N-1]
W0 --> P0[Execute in Parallel]
P0 --> R0[Resolve $ref for Wave 1]
R0 --> W1
W1 --> P1[Execute in Parallel]
P1 --> R1[Resolve $ref for Wave N]
R1 --> WN
WN --> PN[Execute in Parallel]
PN --> O[Return output_steps Results]
1. Validate — Check for duplicate IDs, unknown tools, and recursive plan references
2. Build dependency graph — Scan arguments for `$ref` patterns to identify which steps depend on which
3. Topological sort — Group steps into execution "waves" using Kahn's algorithm
4. Execute waves — Each wave runs all its steps in parallel using virtual threads (`StructuredTaskScope`)
5. Resolve references — After each wave completes, `$ref` values are resolved for the next wave
6. Return results — Only `output_steps` results go back to the LLM context
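The wave-grouping step can be illustrated with a minimal Kahn's-algorithm sketch. `WavePlanner` is a hypothetical name, and the input is assumed to be a pre-built dependency map (step id to the ids it depends on) rather than raw plan JSON:

```java
import java.util.*;

public class WavePlanner {
    // Hypothetical sketch: groups step ids into execution waves using
    // Kahn's algorithm. `deps` maps every step id to the ids it depends on
    // (an empty set for steps with no dependencies).
    static List<List<String>> waves(Map<String, Set<String>> deps) {
        Map<String, Integer> indegree = new HashMap<>();
        Map<String, List<String>> dependents = new HashMap<>();
        for (var e : deps.entrySet()) {
            indegree.put(e.getKey(), e.getValue().size());
            for (String d : e.getValue()) {
                dependents.computeIfAbsent(d, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        List<List<String>> waves = new ArrayList<>();
        List<String> ready = new ArrayList<>();
        indegree.forEach((id, n) -> { if (n == 0) ready.add(id); });
        List<String> current = ready;
        while (!current.isEmpty()) {
            Collections.sort(current);            // deterministic order for the example
            waves.add(List.copyOf(current));
            List<String> next = new ArrayList<>();
            for (String id : current) {
                for (String dep : dependents.getOrDefault(id, List.of())) {
                    // A step becomes ready once all of its dependencies have run.
                    if (indegree.merge(dep, -1, Integer::sum) == 0) next.add(dep);
                }
            }
            current = next;
        }
        return waves;   // a cycle would leave steps unscheduled -> validation error
    }
}
```

For the Tokyo/London plan above this yields two waves: the two `get_weather` steps together, then `comparison`.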
Parallel Execution¶
Steps with no dependencies on each other automatically execute in parallel. The framework uses Java virtual threads for efficient concurrency:
{
"steps": [
{"id": "a", "tool": "get_weather", "arguments": "{\"location\": \"Tokyo\"}"},
{"id": "b", "tool": "get_weather", "arguments": "{\"location\": \"London\"}"},
{"id": "c", "tool": "get_weather", "arguments": "{\"location\": \"Paris\"}"},
{"id": "summary", "tool": "summarize", "arguments": "{\"data\": [\"$ref:a\", \"$ref:b\", \"$ref:c\"]}"}
],
"output_steps": ["summary"]
}
In this example:
- Wave 0: a, b, and c execute in parallel (no dependencies)
- Wave 1: summary executes after all three complete (depends on a, b, c)
This is especially powerful when each tool call involves network I/O — three 500ms API calls take 500ms total instead of 1500ms.
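One wave's concurrent execution might look like the sketch below. The framework is described as using `StructuredTaskScope`; this simplified version uses the stable virtual-thread executor API instead (Java 21+), and `WaveExecutor` plus the `callTool` callback are hypothetical names standing in for the real tool-dispatch machinery:

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.function.Function;

public class WaveExecutor {
    // Hypothetical sketch: runs one wave's steps concurrently on virtual
    // threads and collects each step's output keyed by step id.
    static Map<String, String> runWave(List<String> stepIds,
                                       Function<String, String> callTool) {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            Map<String, Future<String>> futures = new LinkedHashMap<>();
            for (String id : stepIds) {
                futures.put(id, executor.submit(() -> callTool.apply(id)));
            }
            Map<String, String> results = new LinkedHashMap<>();
            for (var e : futures.entrySet()) {
                try {
                    results.put(e.getKey(), e.getValue().get()); // waits for completion
                } catch (InterruptedException | ExecutionException ex) {
                    throw new RuntimeException("step " + e.getKey() + " failed", ex);
                }
            }
            return results;
        }
    }
}
```

Because every task gets its own virtual thread, three blocking 500ms tool calls in one wave overlap rather than queue.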
Error Handling¶
The executor uses a fail-forward strategy:
- If a step fails, its dependents are skipped (marked as failed with a reason)
- Independent steps continue executing regardless of failures
- The final result includes both successful outputs and error details
{
"steps": [
{"id": "s1", "tool": "flaky_api", "arguments": "{}"},
{"id": "s2", "tool": "reliable_api", "arguments": "{}"},
{"id": "s3", "tool": "process", "arguments": "{\"data\": \"$ref:s1\"}"}
],
"output_steps": ["s2", "s3"]
}
If s1 fails:
- s2 still executes successfully (no dependency on s1)
- s3 is skipped with the message "Skipped because dependency 's1' failed"
- The LLM receives both the s2 result and the s3 error, and can decide what to do
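The skip propagation can be sketched as a small fixed-point computation. `FailForward` is a hypothetical name; the real executor tracks this during execution rather than as a separate pass:

```java
import java.util.*;

public class FailForward {
    // Hypothetical sketch: given the set of failed step ids and each step's
    // dependencies, marks every transitive dependent as skipped. Steps that
    // depend on neither a failed nor a skipped step are left alone.
    static Set<String> skipped(Set<String> failed, Map<String, Set<String>> deps) {
        Set<String> skipped = new HashSet<>();
        boolean changed = true;
        while (changed) {                       // iterate until no new skips appear
            changed = false;
            for (var e : deps.entrySet()) {
                if (skipped.contains(e.getKey())) continue;
                for (String d : e.getValue()) {
                    if (failed.contains(d) || skipped.contains(d)) {
                        skipped.add(e.getKey());
                        changed = true;
                        break;
                    }
                }
            }
        }
        return skipped;
    }
}
```

Applied to the plan above with `s1` failed, only `s3` lands in the skipped set; `s2` runs normally.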
Context Window Savings¶
The key benefit of tool planning is reduced token consumption:
| Scenario | Without Planning | With Planning |
|---|---|---|
| 5 sequential tool calls | 5 LLM round-trips, each re-sending full context | 1 round-trip for plan + 1 for final answer |
| Context tokens per round-trip | ~4,000 (grows with history) | Same ~4,000 but only twice |
| Tool result tokens in context | All 5 results stay in context | Only output_steps results |
| Total token overhead | ~20,000+ input tokens | ~8,000 input tokens |
Use output_steps to Maximize Savings
Always specify output_steps to return only the results the LLM actually needs.
Intermediate step results are discarded — they never enter the context window.
When to Use Tool Planning¶
| Feature | Individual Tool Calls | Tool Planning | Sub-Agents |
|---|---|---|---|
| Best for | Simple, 1-2 tool calls | Multi-step workflows with dependencies | Complex delegation requiring LLM reasoning |
| LLM round-trips | 1 per tool call | 1 for the entire plan | 1+ per sub-agent turn |
| Parallel execution | Only if LLM requests parallel | Automatic for independent steps | No (sequential) |
| Data flow | Via context history | Via `$ref` references | Via sub-agent input/output |
| Context usage | All results in context | Only `output_steps` in context | Sub-agent results in context |
| Error recovery | LLM can retry each step | Fail-forward, LLM sees errors after | Sub-agent handles errors independently |
| Enable | Default behavior | `.enableToolPlanning()` | `.addSubAgent(agent, desc)` |
Combining with Tool Search¶
Yes, Tool Planning and Tool Search work seamlessly together!
While they solve different problems, combining them creates highly efficient agents:
- Tool Search solves the prompt size problem by filtering which tools are available to the LLM.
- Tool Planning solves the round-trip latency problem by allowing the LLM to batch calls to those available tools.
When both are enabled, the execute_tool_plan meta-tool is automatically injected as an eager tool. The LLM will receive the execute_tool_plan tool alongside the tools discovered by the search strategy, and can seamlessly output a batched plan utilizing them.
Best Practices¶
Do¶
// Use tool planning when you have multiple tools with data dependencies
Agent agent = Agent.builder()
.name("DataPipeline")
.instructions("""
You orchestrate data pipeline tasks.
When you need to fetch from multiple sources and then combine results,
use execute_tool_plan to batch the operations.
""")
.addTool(new FetchFromDatabaseTool())
.addTool(new FetchFromApiTool())
.addTool(new MergeResultsTool())
.enableToolPlanning()
.build();
// Keep tools focused and composable — they work better in plans
@FunctionMetadata(
name = "get_user",
description = "Fetch a user by ID. Returns JSON with name, email, role fields."
)
public class GetUserTool extends FunctionTool<GetUserParams> {
// Return structured JSON so $ref:step_id.field works well
}
Don't¶
// Don't enable tool planning if you only have 1-2 simple tools
Agent agent = Agent.builder()
.name("SimpleHelper")
.addTool(new GetTimeTool())
.enableToolPlanning() // Unnecessary — adds overhead for simple cases
.build();
// Don't return unstructured text from tools used in plans
// BAD: "The weather is 25 degrees and sunny in Tokyo"
// GOOD: {"temp": 25, "condition": "sunny", "city": "Tokyo"}
// Structured output enables $ref:step_id.field extraction
Safety¶
- No code execution — Plans are declarative JSON, not executable code. The framework interprets them; it does not `eval` anything.
- No recursive plans — A plan cannot call `execute_tool_plan` within itself. The executor rejects this.
- No arbitrary tools — Only tools registered with the agent can be called from a plan. Unknown tool names cause validation errors.
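The second and third guarantees are simple to enforce up front. This is an illustrative sketch, not the framework's actual validator: `PlanValidator` and its `Step` record are hypothetical names, and the registered-tool set is assumed to come from the agent builder:

```java
import java.util.*;

public class PlanValidator {
    // Hypothetical minimal model of one plan step.
    record Step(String id, String tool, String arguments) {}

    // Sketch of the validation rules described above: duplicate ids,
    // recursive execute_tool_plan calls, and unregistered tools are all
    // rejected before anything runs.
    static List<String> validate(List<Step> steps, Set<String> registeredTools) {
        List<String> errors = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (Step s : steps) {
            if (!seen.add(s.id()))
                errors.add("Duplicate step id: " + s.id());
            if (s.tool().equals("execute_tool_plan"))
                errors.add("Recursive plan in step: " + s.id());
            else if (!registeredTools.contains(s.tool()))
                errors.add("Unknown tool: " + s.tool());
        }
        return errors;
    }
}
```

A plan that passes this check still only touches tools the agent already exposes, which is what keeps the feature declarative rather than a code-execution surface.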
Next Steps¶
- Function Tools Guide - Create tools that work great in plans
- Agents Guide - Build agents with tools, handoffs, and sub-agents
- Streaming Guide - Real-time streaming with tool calls