Designing Tool Interfaces for LLM Agents
The quality of an agent’s tool interfaces determines its ceiling. A well-designed tool turns a mediocre model into a capable agent. A poorly designed one makes even the best model fumble.
The Tool Interface Contract
Every tool an agent uses is essentially an API contract. The model reads a description, decides to call it, provides arguments, and interprets the result. Each step is a point of failure.
# Bad: ambiguous, overloaded
{
  "name": "search",
  "description": "Search for things",
  "parameters": {
    "query": "string",
    "type": "string"  # What types? How does the model know?
  }
}
# Good: specific, self-documenting
{
  "name": "search_documentation",
  "description": "Search the project's technical documentation. Returns matching sections with page references.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Natural language search query. Be specific — 'authentication flow' works better than 'auth'."
      },
      "max_results": {
        "type": "integer",
        "description": "Maximum number of results to return. Default: 5, Max: 20.",
        "default": 5,
        "maximum": 20
      }
    },
    "required": ["query"]
  }
}
Parameter Design Principles
Use enums over free-text when the set of valid values is known. Models are remarkably good at picking from a list, but unreliable at guessing the exact string format you expect.
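For example, a parameter with a fixed set of sources might look like the fragment below (the parameter name and its values are illustrative, not from any real API):

"source": {
  "type": "string",
  "enum": ["docs", "api_reference", "changelog"],
  "description": "Which corpus to search. Must be one of the listed values."
}

The model now chooses among three known strings instead of guessing one.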
Provide defaults for optional parameters. Every parameter without a default is a decision the model has to make — and each decision is a chance for error.
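The default also has to be enforced on the tool side, not just advertised in the schema. A sketch, reusing the search_documentation tool from above with a hypothetical docs_index backend:

def search_documentation(query: str, max_results: int = 5) -> str:
    # Mirror the schema: default of 5, documented cap of 20.
    max_results = max(1, min(max_results, 20))
    results = docs_index.search(query)  # hypothetical search backend
    return format_search_results(results[:max_results])  # see the formatting section below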
Flatten nested objects when possible. Models handle flat parameter lists much better than deeply nested JSON structures.
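As a sketch, compare a nested parameter with its flattened equivalent (the field names are made up for illustration):

# Harder for the model: one nested object parameter
"date_range": {
  "type": "object",
  "properties": {
    "start": {"type": "string"},
    "end": {"type": "string"}
  }
}

# Easier: two flat, explicit parameters
"start_date": {"type": "string", "description": "ISO date, e.g. '2024-01-01'."},
"end_date": {"type": "string", "description": "ISO date, e.g. '2024-12-31'."}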
Error Messages as Guidance
When a tool call fails, the error message is your chance to course-correct the agent:
# Bad: opaque
raise ToolError("Invalid input")

# Good: actionable
raise ToolError(
    "File 'config.yaml' not found in /project/src/. "
    "Available files: ['config.json', 'settings.yaml', 'env.toml']. "
    "Did you mean 'settings.yaml'?"
)
The error message should tell the model exactly what went wrong and what to try instead. Think of it as a prompt for the next attempt.
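One way to apply this is to catch low-level exceptions at the tool boundary and rewrite them into guidance. A minimal sketch, using a hypothetical read_file tool and a ToolError stub standing in for whatever error type your framework uses:

import os

class ToolError(Exception):
    # Raised by tools; the message is surfaced to the model.
    pass

def read_file(path: str) -> str:
    # Hypothetical tool: read a file, turning a bare FileNotFoundError
    # into something the model can act on.
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        directory = os.path.dirname(path) or "."
        available = sorted(os.listdir(directory)) if os.path.isdir(directory) else []
        raise ToolError(
            f"File '{os.path.basename(path)}' not found in {directory}/. "
            f"Available files: {available}. "
            "Check the name against the list and try again."
        )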
Output Formatting
Tool outputs compete for context window space. Keep them concise and structured:
- Return summaries, not raw data dumps
- Truncate long outputs with a clear indicator (a helper for this is sketched below)
- Use consistent formatting across all tools
- Include metadata the model needs for next steps
def format_search_results(results):
    if not results:
        return "No results found. Try broadening your search query."
    output = f"Found {len(results)} results:\n\n"
    for i, r in enumerate(results[:5], 1):
        output += f"{i}. **{r.title}** (relevance: {r.score:.0%})\n"
        output += f"   {r.snippet[:150]}...\n\n"
    if len(results) > 5:
        output += f"({len(results) - 5} more results available)"
    return output
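The truncation point from the list above can be handled with a small generic helper. This sketch uses an arbitrary 4,000-character budget; tune it to your context window:

def truncate_output(text: str, max_chars: int = 4000) -> str:
    # Clamp a tool output and say explicitly that content was cut,
    # so the model knows the result is incomplete rather than short.
    if len(text) <= max_chars:
        return text
    omitted = len(text) - max_chars
    return text[:max_chars] + f"\n\n[... truncated, {omitted} characters omitted]"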
Testing Tool Interfaces
Test tools with the model, not just with unit tests. What seems clear to a human might be ambiguous to an LLM (a harness for the first two checks is sketched after the list):
- Have the model describe what each tool does based solely on its schema
- Give the model a task and see which tools it selects
- Check if the model can recover from common error cases
- Verify the model correctly interprets tool outputs
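A minimal harness for the first two checks, assuming a hypothetical ask_model(prompt) helper that wraps whatever LLM client you use:

import json

def ask_model(prompt: str) -> str:
    # Placeholder: call your LLM of choice and return its text response.
    raise NotImplementedError

def check_schema_clarity(tool_schema: dict) -> str:
    # Show the model only the schema and ask it to explain the tool.
    # If the explanation misses the tool's purpose or parameters, the schema
    # is ambiguous to the model, even if it reads clearly to you.
    prompt = (
        "Here is a tool schema:\n"
        f"{json.dumps(tool_schema, indent=2)}\n\n"
        "In two sentences, explain what this tool does and when you would call it."
    )
    return ask_model(prompt)

def check_tool_selection(task: str, tool_schemas: list[dict]) -> str:
    # Give the model a task and the full toolbox, and see which tool it reaches for.
    prompt = (
        f"Task: {task}\n\n"
        "Available tools:\n"
        f"{json.dumps(tool_schemas, indent=2)}\n\n"
        "Which tool would you call first, and with what arguments? Answer briefly."
    )
    return ask_model(prompt)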
The tools are as much a part of your prompt engineering as the system message.