Designing Tool Interfaces for LLM Agents
The quality of an agent’s tool interfaces determines its ceiling. A well-designed tool turns a mediocre model into a capable agent. A poorly designed one makes even the best model fumble.
The Tool Interface Contract
Every tool an agent uses is essentially an API contract. The model reads a description, decides to call it, provides arguments, and interprets the result. Each step is a point of failure.
# Bad: ambiguous, overloaded
{
  "name": "search",
  "description": "Search for things",
  "parameters": {
    "query": "string",
    "type": "string"  # What types? How does the model know?
  }
}
# Good: specific, self-documenting
{
  "name": "search_documentation",
  "description": "Search the project's technical documentation. Returns matching sections with page references.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Natural language search query. Be specific — 'authentication flow' works better than 'auth'."
      },
      "max_results": {
        "type": "integer",
        "description": "Maximum number of results to return. Default: 5, Max: 20.",
        "default": 5,
        "maximum": 20
      }
    },
    "required": ["query"]
  }
}
Parameter Design Principles
Use enums over free-text when the set of valid values is known. Models are remarkably good at picking from a list, but unreliable at guessing the exact string format you expect.
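For example, a parameter with a fixed set of sources might look like the fragment below (the parameter name and its values are illustrative, not from any real API):

"source": {
  "type": "string",
  "enum": ["docs", "api_reference", "changelog"],
  "description": "Which corpus to search. Must be one of the listed values."
}

The model now chooses among three known strings instead of guessing one.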
Provide defaults for optional parameters. Every parameter without a default is a decision the model has to make — and each decision is a chance for error.
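The default also has to be enforced on the tool side, not just advertised in the schema. A sketch, reusing the search_documentation tool from above with a hypothetical docs_index backend:

def search_documentation(query: str, max_results: int = 5) -> str:
    # Mirror the schema: default of 5, documented cap of 20.
    max_results = max(1, min(max_results, 20))
    results = docs_index.search(query)  # hypothetical search backend
    return format_search_results(results[:max_results])  # see the formatting section below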
Flatten nested objects when possible. Models handle flat parameter lists much better than deeply nested JSON structures.
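As a sketch, compare a nested parameter with its flattened equivalent (the field names are made up for illustration):

# Harder for the model: one nested object parameter
"date_range": {
  "type": "object",
  "properties": {
    "start": {"type": "string"},
    "end": {"type": "string"}
  }
}

# Easier: two flat, explicit parameters
"start_date": {"type": "string", "description": "ISO date, e.g. '2024-01-01'."},
"end_date": {"type": "string", "description": "ISO date, e.g. '2024-12-31'."}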
Error Messages as Guidance
When a tool call fails, the error message is your chance to course-correct the agent:
# Bad: opaque
raise ToolError("Invalid input")

# Good: actionable
raise ToolError(
    "File 'config.yaml' not found in /project/src/. "
    "Available files: ['config.json', 'settings.yaml', 'env.toml']. "
    "Did you mean 'settings.yaml'?"
)
The error message should tell the model exactly what went wrong and what to try instead. Think of it as a prompt for the next attempt.
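One way to apply this is to catch low-level exceptions at the tool boundary and rewrite them into guidance. A minimal sketch, using a hypothetical read_file tool and a ToolError stub standing in for whatever error type your framework uses:

import os

class ToolError(Exception):
    # Raised by tools; the message is surfaced to the model.
    pass

def read_file(path: str) -> str:
    # Hypothetical tool: read a file, turning a bare FileNotFoundError
    # into something the model can act on.
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        directory = os.path.dirname(path) or "."
        available = sorted(os.listdir(directory)) if os.path.isdir(directory) else []
        raise ToolError(
            f"File '{os.path.basename(path)}' not found in {directory}/. "
            f"Available files: {available}. "
            "Check the name against the list and try again."
        )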
Output Formatting
Tool outputs compete for context window space. Keep them concise and structured:
- Return summaries, not raw data dumps
- Truncate long outputs with a clear indicator (a helper for this is sketched below)
- Use consistent formatting across all tools
- Include metadata the model needs for next steps
def format_search_results(results):
    if not results:
        return "No results found. Try broadening your search query."
    output = f"Found {len(results)} results:\n\n"
    for i, r in enumerate(results[:5], 1):
        output += f"{i}. **{r.title}** (relevance: {r.score:.0%})\n"
        output += f"   {r.snippet[:150]}...\n\n"
    if len(results) > 5:
        output += f"({len(results) - 5} more results available)"
    return output
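The truncation point from the list above can be handled with a small generic helper. This sketch uses an arbitrary 4,000-character budget; tune it to your context window:

def truncate_output(text: str, max_chars: int = 4000) -> str:
    # Clamp a tool output and say explicitly that content was cut,
    # so the model knows the result is incomplete rather than short.
    if len(text) <= max_chars:
        return text
    omitted = len(text) - max_chars
    return text[:max_chars] + f"\n\n[... truncated, {omitted} characters omitted]"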
Testing Tool Interfaces
Test tools with the model, not just with unit tests. What seems clear to a human might be ambiguous to an LLM (a harness for the first two checks is sketched after the list):
- Have the model describe what each tool does based solely on its schema
- Give the model a task and see which tools it selects
- Check if the model can recover from common error cases
- Verify the model correctly interprets tool outputs
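A minimal harness for the first two checks, assuming a hypothetical ask_model(prompt) helper that wraps whatever LLM client you use:

import json

def ask_model(prompt: str) -> str:
    # Placeholder: call your LLM of choice and return its text response.
    raise NotImplementedError

def check_schema_clarity(tool_schema: dict) -> str:
    # Show the model only the schema and ask it to explain the tool.
    # If the explanation misses the tool's purpose or parameters, the schema
    # is ambiguous to the model, even if it reads clearly to you.
    prompt = (
        "Here is a tool schema:\n"
        f"{json.dumps(tool_schema, indent=2)}\n\n"
        "In two sentences, explain what this tool does and when you would call it."
    )
    return ask_model(prompt)

def check_tool_selection(task: str, tool_schemas: list[dict]) -> str:
    # Give the model a task and the full toolbox, and see which tool it reaches for.
    prompt = (
        f"Task: {task}\n\n"
        "Available tools:\n"
        f"{json.dumps(tool_schemas, indent=2)}\n\n"
        "Which tool would you call first, and with what arguments? Answer briefly."
    )
    return ask_model(prompt)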
The tools are as much a part of your prompt engineering as the system message.