Tool Agent
What is an Agent?
An agent is an AI system that can perceive its environment, make decisions, and take actions to achieve specific goals. In our context, agents are specialized LLM-powered systems that can:
- Understand natural language inputs
- Plan sequences of actions
- Use tools to accomplish tasks
- Maintain context through conversations
- Generate coherent responses
Tool Agent Architecture
Tool Agents use a ReAct (Reasoning + Acting) pattern combined with function calling capabilities from LLM providers. This enables them to:
- Reason about what tools to use
- Execute tools with appropriate parameters
- Observe results
- Plan next steps
Agent Workflow
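The reason, act, observe cycle described above can be sketched as a small loop. This is a minimal illustration only, not the Tool Agent implementation: `fake_llm`, `TOOLS`, `run_agent`, and the message shapes are all hypothetical stand-ins, and a real agent would call the provider's function-calling API instead.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> plain Python callable.
TOOLS: dict[str, Callable[..., object]] = {
    "multiply": lambda a, b: a * b,
}


def fake_llm(messages: list[dict]) -> dict:
    """Stand-in for a function-calling LLM (not a real provider API).

    Requests the multiply tool once, then answers from the observed result.
    """
    observations = [m for m in messages if m["role"] == "tool"]
    if not observations:
        return {"tool_call": {"name": "multiply", "arguments": {"a": 6, "b": 7}}}
    return {"answer": f"The result is {observations[-1]['content']}."}


def run_agent(question: str, llm=fake_llm, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        decision = llm(messages)  # reason: decide on a tool call or a final answer
        if "answer" in decision:
            return decision["answer"]
        call = decision["tool_call"]
        result = TOOLS[call["name"]](**call["arguments"])  # act: execute the tool
        messages.append({"role": "tool", "content": str(result)})  # observe
    return "Stopped: step limit reached."


print(run_agent("What is 6 times 7?"))  # The result is 42.
```

The step limit is a design choice in this sketch: it stops the loop if the model keeps requesting tools without ever producing an answer.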
Configuration Parameters
Essential Parameters
| Parameter | Description | Required | Example |
|---|---|---|---|
| llm_model_id | Bedrock model identifier | Yes | anthropic.claude-3-haiku-20240307-v1:0 |
| system_prompt | Instructions for the agent | Yes | "You are a helpful assistant..." |
| tools | Array of available tools | Yes | See tools configuration |
| inference_config | LLM parameters | No | {"max_tokens": 4000} |
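Before submitting a configuration, it can help to check that the required parameters from the table are present. The helper below is a hypothetical client-side check written for illustration; `missing_params` is not part of any documented API.

```python
# Parameters marked "Yes" in the Required column of the table above.
REQUIRED_PARAMS = ("llm_model_id", "system_prompt", "tools")


def missing_params(config: dict) -> list[str]:
    """Return the required parameters that are absent from config."""
    return [name for name in REQUIRED_PARAMS if name not in config]


config = {
    "llm_model_id": "anthropic.claude-3-haiku-20240307-v1:0",
    "system_prompt": "You are a helpful assistant...",
}
print(missing_params(config))  # ['tools']
```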
Tool Configuration
Each tool requires:
{
"tool_type": "function|rag|mcp",
"name": "tool_name",
"description": "What the tool does",
"func_name": "function_identifier"
}
Example Configurations
Basic Example
{
"llm_model_id": "anthropic.claude-3-haiku-20240307-v1:0",
"system_prompt": "You are a helpful assistant that uses available tools to answer questions.",
"tools": [
{
"tool_type": "function",
"name": "multiply",
"description": "Multiplies two numbers",
"func_name": "multiply_numbers"
}
],
"inference_config": {
"max_tokens": 4000,
"temperature": 0.7
}
}
This basic configuration enables the agent to perform multiplication operations. More complex configurations can be created by adding additional tools and customizing parameters.
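To make the relationship between `name` and `func_name` concrete, here is a rough sketch of how a tool entry might be resolved to a callable. How the platform actually binds `func_name` to code is not described here, so the `FUNCTIONS` lookup, `call_tool`, and the body of `multiply_numbers` are assumptions for illustration.

```python
def multiply_numbers(a: float, b: float) -> float:
    """Hypothetical implementation behind the "multiply" tool."""
    return a * b


# Assumed lookup from func_name to a Python callable.
FUNCTIONS = {"multiply_numbers": multiply_numbers}

# The tools array from the basic example above.
config = {
    "tools": [
        {
            "tool_type": "function",
            "name": "multiply",
            "description": "Multiplies two numbers",
            "func_name": "multiply_numbers",
        }
    ]
}


def call_tool(config: dict, tool_name: str, **kwargs):
    """Find the tool by its name, then invoke the function named by func_name."""
    for tool in config["tools"]:
        if tool["name"] == tool_name:
            return FUNCTIONS[tool["func_name"]](**kwargs)
    raise KeyError(f"unknown tool: {tool_name}")


print(call_tool(config, "multiply", a=3, b=4))  # 12
```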
Structured Output Tools
Structured output for Tool Agents is currently an experimental feature. The API and functionality may change in future releases.
Tool Agents can be configured to return structured data in JSON format using the structured_output tool type. This is useful when you need to extract specific information in a consistent format or transform unstructured text into structured data.
Configuration Example
{
"tool_type": "structured_output",
"name": "structured_output",
"description": "Extracts structured information about a person from text",
"output_schema": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "Full name of the person"
},
"age": {
"type": "integer",
"description": "Age of the person"
},
"occupation": {
"type": "string",
"description": "Person's job or profession"
}
},
"required": ["name"]
}
}
Sample Output
// Input: "John Doe is a 35-year-old software engineer"
// Output:
{
"name": "John Doe",
"age": 35,
"occupation": "software engineer"
}
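Because the feature is experimental, it may be worth checking the returned object against the schema on the client side. The sketch below is a deliberately small, hand-written check that covers only `required` and top-level `type`; `check_output` is a hypothetical helper, and a full validator such as the `jsonschema` package would cover the rest of JSON Schema.

```python
# Maps JSON Schema type names to Python types (top-level checks only).
TYPE_MAP = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}

PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "occupation": {"type": "string"},
    },
    "required": ["name"],
}


def check_output(output: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the output passed."""
    errors = [
        f"missing required field: {field}"
        for field in schema.get("required", [])
        if field not in output
    ]
    for field, value in output.items():
        spec = schema.get("properties", {}).get(field, {})
        expected = TYPE_MAP.get(spec.get("type"))
        if expected is None:
            continue  # unknown field or untyped property: not checked here
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_bool_as_number = isinstance(value, bool) and spec["type"] in ("integer", "number")
        if is_bool_as_number or not isinstance(value, expected):
            errors.append(f"{field}: expected {spec['type']}")
    return errors


good = {"name": "John Doe", "age": 35, "occupation": "software engineer"}
print(check_output(good, PERSON_SCHEMA))  # []
```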
Usage Examples
- Sample: Product Information
{
"tool_type": "structured_output",
"name": "structured_output",
"description": "Extracts product information from text",
"output_schema": {
"type": "object",
"properties": {
"product_name": { "type": "string" },
"price": { "type": "number" },
"currency": { "type": "string" },
"in_stock": { "type": "boolean" }
},
"required": ["product_name", "price"]
}
}
- Sample: Event Details with nested schemas
{
"tool_type": "structured_output",
"name": "structured_output",
"description": "Extracts event information from text",
"output_schema": {
"type": "object",
"properties": {
"event_name": { "type": "string" },
"date": { "type": "string", "format": "date" },
"location": { "type": "string" },
"attendees": {
"type": "array",
"items": { "type": "string" }
}
},
"required": ["event_name", "date"]
}
}
Working with Images
Tool Agents support processing images alongside text inputs. Images can be provided in three ways:
- URL-based images:
{
"content": [
{
"type": "input_text",
"text": "What's in this image?"
},
{
"type": "input_image",
"image_url": "https://example.com/image.jpg"
}
],
"role": "user"
}
- S3 file path (Experimental):
{
"content": [
{
"type": "input_text",
"text": "Analyze this image"
},
{
"type": "input_image",
"image_path": "s3://bucket-name/path/to/image.jpg"
}
],
"role": "user"
}
- Base64-encoded images:
{
"content": [
{
"type": "input_text",
"text": "Describe this image"
},
{
"type": "input_image",
"image_bytes": "<base64-encoded-image>"
}
],
"role": "user"
}
- Ensure images are in common formats (JPG, PNG)
- Keep image sizes reasonable to avoid timeouts
- Include clear text prompts to guide the analysis
- Consider rate limits and costs when processing multiple images
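For the Base64 option, a message in the shape shown above could be assembled as follows. This is a sketch under one assumption: that `image_bytes` expects a plain Base64 string with no data-URI prefix. `image_message` is a hypothetical helper name.

```python
import base64


def image_message(prompt: str, image_bytes: bytes) -> dict:
    """Build a user message carrying a text prompt and a Base64-encoded image."""
    # Assumption: the field takes raw Base64 text, not a "data:image/..." URI.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "content": [
            {"type": "input_text", "text": prompt},
            {"type": "input_image", "image_bytes": encoded},
        ],
        "role": "user",
    }


# In practice the bytes would come from a file, e.g. open(path, "rb").read().
message = image_message("Describe this image", b"\x89PNG\r\n")
print(message["content"][1]["type"])  # input_image
```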
Best Practices
- System Prompts
  - Be specific about the agent's role
  - Define clear boundaries
  - Include error handling instructions
- Tool Selection
  - Choose tools that complement each other
  - Provide clear tool descriptions
  - Limit tools to necessary ones only
- Error Handling
  - Include retry logic
  - Provide fallback options
  - Handle tool failures gracefully
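The error handling advice above (retry, fall back, fail gracefully) might look like this in calling code. It is a generic sketch, not behavior built into Tool Agents; `call_with_retry` and its parameters are hypothetical.

```python
import time


def call_with_retry(tool, *args, retries=3, base_delay=0.5, fallback=None, sleep=time.sleep):
    """Call tool(*args); retry with exponential backoff, then return fallback."""
    for attempt in range(retries):
        try:
            return tool(*args)
        except Exception:
            # Wait before the next attempt; skip the wait after the last one.
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)
    return fallback


def unreliable_search(query: str) -> str:
    """Hypothetical tool that always fails, to show the fallback path."""
    raise TimeoutError("search backend unavailable")


result = call_with_retry(
    unreliable_search, "agents", retries=2, base_delay=0.01, fallback="No results available."
)
print(result)  # No results available.
```

Catching every `Exception` keeps the sketch short; production code would usually retry only on errors known to be transient.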
Best Practices for Structured Output
- Schema Design
  - Keep schemas focused and specific
  - Use clear property names
  - Include descriptions for complex fields
  - Mark required fields appropriately
- Data Validation
  - Use appropriate data types
  - Include format specifications when needed
  - Consider adding enum values for restricted choices
- Error Handling
  - Provide fallback values for optional fields
  - Handle missing or invalid data gracefully
  - Include validation messages in the schema
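Two of these points, enum values for restricted choices and fallback values for optional fields, can be illustrated together. The order schema, `FALLBACKS`, and `normalize` below are invented for this example and are handled on the client side; they are not part of the structured_output tool itself.

```python
# Hypothetical schema: "status" is restricted to an enum, "notes" is optional.
ORDER_SCHEMA = {
    "type": "object",
    "properties": {
        "order_id": {"type": "string", "description": "Identifier printed on the receipt"},
        "status": {"type": "string", "enum": ["pending", "shipped", "delivered"]},
        "notes": {"type": "string"},
    },
    "required": ["order_id", "status"],
}

# Fallback values applied when an optional field is missing from the output.
FALLBACKS = {"notes": ""}


def normalize(output: dict) -> dict:
    """Fill optional fields with fallbacks and reject values outside the enum."""
    result = {**FALLBACKS, **output}
    allowed = ORDER_SCHEMA["properties"]["status"]["enum"]
    if result.get("status") not in allowed:
        raise ValueError(f"status must be one of {allowed}")
    return result


print(normalize({"order_id": "A1", "status": "shipped"}))
```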
Limitations
- The tool execution timeout is fixed and cannot be configured
- Tools are limited to the predefined set (multiply, semantic_search, mcp, structured_output)
Future Enhancements
- Additional tool types planned
- Enhanced memory capabilities
- Dynamic tool loading