Deployment¶
Agentle provides multiple ways to deploy your agents to production environments, from web APIs to interactive UIs. This page covers the available deployment options.
Web API with BlackSheep¶
You can expose your agent or A2A interface as a RESTful API using BlackSheep:
Agent API¶
Here’s how to deploy a simple agent as an API:
from agentle.agents.agent import Agent
from agentle.agents.asgi.blacksheep.agent_to_blacksheep_application_adapter import AgentToBlackSheepApplicationAdapter
from agentle.generations.providers.google.google_genai_generation_provider import GoogleGenaiGenerationProvider
# Create your agent
code_assistant = Agent(
name="Code Assistant",
description="An AI assistant specialized in helping with programming tasks.",
generation_provider=GoogleGenaiGenerationProvider(),
model="gemini-2.0-flash",
instructions="""You are a helpful programming assistant.
You can answer questions about programming languages, help debug code,
explain programming concepts, and provide code examples.""",
)
# Convert the agent to a BlackSheep ASGI application
app = AgentToBlackSheepApplicationAdapter().adapt(code_assistant)
# Run the API server
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="127.0.0.1", port=8000)
The API will have the following endpoints:
POST /api/v1/agents/code_assistant/run - Send prompts to the agent and get responses synchronously
GET /openapi - Get the OpenAPI specification
GET /docs - Access the interactive API documentation
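Once the server is running, you can call the run endpoint with any HTTP client. The request body below (a single "input" field) is an assumption for illustration; the exact schema is published by the generated /openapi and /docs endpoints.
import requests

# The payload field name "input" is an assumption; check /docs for the real schema.
response = requests.post(
    "http://127.0.0.1:8000/api/v1/agents/code_assistant/run",
    json={"input": "How do I reverse a list in Python?"},
    timeout=60,
)
response.raise_for_status()
print(response.json())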
A2A Interface API¶
For more complex asynchronous workloads, you can expose your agent using the Agent-to-Agent (A2A) protocol:
from agentle.agents.a2a.a2a_interface import A2AInterface
from agentle.agents.agent import Agent
from agentle.agents.asgi.blacksheep.agent_to_blacksheep_application_adapter import AgentToBlackSheepApplicationAdapter
from agentle.generations.providers.google.google_genai_generation_provider import GoogleGenaiGenerationProvider
# Create your agent
code_assistant = Agent(
name="Async Code Assistant",
description="An AI assistant specialized in helping with programming tasks asynchronously.",
generation_provider=GoogleGenaiGenerationProvider(),
model="gemini-2.0-flash",
instructions="""You are a helpful programming assistant.
You can answer questions about programming languages, help debug code,
explain programming concepts, and provide code examples.""",
)
# Create an A2A interface for the agent
a2a_interface = A2AInterface(agent=code_assistant)
# Convert the A2A interface to a BlackSheep ASGI application
app = AgentToBlackSheepApplicationAdapter().adapt(a2a_interface)
# Run the API server
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="127.0.0.1", port=8000)
The A2A API will have the following endpoints:
POST /api/v1/tasks/send - Send a task to the agent asynchronously
POST /api/v1/tasks/get - Get task results
POST /api/v1/tasks/cancel - Cancel a running task
WebSocket /api/v1/notifications - Subscribe to push notifications about task status changes
GET /openapi - Get the OpenAPI specification
GET /docs - Access the interactive API documentation
The A2A interface provides a message-broker-style pattern for task processing, similar to RabbitMQ, but exposed through a RESTful API.
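As a rough sketch of the asynchronous flow, you can send a task and later poll for its result. The field names used here ("message", "id") are assumptions for illustration; consult the generated /docs endpoint for the actual A2A request schemas.
import requests

BASE_URL = "http://127.0.0.1:8000/api/v1"

# Field names are illustrative assumptions; check /docs for the real A2A schemas.
sent = requests.post(f"{BASE_URL}/tasks/send", json={"message": "Explain Python decorators."})
sent.raise_for_status()
task_id = sent.json().get("id")

# Later, fetch the task's result (or cancel it via /tasks/cancel).
result = requests.post(f"{BASE_URL}/tasks/get", json={"id": task_id})
result.raise_for_status()
print(result.json())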
Interactive UI with Streamlit¶
Create a chat interface for your agent using Streamlit:
from agentle.agents.agent import Agent
from agentle.agents.ui.streamlit import AgentToStreamlit
from agentle.generations.providers.google.google_genai_generation_provider import GoogleGenaiGenerationProvider
# Create your agent
travel_agent = Agent(
name="Travel Guide",
description="A helpful travel guide that answers questions about destinations.",
generation_provider=GoogleGenaiGenerationProvider(),
model="gemini-2.0-flash",
instructions="""You are a knowledgeable travel guide who helps users plan trips.""",
)
# Convert the agent to a Streamlit app
streamlit_app = AgentToStreamlit(
title="Travel Assistant",
description="Ask me anything about travel destinations and planning!",
initial_mode="presentation", # Can be "dev" or "presentation"
).adapt(travel_agent)
# Run the Streamlit app
if __name__ == "__main__":
    streamlit_app()
Run the app (assuming the code above is saved as travel_app.py):
streamlit run travel_app.py
The Streamlit interface provides:
A clean chat UI for interacting with your agent
Message history persistence within the session
Ability to clear chat history
Dev mode for seeing raw responses and debugging
Custom Integrations¶
For more complex applications, you can directly integrate Agentle agents into your codebase:
Flask Integration¶
from flask import Flask, request, jsonify
from agentle.agents.agent import Agent
from agentle.generations.providers.google.google_genai_generation_provider import GoogleGenaiGenerationProvider
app = Flask(__name__)
# Create your agent
assistant = Agent(
name="Flask Assistant",
generation_provider=GoogleGenaiGenerationProvider(),
model="gemini-2.0-flash",
instructions="You are a helpful assistant integrated with a Flask application."
)
@app.route('/api/chat', methods=['POST'])
def chat():
    user_input = request.json.get('message', '')
    if not user_input:
        return jsonify({'error': 'No message provided'}), 400

    # Run the agent
    response = assistant.run(user_input)

    # Return the response
    return jsonify({
        'response': response.text,
        # Optionally include other response data
        'raw': response.raw if hasattr(response, 'raw') else None
    })
if __name__ == '__main__':
    app.run(debug=True)
FastAPI Integration¶
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from agentle.agents.agent import Agent
from agentle.generations.providers.google.google_genai_generation_provider import GoogleGenaiGenerationProvider
app = FastAPI()
# Define request and response models
class ChatRequest(BaseModel):
    message: str

class ChatResponse(BaseModel):
    response: str
# Create your agent
assistant = Agent(
name="FastAPI Assistant",
generation_provider=GoogleGenaiGenerationProvider(),
model="gemini-2.0-flash",
instructions="You are a helpful assistant integrated with a FastAPI application."
)
@app.post("/api/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    if not request.message:
        raise HTTPException(status_code=400, detail="No message provided")

    # Run the agent
    response = assistant.run(request.message)

    # Return the response
    return ChatResponse(response=response.text)
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="127.0.0.1", port=8000)
Production Considerations¶
When deploying Agentle agents to production, consider the following:
Scaling¶
Consider running multiple instances behind a load balancer for high-traffic applications
For A2A implementations, use a proper message broker (e.g., Redis, RabbitMQ) for task queue management
Use server-side caching strategies to reduce repeated model calls
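As a minimal caching sketch, you can memoize agent calls keyed on the prompt text so identical requests skip the model call; a real deployment would typically use a shared cache such as Redis rather than an in-process one. This assumes the assistant agent from the integration examples above.
from functools import lru_cache

@lru_cache(maxsize=1024)
def cached_answer(prompt: str) -> str:
    # Identical prompts are served from the in-process cache instead of the model.
    return assistant.run(prompt).text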
Security¶
Implement proper authentication for API endpoints
Consider rate limiting to prevent abuse
Be mindful of the data sent to external LLM providers
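A minimal sketch of API-key authentication for the FastAPI integration above, reusing app, assistant, ChatRequest, and ChatResponse from that example. The header name and environment variable are assumptions; substitute your own auth scheme and add rate limiting (for example via middleware or a reverse proxy) in production.
import os
from fastapi import Depends, Header, HTTPException

def require_api_key(x_api_key: str = Header(default="")):
    # Compare against a key kept in the environment, never hard-coded.
    if x_api_key != os.environ.get("AGENT_API_KEY"):
        raise HTTPException(status_code=401, detail="Invalid API key")

# Attach the dependency to the chat endpoint from the FastAPI example.
@app.post("/api/chat", response_model=ChatResponse, dependencies=[Depends(require_api_key)])
async def chat(request: ChatRequest):
    response = assistant.run(request.message)
    return ChatResponse(response=response.text)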
Cost Management¶
Monitor and log usage metrics to track costs
Consider implementing caching strategies for common queries
Use appropriate model size/type based on complexity requirements
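One way to match model choice to complexity is to keep two agents and route short, simple prompts to the cheaper one. The length heuristic and the larger model name below are illustrative assumptions; this reuses the Agent and GoogleGenaiGenerationProvider imports from the examples above.
# Illustrative routing; the length heuristic and "gemini-2.0-pro" are assumptions.
fast_agent = Agent(
    name="Fast Assistant",
    generation_provider=GoogleGenaiGenerationProvider(),
    model="gemini-2.0-flash",
    instructions="You are a concise, helpful assistant.",
)
strong_agent = Agent(
    name="Strong Assistant",
    generation_provider=GoogleGenaiGenerationProvider(),
    model="gemini-2.0-pro",  # hypothetical larger model; substitute your provider's option
    instructions="You are a thorough assistant for complex, multi-step questions.",
)

def answer(prompt: str) -> str:
    agent = strong_agent if len(prompt) > 500 else fast_agent
    return agent.run(prompt).text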
Monitoring¶
Implement logging for requests and responses
Set up error alerting
Use Agentle’s observability features to track performance and usage
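A minimal request/response logging sketch around the agent call; a production setup would also record errors, token usage, and model metadata, which Agentle's observability features cover in more depth. The assistant here is the agent from the integration examples above.
import logging
import time

logger = logging.getLogger("agentle.deployment")

def answer_with_logging(prompt: str) -> str:
    start = time.perf_counter()
    response = assistant.run(prompt)
    elapsed = time.perf_counter() - start
    # Log the prompt, a short reply preview, and the latency for this call.
    logger.info("prompt=%r reply_preview=%r latency=%.2fs", prompt, response.text[:80], elapsed)
    return response.text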
Deployment Environment¶
Use a production-grade ASGI server like Uvicorn or Hypercorn behind a reverse proxy like Nginx
Deploy using containerization (Docker) for consistency across environments
Consider serverless deployment options for scalable, on-demand usage
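For example, a multi-worker Uvicorn invocation, assuming the ASGI app from one of the examples above is saved as main.py and exposed as app. In a container you would typically run the equivalent uvicorn main:app --workers 4 command directly.
import uvicorn

if __name__ == "__main__":
    # Multiple workers require passing the app as an import string.
    uvicorn.run("main:app", host="0.0.0.0", port=8000, workers=4)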