Observability Guide¶
Agentle4j provides built-in observability through OpenTelemetry, enabling you to trace, measure, and monitor your AI interactions.
Overview¶
flowchart LR
A[Responder] --> B[TelemetryProcessor]
B --> C[OpenTelemetry]
C --> D[Jaeger/Zipkin]
C --> E[Prometheus]
C --> F[Grafana]
B --> G[Langfuse]
Quick Setup¶
Langfuse Integration¶
import com.paragon.telemetry.langfuse.LangfuseProcessor;
LangfuseProcessor langfuse = LangfuseProcessor.builder()
.publicKey("pk-xxx")
.secretKey("sk-xxx")
.build();
Responder responder = Responder.builder()
.openRouter()
.apiKey(apiKey)
.addTelemetryProcessor(langfuse)
.build();
What's Tracked¶
Spans¶
Each API call creates a span with:
| Attribute | Description |
|---|---|
llm.model |
Model used |
llm.provider |
API provider |
llm.input_tokens |
Input token count |
llm.output_tokens |
Output token count |
llm.total_tokens |
Total tokens |
llm.cost |
Cost (OpenRouter only) |
llm.latency_ms |
Request latency |
Metrics¶
| Metric | Type | Description |
|---|---|---|
llm.requests |
Counter | Total requests |
llm.tokens.input |
Counter | Total input tokens |
llm.tokens.output |
Counter | Total output tokens |
llm.latency |
Histogram | Request latency distribution |
llm.errors |
Counter | Error count |
Provider-Specific Features¶
OpenRouter Cost Tracking¶
When using OpenRouter, Agentle4j automatically tracks costs:
Responder responder = Responder.builder()
.openRouter()
.apiKey(apiKey)
.addTelemetryProcessor(langfuse) // Optional: add for observability
.build();
// Response includes cost information
Response response = responder.respond(payload).join();
// Cost is automatically added to telemetry
Custom Telemetry Processor¶
Implement your own processor:
public class CustomTelemetryProcessor implements TelemetryProcessor {
@Override
public void onRequestStart(RequestContext ctx) {
// Log request start
}
@Override
public void onRequestComplete(RequestContext ctx, Response response) {
// Log completion with metrics
}
@Override
public void onRequestError(RequestContext ctx, Throwable error) {
// Log errors
}
}
Langfuse Integration¶
For LLM-specific analytics:
LangfuseProcessor langfuse = LangfuseProcessor.builder()
.publicKey("pk-xxx")
.secretKey("sk-xxx")
.build();
Responder responder = Responder.builder()
.openRouter()
.apiKey(apiKey)
.addTelemetryProcessor(langfuse)
.build();
Langfuse provides:
- Conversation tracing
- Cost analytics
- Quality scoring
- A/B testing support
Telemetry Context¶
Add custom metadata to traces:
TelemetryContext telemetryContext = TelemetryContext.builder()
.userId("user-123")
.traceName("customer-support-chat")
.addTag("production")
.addTag("billing")
.addMetadata("customer_tier", "premium")
.build();
var payload = CreateResponsePayload.builder()
.model("openai/gpt-4o")
.addUserMessage("Help with my invoice")
.build();
// Use with TelemetryContext
responder.respond(payload, telemetryContext);
## Grafana Dashboard
Example PromQL queries for a Grafana dashboard:
```promql
# Request rate
rate(llm_requests_total[5m])
# Average latency
histogram_quantile(0.95, rate(llm_latency_bucket[5m]))
# Token usage
sum(rate(llm_tokens_input_total[1h]))
# Error rate
rate(llm_errors_total[5m]) / rate(llm_requests_total[5m])
Best Practices¶
Use Tracing in Production
Always enable telemetry in production to monitor costs and performance.
Set Sampling Rate
For high-traffic applications, configure OpenTelemetry sampling to reduce overhead.
Sensitive Data
Be careful not to log prompts or responses that contain sensitive user data.
// Configure in OTel SDK
SdkTracerProvider tracerProvider = SdkTracerProvider.builder()
.setSampler(Sampler.traceIdRatioBased(0.1)) // 10% sampling
.build();
Debugging¶
Enable debug logging for development:
Next Steps¶
- Responder Guide - Configure the Responder
- Agents Guide - Monitor agent execution