Responder Guide¶
Last updated: 2026-03-21
The Responder is the core HTTP client for the OpenAI Responses API. It handles request
execution, streaming, telemetry, and provider configuration.
Overview¶
flowchart LR
subgraph Your Application
A[Your Code]
end
subgraph Agentle4j
B[Responder]
C[Payload Builder]
D[OkHttp Client]
end
subgraph Providers
E[OpenRouter]
F[OpenAI]
G[Groq]
H[Custom API]
end
A --> B --> C --> D
D <--> E
D <--> F
D <--> G
D <--> H
The Responder is thread-safe and reusable. Create one instance and share it across your application.
Creating a Responder¶
Basic Setup¶
// Minimal configuration
Responder responder = Responder.builder()
.openRouter()
.apiKey("your-api-key")
.build();
Full Configuration¶
Responder responder = Responder.builder()
.openRouter()
.apiKey(System.getenv("OPENROUTER_API_KEY"))
// Telemetry (optional)
.addTelemetryProcessor(LangfuseProcessor.fromEnv())
.build();
Configuration Options¶
| Option | Default | Description |
|---|---|---|
| `.apiKey(String)` | Required | Your API key |
| `.addTelemetryProcessor(TelemetryProcessor)` | None | Observability integration |
| `.maxRetries(int)` | 3 | Max retry attempts for transient failures |
| `.retryPolicy(RetryPolicy)` | `defaults()` | Full retry configuration |
Retry Configuration¶
Responder automatically retries on transient failures with exponential backoff:
Default Behavior¶
By default, Responder retries 3 times on:

- 429 - Rate limiting
- 500, 502, 503, 504 - Server errors
- Network failures (connection timeout, etc.)
Simple Configuration¶
// Set max retries (uses default backoff settings)
Responder responder = Responder.builder()
.openRouter()
.apiKey(apiKey)
.maxRetries(5) // Retry up to 5 times
.build();
// Disable retries
Responder responder = Responder.builder()
.openRouter()
.apiKey(apiKey)
.maxRetries(0) // No retries
.build();
Advanced Configuration¶
import java.time.Duration;
import java.util.Set;
import com.paragon.http.RetryPolicy;
RetryPolicy policy = RetryPolicy.builder()
.maxRetries(5)
.initialDelay(Duration.ofMillis(500)) // Start with 500ms delay
.maxDelay(Duration.ofSeconds(30)) // Cap at 30 seconds
.multiplier(2.0) // Double delay each retry
.retryableStatusCodes(Set.of(429, 503)) // Only retry these codes
.build();
Responder responder = Responder.builder()
.openRouter()
.apiKey(apiKey)
.retryPolicy(policy)
.build();
Backoff Calculation¶
Retry delays follow exponential backoff:
| Attempt | Delay (default) |
|---|---|
| 1 | 1 second |
| 2 | 2 seconds |
| 3 | 4 seconds |
| 4+ | Up to 30 seconds (max) |
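The delays in the table follow the usual exponential-backoff formula: `delay(attempt) = min(initialDelay × multiplier^(attempt−1), maxDelay)`. A self-contained sketch of that calculation (illustrative only, not the library's actual implementation):

```java
import java.time.Duration;

// Backoff calculation matching the defaults above
// (1s initial delay, 2.0 multiplier, 30s cap). Not Agentle4j source code.
final class BackoffSketch {
    static Duration backoffDelay(int attempt, Duration initial, Duration max, double multiplier) {
        // Exponential growth from the initial delay, capped at the maximum
        double millis = initial.toMillis() * Math.pow(multiplier, attempt - 1);
        return Duration.ofMillis(Math.min((long) millis, max.toMillis()));
    }
}
```

With the defaults, attempts 1-3 yield 1s, 2s, and 4s; later attempts are capped at 30s.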
Supported Providers¶
OpenRouter¶
Access 300+ models through a single API:
Responder responder = Responder.builder()
.openRouter()
.apiKey(System.getenv("OPENROUTER_API_KEY"))
.build();
Available Models (examples):
- openai/gpt-4o, openai/gpt-4o-mini
- anthropic/claude-3.5-sonnet, anthropic/claude-3-opus
- google/gemini-pro, google/gemini-1.5-pro
- meta-llama/llama-3.1-70b-instruct
- See all models →
OpenAI¶
Responder responder = Responder.builder()
.openAi()
.apiKey(System.getenv("OPENAI_API_KEY"))
.build();
Available Models:
- gpt-4o, gpt-4o-mini
- gpt-4-turbo
- o1-preview, o1-mini
Building Payloads¶
The CreateResponsePayload.builder() provides a fluent API:
Basic Payload¶
var payload = CreateResponsePayload.builder()
.model("openai/gpt-4o")
.addDeveloperMessage("You are a helpful assistant.")
.addUserMessage("Hello!")
.build();
All Options¶
var payload = CreateResponsePayload.builder()
// Required
.model("openai/gpt-4o")
// Messages
.addDeveloperMessage("System prompt here") // First message (optional)
.addUserMessage("User's question") // User input
.addAssistantMessage("Previous response") // For multi-turn
// Generation parameters
.temperature(0.7) // Creativity (0.0-2.0)
.topP(0.9) // Nucleus sampling
.maxOutputTokens(1000) // Response length limit
// Advanced
.safetyIdentifier("user-123") // Stable user identifier for abuse detection
.build();
Parameter Reference¶
| Parameter | Range | Description |
|---|---|---|
| `temperature` | 0.0-2.0 | Higher = more creative, lower = more focused |
| `topP` | 0.0-1.0 | Nucleus sampling threshold |
| `maxOutputTokens` | 1+ | Maximum response tokens |
Temperature Examples¶
// Factual/deterministic (code generation, Q&A)
.temperature(0.0)
// Balanced (general chat)
.temperature(0.7)
// Creative (stories, brainstorming)
.temperature(1.2)
Making Requests¶
Synchronous (Default)¶
// Simple blocking call - efficient with Virtual Threads
Response response = responder.respond(payload);
System.out.println(response.outputText());
Parallel Requests¶
With Java 25 Virtual Threads, you can efficiently run parallel requests:
import java.util.concurrent.*;
// Run requests in parallel using virtual threads
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
var future1 = executor.submit(() -> responder.respond(payload1));
var future2 = executor.submit(() -> responder.respond(payload2));
var future3 = executor.submit(() -> responder.respond(payload3));
Response response1 = future1.get();
Response response2 = future2.get();
Response response3 = future3.get();
}
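The fan-out/join pattern above is not specific to Responder. A self-contained sketch with placeholder `Callable` tasks standing in for the `responder.respond(payload)` calls:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Run blocking tasks in parallel on virtual threads; results come back
// in submission order regardless of which task finishes first.
final class FanOutSketch {
    static List<String> fanOut(List<Callable<String>> tasks) throws Exception {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<String>> futures = executor.invokeAll(tasks); // submit all, wait for completion
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // already completed; get() just unwraps
            }
            return results;
        }
    }
}
```

Because virtual threads are cheap, this scales to hundreds of concurrent requests without tuning a thread pool.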
Streaming¶
Enable streaming for real-time responses:
var payload = CreateResponsePayload.builder()
.model("openai/gpt-4o")
.addUserMessage("Write a poem about Java")
.streaming() // Enable streaming
.build();
responder.respond(payload)
.onTextDelta(delta -> {
System.out.print(delta);
System.out.flush();
})
.onComplete(response -> {
System.out.println("\n\nDone!");
})
.onError(Throwable::printStackTrace)
.start();
Streaming Callbacks¶
| Callback | When Called |
|---|---|
| `.onTextDelta(String)` | Each text chunk arrives |
| `.onComplete(Response)` | Stream finished successfully |
| `.onError(Throwable)` | Error occurred |
Structured Output¶
Get type-safe JSON responses:
public record Person(String name, int age, String occupation) {}
var payload = CreateResponsePayload.builder()
.model("openai/gpt-4o")
.addUserMessage("Create a fictional software engineer")
.withStructuredOutput(Person.class)
.build();
ParsedResponse<Person> response = responder.respond(payload);
Person person = response.outputParsed();
Response Object¶
The Response contains all available information:
Response response = responder.respond(payload);
// Text output
String text = response.outputText();
// Metadata
String id = response.id();
String model = response.model();
Number createdAt = response.createdAt();
// Output items (for complex responses)
List<ResponseOutput> items = response.output();
Error Handling¶
Responder is synchronous and currently surfaces request failures as RuntimeException after the
configured retries are exhausted. Streaming failures are delivered to .onError(...), typically as
StreamingException.
Exception Types¶
| Error surface | When it happens |
|---|---|
| `RuntimeException` | Request failed after retries or structured parsing failed |
| `IllegalArgumentException` | Structured payload is misconfigured before the request is sent |
| `StreamingException` | Streaming connection dropped or timed out |
Request Error Handling¶
try {
Response response = responder.respond(payload);
} catch (RuntimeException e) {
System.err.println("Request failed: " + e.getMessage());
}
[!TIP] The built-in retry policy automatically retries retryable HTTP failures before the exception reaches your code. Handle the final failure path only.
Streaming Error Recovery¶
responder.respond(streamingPayload)
.onError(error -> {
if (error instanceof StreamingException se) {
// Recover partial output
String partial = se.partialOutput();
if (partial != null) {
savePartialOutput(partial);
}
System.err.println("Streaming failed after " + se.bytesReceived() + " bytes");
}
})
.start();
Best Practices¶
✅ Do¶
// Reuse the Responder instance
private final Responder responder;
public MyService(String apiKey) {
this.responder = Responder.builder()
.openRouter()
.apiKey(apiKey)
.build();
}
// Load API keys from environment
String apiKey = System.getenv("OPENROUTER_API_KEY");
// Handle errors appropriately
try {
Response response = responder.respond(payload);
} catch (RuntimeException e) {
// handle
}
❌ Don't¶
// Don't create new Responder for each request
public String chat(String message) {
Responder r = Responder.builder()...build(); // Bad!
return r.respond(payload).outputText();
}
// Don't hardcode API keys
Responder responder = Responder.builder()
.apiKey("sk-xxxxxxxxxxxxx") // Bad!
.build();
// Don't ignore errors
Response response = responder.respond(payload); // Add error handling!
Next Steps¶
- Agents Guide - Higher-level agent abstraction
- Streaming Guide - Advanced streaming patterns
- Function Tools Guide - Let AI call your functions