Idempotency
Overview
For POST endpoints that trigger expensive operations (LLM calls, case generation), include the Idempotency-Key header to safely retry requests without duplicate processing or billing.
Usage
curl -X POST https://api.tanfi.ai/api/v1/ext/kb/query \
-H "X-API-Key: bach_live_..." \
-H "Idempotency-Key: 550e8400-e29b-41d4-a716-446655440000" \
-H "Content-Type: application/json" \
-d '{"query": "Basel III capital requirements", "top_k": 5}'
Behavior
| Scenario | Result |
|---|---|
| First request with key | Processed normally, response stored |
| Same key + same request body | Stored response replayed (free, no billing) |
| Same key + different request body | 409 Conflict |
| No header provided | Normal processing (backward compatible) |
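The table can be read as a small decision procedure. The following is a minimal in-memory sketch of that procedure, not the gateway's implementation: the `IdempotencyStore` class, its `handle` method, and the `replayed` flag are hypothetical names introduced for illustration, and a plain SHA-256 of the body stands in for the real fingerprint.

```python
import hashlib
from typing import Callable, Optional


class IdempotencyStore:
    """Toy in-memory model of the behavior table (illustrative only)."""

    def __init__(self) -> None:
        # idempotency key -> (body digest, stored response)
        self._seen: dict[str, tuple[str, dict]] = {}

    def handle(self, key: Optional[str], body: bytes,
               process: Callable[[bytes], dict]) -> tuple[int, dict, bool]:
        """Return (status, response, replayed)."""
        if key is None:
            # No header provided: normal processing, nothing is stored.
            return 200, process(body), False

        digest = hashlib.sha256(body).hexdigest()
        if key in self._seen:
            stored_digest, stored_response = self._seen[key]
            if stored_digest != digest:
                # Same key, different body: conflict instead of a replay.
                return 409, {"detail": {"error": "idempotency_conflict"}}, False
            # Same key, same body: replay without calling process() again.
            return 200, stored_response, True

        # First request with this key: process it and store the response.
        response = process(body)
        self._seen[key] = (digest, response)
        return 200, response, False
```

Note that `process` runs only for the first request with a given key and for requests without a key, which is what makes a replay free in this model.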
Replay Detection
Replayed responses include the header:
Replays skip cost recording entirely — retries are free.
Key Rules
- keys must be ≤ 255 characters
- keys are scoped to your API key (api_key_id + idempotency_key is the uniqueness constraint, so different API consumers can safely use the same idempotency key value)
- keys expire after 24 hours
- UUIDs or request-specific identifiers are recommended
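A client can generate and sanity-check keys before sending them. The sketch below uses Python's standard `uuid` module; the helper names `new_idempotency_key` and `validate_idempotency_key` are hypothetical, and rejecting an empty key is an assumption rather than a documented rule.

```python
import uuid

MAX_KEY_LENGTH = 255  # limit stated in the key rules above


def new_idempotency_key() -> str:
    """Generate a fresh random UUID (version 4) as a 36-character string."""
    return str(uuid.uuid4())


def validate_idempotency_key(key: str) -> str:
    """Client-side check before sending (hypothetical helper)."""
    if not key:
        # Assumption: an empty key is treated as invalid on the client side.
        raise ValueError("idempotency key must not be empty")
    if len(key) > MAX_KEY_LENGTH:
        raise ValueError("idempotency key exceeds 255 characters")
    return key
```

A request-specific identifier such as an internal job or order ID works equally well, provided it stays within the length limit and is not reused for a different request within the 24-hour expiry window.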
Request Fingerprinting
The gateway computes a SHA-256 fingerprint from method + path + raw body. If you reuse an idempotency key with different request content, the API returns 409 Conflict rather than replaying a mismatched response:
{
"detail": {
"error": "idempotency_conflict",
"message": "Idempotency-Key already used with a different request body"
}
}
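To see why a changed body triggers the conflict, it can help to compute a fingerprint locally. The sketch below hashes the same three inputs named above with SHA-256; the function name `request_fingerprint`, the newline separators, and the uppercasing of the method are assumptions, since the exact concatenation the gateway uses is not specified here.

```python
import hashlib


def request_fingerprint(method: str, path: str, raw_body: bytes) -> str:
    """SHA-256 over method, path, and raw body (illustrative sketch).

    Assumption: parts are joined with newline separators. The gateway's
    actual byte layout may differ, so these digests are only useful for
    comparing requests on the client side.
    """
    h = hashlib.sha256()
    h.update(method.upper().encode("utf-8"))
    h.update(b"\n")
    h.update(path.encode("utf-8"))
    h.update(b"\n")
    h.update(raw_body)  # raw bytes, not re-serialized JSON
    return h.hexdigest()


a = request_fingerprint("POST", "/api/v1/ext/kb/query", b'{"query": "x", "top_k": 5}')
b = request_fingerprint("POST", "/api/v1/ext/kb/query", b'{"query":"x","top_k":5}')
print(a == b)  # False: same JSON value, different raw bytes
```

If the fingerprint is indeed taken over the raw bytes, two bodies that are equal as JSON but differ in whitespace or key order would not match, so a retry should resend exactly the bytes of the original request.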
Graceful Degradation
If the database pool is unavailable or the idempotency lookup fails, the gateway proceeds with normal request processing rather than blocking. The BACH_IDEMPOTENCY_ENABLED environment variable can disable the feature entirely.
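The fail-open behavior described above can be sketched as a thin wrapper around the lookup. This is an illustration under stated assumptions, not the gateway's code: `lookup_or_none` and `idempotency_enabled` are hypothetical names, and both the default value and the set of strings that disable the feature via BACH_IDEMPOTENCY_ENABLED are guesses.

```python
import os
from typing import Callable, Optional

# Assumption: these values switch the feature off; the real parsing is not documented here.
_DISABLED_VALUES = {"0", "false", "no", "off"}


def idempotency_enabled() -> bool:
    """Read the feature flag, defaulting to enabled (assumed default)."""
    value = os.environ.get("BACH_IDEMPOTENCY_ENABLED", "true")
    return value.strip().lower() not in _DISABLED_VALUES


def lookup_or_none(lookup: Callable[[str], Optional[dict]], key: str) -> Optional[dict]:
    """Return a stored response, or None to signal normal processing."""
    if not idempotency_enabled():
        return None
    try:
        return lookup(key)
    except Exception:
        # Fail open: a broken idempotency store must not block the request.
        return None
```

The trade-off of failing open is that, while the store is unavailable, a retried request may be processed a second time; availability is preferred over strict deduplication.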
When to Use
- network timeouts where you are unsure whether the server processed the request
- retry loops for transient failures (5xx, connection reset)
- any POST endpoint that triggers LLM inference, to avoid duplicate cost
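A retry loop for the cases above might look like the following sketch. The important detail is that the key is generated once per logical operation and reused on every attempt. The `post_with_retries` function and the `send` callable are hypothetical; `send` stands in for whatever HTTP client you use, and the backoff values are arbitrary.

```python
import time
import uuid
from typing import Callable

# send(headers, body) -> (status_code, response_body); supplied by the caller.
Send = Callable[[dict, bytes], tuple[int, bytes]]


def post_with_retries(send: Send, body: bytes, max_attempts: int = 3,
                      backoff_seconds: float = 0.5) -> tuple[int, bytes]:
    """Retry on 5xx and connection errors, reusing one Idempotency-Key."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    # One key for the whole operation, so every retry maps to the same request.
    headers = {
        "Idempotency-Key": str(uuid.uuid4()),
        "Content-Type": "application/json",
    }
    last_error: Exception = RuntimeError("no attempt made")
    for attempt in range(max_attempts):
        if attempt:
            # Exponential backoff between attempts: 1x, 2x, 4x, ...
            time.sleep(backoff_seconds * 2 ** (attempt - 1))
        try:
            status, payload = send(headers, body)
        except (ConnectionError, TimeoutError) as exc:
            last_error = exc  # unknown outcome: safe to retry with the same key
            continue
        if status < 500:
            return status, payload  # success or a client error; do not retry
        last_error = RuntimeError(f"server error {status}")
    raise last_error
```

The same `body` bytes are passed on each attempt, which keeps the request content identical across retries and avoids a 409 Conflict.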