Model Cascade
Optimize performance and cost with tiered model selection.
Njira-AI supports a tiered model selection strategy, allowing you to choose the right balance of speed, cost, and intelligence for your policy evaluations.
Tiers
We offer three tiers of model capability. The specific models backing these tiers are updated regularly to the latest state-of-the-art versions.
| Tier | Description | Typical Use Case | Underlying Models (Example) |
|---|---|---|---|
| FAST | High speed, low latency, lower cost. | Real-time chat filtering, simple PII detection. | gpt-5-mini, claude-haiku-4.5 |
| STANDARD (Default) | Balanced performance and intelligence. | Complex policy logic, context-aware safety. | gpt-5.2, claude-sonnet-4.5 |
| STRONG | Maximum reasoning capability. | Nuanced threat detection, security auditing. | gpt-5.2-pro, claude-opus-4.5 |
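For reference, the lowercase tier identifiers used in the request examples below can be captured in a small type alias. This is an illustrative sketch only; the accepted string values are assumed to match the lowercase forms shown in the header and SDK examples on this page ("fast", "standard", "strong").

```python
from typing import Literal

# Tier identifiers as they appear in request headers and SDK metadata.
# Assumed lowercase, matching the examples on this page.
NjiraTier = Literal["fast", "standard", "strong"]

DEFAULT_TIER: NjiraTier = "standard"  # applied when no tier is specified
```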
Selecting a Tier
You can specify the desired tier for each request using a custom header or metadata.
Via Headers (Proxy Mode)
When using the Njira Gateway proxy, pass the X-Njira-Tier header:
```bash
curl https://gateway.njira.ai/v1/chat/completions \
  -H "Authorization: Bearer <your-key>" \
  -H "X-Njira-Tier: fast" \
  ...
```
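If you call the gateway from application code rather than the shell, the same header applies. The sketch below assumes the gateway accepts an OpenAI-compatible chat completions payload, as implied by the /v1/chat/completions path; the model and messages fields are placeholders, not part of this page.

```python
import requests

# Minimal sketch of a proxied request with a tier override.
# The request body shape (model/messages) is assumed to follow the
# OpenAI-compatible format implied by the /v1/chat/completions path.
response = requests.post(
    "https://gateway.njira.ai/v1/chat/completions",
    headers={
        "Authorization": "Bearer <your-key>",
        "X-Njira-Tier": "fast",  # route this request to the FAST tier
    },
    json={
        "model": "gpt-5-mini",  # placeholder model name
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=30,
)
response.raise_for_status()
```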
Via SDK
If you are using the internal SDK or integrating directly with the API, include the tier in the metadata dictionary:
```python
response = njira.audit(
    content="...",
    metadata={"tier": "strong"}
)
```
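In practice you might choose the tier per call based on the use cases in the table above, for example FAST for real-time chat filtering and STRONG for security auditing. The helper below is a hypothetical pattern, not part of the SDK:

```python
def select_tier(purpose: str) -> str:
    """Hypothetical helper mapping a use case to a tier, per the table above."""
    if purpose == "realtime_chat_filter":
        return "fast"      # high speed, low latency
    if purpose == "security_audit":
        return "strong"    # maximum reasoning capability
    return "standard"      # balanced default


response = njira.audit(
    content="...",
    metadata={"tier": select_tier("security_audit")},
)
```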
Default Behavior
If no tier is specified, the system defaults to STANDARD. This ensures a safe baseline for all traffic.
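Concretely, the two calls below are assumed to be equivalent, since an omitted tier resolves to STANDARD:

```python
# No tier specified: the system resolves this to STANDARD.
response_default = njira.audit(content="...")

# Explicitly requesting the same behavior.
response_standard = njira.audit(content="...", metadata={"tier": "standard"})
```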
Provider Specifics
The actual model used depends on the configured LLM provider for your organization.
- OpenAI: Maps to gpt-5-mini, gpt-5.2, gpt-5.2-pro.
- Anthropic: Maps to claude-haiku, claude-sonnet, claude-opus.
- Gemini: Maps to gemini-3-flash, gemini-3-pro, gemini-3-pro.
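The live mapping is maintained by your organization's configuration, but for illustration it can be thought of as a simple lookup table built from the lists above. The entries below mirror this page and are not guaranteed to stay current as underlying models are updated.

```python
# Illustrative tier -> model lookup per provider, mirroring the lists above.
# The actual mapping is maintained server-side and updated as models change.
TIER_MODEL_MAP = {
    "openai": {
        "fast": "gpt-5-mini",
        "standard": "gpt-5.2",
        "strong": "gpt-5.2-pro",
    },
    "anthropic": {
        "fast": "claude-haiku",
        "standard": "claude-sonnet",
        "strong": "claude-opus",
    },
    "gemini": {
        "fast": "gemini-3-flash",
        "standard": "gemini-3-pro",
        "strong": "gemini-3-pro",  # as listed above, STANDARD and STRONG share a model
    },
}
```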