Observability & Telemetry Manual
An implementation guide for capturing LLM spans, tool executions, and custom policy events.
Njira Tracing provides end-to-end observability for your AI agents. This guide details how to instrument your application to capture every LLM call, tool invocation, and routing decision in a structured format.
Span Taxonomy
The core SDK provides a span API. If you are using framework adapters (e.g., LangChain, CrewAI), the adapters automatically map their frameworks' internal events into Njira's structured spans:
| Span Type | Operational Purpose | Example Payload |
|---|---|---|
| `llm` | Auditing LLM inputs and outputs. | Chat completions, embeddings |
| `tool` | Auditing capability execution. | Web search, database queries |
| `chain` | Grouping multi-step reasoning workflows. | ReAct loops, complex workflows |
| `retriever` | Auditing RAG context injection. | Vector search results |
| `custom` | Tracking user-defined business logic. | Authentication checks, payment processing |
Implementation Reference (TypeScript)
When manually creating spans, wrap your execution in a try/catch block so the span is closed on success and the exception is recorded on the span before being re-thrown.
```typescript
const spanId = njira.trace.startSpan({
  name: "llm-call",
  type: "llm",
  input: { prompt },
  tags: { model: "gpt-5.2" },
});

try {
  const output = await callLLM(prompt);
  // Attach the output and token metrics before closing
  njira.trace.endSpan(spanId, { output, metrics: { tokens: 150 } });
} catch (err) {
  // CRITICAL: Record the exception before throwing upstream
  njira.trace.error(spanId, err as Error);
  throw err;
}
```
Implementation Reference (Python)
```python
span_id = njira.start_span(
    name="llm-call",
    span_type="llm",
    input_data={"prompt": prompt},
    tags={"model": "gpt-5.2"},
)

try:
    output = await call_llm(prompt)
    njira.end_span(span_id, output=output, metrics={"tokens": 150})
except Exception as e:
    njira.span_error(span_id, e)
    raise
```
Custom Telemetry Events
Use events for key milestones and debug breadcrumbs that don't warrant a full span (i.e., there is no duration worth measuring). Events are the primary operator tool for adding context to traces.
```typescript
njira.trace.event("policy_decision", {
  verdict: "allow",
  policy: "payments_guard",
  latency_ms: 12,
});
```
Common operator use cases:
- Logging manual shadow policy evaluations.
- Recording dynamic tool routing decisions ("Agent chose to use Web Search").
- Tagging user-visible rejection reasons ("Why was this user blocked?").
- Marking checkpoint states in long-running async chains.
Injecting Metrics
Attach numeric metrics to spans to populate dashboards and trigger alerts.
```typescript
njira.trace.endSpan(spanId, {
  output: result,
  metrics: {
    latency_ms: 245,
    input_tokens: 100,
    output_tokens: 50,
    cost_usd: 0.0015,
  },
});
```
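If your model provider does not report spend directly, a `cost_usd` metric like the one above has to be derived client-side from token counts. A minimal sketch, assuming per-1k-token pricing (the rates below are illustrative placeholders, not real prices):

```typescript
// Derives a cost_usd metric from token counts using per-1k-token rates.
// The default rates are illustrative examples only — substitute your
// provider's actual pricing.
function computeCostUsd(
  inputTokens: number,
  outputTokens: number,
  rates = { inputPer1k: 0.01, outputPer1k: 0.03 },
): number {
  return (
    (inputTokens / 1000) * rates.inputPer1k +
    (outputTokens / 1000) * rates.outputPer1k
  );
}

// e.g. computeCostUsd(100, 50) ≈ 0.0025 with the placeholder rates
```

Computing cost at span-close time (rather than in a dashboard afterwards) means alerts on `cost_usd` fire with the pricing that was in effect when the call was made.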
Operational Triage in the Console
Navigate to the Traces tab in the Njira Console to investigate agent behavior:
- Search by `request_id`, `user_id`, or time range to locate a specific session.
- Drill down into individual traces to view the parent/child span tree.
- Inspect payloads: view the exact inputs, outputs, and enforcement verdicts for each span.
- Replay: run a historical trace through a new draft policy version (see the Policy Management runbook).
Instrumentation Best Practices
- Connect your context: configure context propagation (see `context-propagation.md`), or your spans will appear orphaned during an incident.
- Use deterministic names: span names like `execute_stripe_charge` are searchable; `tool-1` is not.
- Tag liberally: attach `tenant_id` or `environment` tags to spans for easier aggregate filtering.
- Truncate extreme payloads: do not pass multi-megabyte base64 images into `input`; truncate them or pass metadata instead to avoid exceeding trace storage limits.
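The truncation guidance above can be enforced with a small guard applied before a value is attached to a span. A minimal sketch; the 8 KB default cap and the `[truncated …]` marker format are assumptions for illustration, not documented Njira limits:

```typescript
// Caps a payload string before it is attached to a span's input.
// maxBytes (8 KB here) is an illustrative default, not an Njira quota.
// Note: slice() counts UTF-16 code units, so the kept prefix is
// approximate for multi-byte text — acceptable for a trace payload guard.
function truncatePayload(value: string, maxBytes = 8 * 1024): string {
  const bytes = new TextEncoder().encode(value).length;
  if (bytes <= maxBytes) return value;
  return value.slice(0, maxBytes) + ` [truncated ${bytes - maxBytes} bytes]`;
}
```

Applying this at the instrumentation boundary (e.g., to every `input` field) keeps oversized blobs out of trace storage while preserving enough of the payload, plus the truncated byte count, to debug with.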