Policy Management

A complete runbook for authoring, validating, and rolling out NjiraAI policies safely.

Use this guide when you need to change policy behavior in production without breaking your agents.

We'll cover:

Policy structure (rules, scopes, and modes)
The policy lifecycle (Draft to Active)
Managing policies via the Console
Managing policies programmatically (file-based API deployment)
Validation (manual, batch, replay)
Rollout playbook (Shadow → Active)
Troubleshooting false positives

1) Policy structure

A Policy Pack is a versioned container for rules. It evaluates agent workflows, tool calls, or LLM outputs to decide how the request should be handled.

The possible verdict actions are:

ALLOW — The request is permitted.
BLOCK — The request is denied.
MODIFY — The request is safely patched or redacted, then forwarded.
REQUIRE_APPROVAL — The request is halted for human review.

Inside a Policy Version

A policy version contains configuration, defaults, and an ordered list of rules. Here is a practical example of a policy pack designed to guard a payment tool:

id: payments_guard
version: "1.0.0"
defaults:
  mode: shadow
rules:
  - id: block_large_transfers
    priority: 100
    scope:
      event_type: ["tool_call"]
      tools: ["bank_transfer"]
      env: ["prod"]
    detectors:
      - kind: pattern
        config: {"regex": "Transfer \\$[5-9][0-9],000"}
    decision:
      action: BLOCK
    evidence:
      reason_code: "EXCEEDS_LIMIT"
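
Before attaching a rule like this, it can help to check what the detector pattern actually matches. The sketch below is a minimal Python check of the regex from block_large_transfers. It assumes the pattern detector uses unanchored, search-style matching, which this guide does not confirm, and the sample inputs are illustrative.

```python
import re

# Regex copied from the block_large_transfers rule (the YAML "\\$" decodes to \$).
PATTERN = re.compile(r"Transfer \$[5-9][0-9],000")

def would_match(text: str) -> bool:
    """Return True if the detector pattern is found anywhere in the text.

    Assumption: the platform's pattern detector behaves like re.search.
    """
    return PATTERN.search(text) is not None

samples = [
    "Transfer $75,000 to account 12345678",
    "Transfer $500",
    "Transfer $100,000",
    "Transfer $75,500",
]
for sample in samples:
    print(f"{would_match(sample)!s:5}  {sample}")
```

As written, the pattern covers only round-thousand amounts from $50,000 to $99,000, so inputs such as "Transfer $100,000" or "Transfer $75,500" do not match. Cases like these are good candidates for a regression set.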

Scoping and blast radius

To prevent unintended blast radius, always scope rules tightly using the scope block:

event_type: Required. Bound the rule to a specific event phase (e.g., tool_call, llm_output, llm_request).
tools: Restrict the rule to specific capabilities (as shown above in bank_transfer).
env: Apply rules only to a specific environment (e.g., prod, staging).
workflows: Bind rules only to specific agent workflows (useful if multiple agents share the same tools).
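
One way to reason about blast radius is to model scope matching locally. The following is a rough Python sketch, not the platform's matcher: the event field names (tool, workflow) and the rule that an omitted scope key places no restriction are assumptions made for illustration.

```python
# Hypothetical mapping from scope keys (as in the YAML example) to event fields.
SCOPE_TO_EVENT_FIELD = {
    "event_type": "event_type",
    "tools": "tool",
    "env": "env",
    "workflows": "workflow",
}

def in_scope(scope: dict, event: dict) -> bool:
    """True when every key present in the scope lists the event's value.

    Assumption: a scope key that is omitted does not restrict that field.
    """
    return all(
        event.get(SCOPE_TO_EVENT_FIELD[key]) in allowed
        for key, allowed in scope.items()
    )

scope = {"event_type": ["tool_call"], "tools": ["bank_transfer"], "env": ["prod"]}
prod_transfer = {
    "event_type": "tool_call",
    "tool": "bank_transfer",
    "env": "prod",
    "workflow": "customer_support",
}
staging_transfer = {**prod_transfer, "env": "staging"}

print(in_scope(scope, prod_transfer))     # True
print(in_scope(scope, staging_transfer))  # False
```

Under these assumptions, a scope that lists only event_type would match every tool call in every environment, which is the kind of broad match that tight scoping is meant to avoid.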

Precedence

Rules are evaluated by descending priority. If multiple rules match an event, the most severe action wins:

Action precedence: BLOCK > REQUIRE_APPROVAL > MODIFY > ALLOW.
Priority: Within the same action class, higher priority numbers win.
Fallback: If no rules match the event, the action falls back to the pack default (commonly ALLOW).
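
The precedence rules above can be sketched in a few lines of Python. This is an illustration of the stated ordering, not the platform's evaluator; the rule ids other than block_large_transfers are hypothetical, and the behavior for two rules with the same action and the same priority is not specified here.

```python
# Severity ranking taken from the stated order: BLOCK > REQUIRE_APPROVAL > MODIFY > ALLOW.
SEVERITY = {"ALLOW": 0, "MODIFY": 1, "REQUIRE_APPROVAL": 2, "BLOCK": 3}

def resolve(matched_rules, default_action="ALLOW"):
    """Pick the winning (action, rule_id) among rules that matched an event.

    The most severe action wins; within the same action, the higher priority wins.
    With no matches, fall back to the pack default.
    """
    if not matched_rules:
        return default_action, None
    winner = max(matched_rules, key=lambda r: (SEVERITY[r["action"]], r["priority"]))
    return winner["action"], winner["id"]

matched = [
    {"id": "redact_account_numbers", "action": "MODIFY", "priority": 500},
    {"id": "review_new_payees", "action": "REQUIRE_APPROVAL", "priority": 200},
    {"id": "block_large_transfers", "action": "BLOCK", "priority": 100},
]
print(resolve(matched))  # ('BLOCK', 'block_large_transfers')
print(resolve([]))       # ('ALLOW', None)
```

Note that the BLOCK rule wins even though it has the lowest priority number: priority only breaks ties between rules with the same action.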

Enforcement modes

The mode determines how the platform handles the verdicts that policy evaluation produces. NjiraAI uses the following mode values in bindings and API payloads:

Shadow (shadow): The policy evaluates traffic and logs what it would have done (e.g., "would-have-blocked" or suggested patch), but always forwards the request unchanged. Use this for testing new rules on live traffic.
Active (active): The policy enforces BLOCK, MODIFY, and REQUIRE_APPROVAL verdicts inline, preventing the underlying request from reaching the target system if blocked.
Hybrid (hybrid): The Gateway enforces BLOCK and REQUIRE_APPROVAL verdicts, and only enforces MODIFY when the confidence threshold is met.

Terminology note: active is the canonical mode value to send in programmatic bindings. Enforce is acceptable shorthand for the behavior, but it is not a literal mode string.
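
To make the difference between the three modes concrete, here is a small Python sketch of which action would be applied inline for a given verdict. It rests on several assumptions: the threshold value 0.8 is hypothetical, "met" is read as greater than or equal, and a MODIFY verdict below the threshold in hybrid mode is assumed to be forwarded unchanged.

```python
def enforced_action(mode, verdict, confidence=0.0, modify_threshold=0.8):
    """Return the action applied inline; 'ALLOW' means forwarded unchanged.

    Assumptions: modify_threshold is a hypothetical value, and a sub-threshold
    MODIFY in hybrid mode is forwarded unchanged.
    """
    if mode == "shadow" or verdict == "ALLOW":
        return "ALLOW"  # shadow only logs what it would have done
    if mode == "active":
        return verdict
    if mode == "hybrid":
        if verdict == "MODIFY" and confidence < modify_threshold:
            return "ALLOW"
        return verdict
    raise ValueError(f"unknown mode: {mode!r}")

for mode in ("shadow", "active", "hybrid"):
    print(mode, enforced_action(mode, "MODIFY", confidence=0.5))
```

Passing "enforce" as the mode raises ValueError here, mirroring the note that it is shorthand rather than a literal mode string.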

2) The policy lifecycle

Policy versions in NjiraAI are immutable snapshots: once a version is created, its rules cannot be edited, so every change ships as a new version. The standard rollout workflow is:

Create the Policy Pack (the logical container, for example payments_guard).
Create a Draft Version containing your rules.
Validate the Draft via manual runs, batch simulations, or trace replays.
Roll Out to Shadow to observe live traffic without impact.
Promote to Active to begin enforcement.
Iterate and eventually deprecate old versions.

Note: New versions always start as draft. They will not affect traffic until activated.

3) Managing via the Console

For team members and operators, the Console provides a visual workflow:

Go to Policies in the NjiraAI Console.
Select an existing Policy Pack or click Create New.
Use the YAML editor to configure your rules and save as a new version.
Use the Playground to run simulation checks with sample prompts.
When ready, click Activate, then monitor traffic in your target mode.
Check the Traces tab periodically to catch false positives or misses.

4) Managing via the API (CI/CD)

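For programmatic automation, use the /v1/sdk/* endpoints to manage packs, versions, and simulations, and the environments API (section 4.3) to bind versions.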

All programmatic calls require your API key in the header (Authorization: Bearer <nj_live_*|nj_test_*>). These operations are scoped to the organization that owns the key.

4.1 Create a policy pack

First, create the container:

curl -s https://api.njira.ai/v1/sdk/policies \
  -H "Authorization: Bearer $NJIRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "payments_guard",
    "description": "Guard high-risk payment actions"
  }'

4.2 Create a draft version from file

To deploy policies cleanly from CI/CD, read the YAML file from your repository and let jq build the JSON payload, rather than manually escaping inline strings.

# Provide the file containing your policy body (id, version, rules)
CONTENT="$(cat policies/payments_guard/1.0.yaml)"

# Inject content safely into JSON payload
curl -X POST -s https://api.njira.ai/v1/sdk/policies/payments_guard/versions \
  -H "Authorization: Bearer $NJIRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -n --arg v "v1.0" --arg c "$CONTENT" '{version:$v, content:$c, message:"Initial transfer limits"}')"
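
If your pipeline runs Python rather than shell, the same idea applies: let a JSON serializer do the escaping. The sketch below only builds the request body (it sends nothing), and the inline YAML string stands in for the contents of the policy file.

```python
import json

def build_version_payload(content: str, version: str, message: str) -> str:
    """Serialize the request body; json.dumps escapes quotes, backslashes, and newlines."""
    return json.dumps({"version": version, "content": content, "message": message})

# Inline stand-in for the text you would read from policies/payments_guard/1.0.yaml.
content = r'''id: payments_guard
version: "1.0.0"
rules:
  - id: block_large_transfers
    detectors:
      - kind: pattern
        config: {"regex": "Transfer \\$[5-9][0-9],000"}
'''

payload = build_version_payload(content, "v1.0", "Initial transfer limits")

# Round-tripping the payload recovers the YAML text exactly.
assert json.loads(payload)["content"] == content
print(payload)
```

The policy text contains double quotes, backslashes, and newlines, all of which would need manual escaping if the JSON were assembled by string concatenation.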

4.3 Policy bindings (Activation)

To make a version live, you bind it to a specific environment (and optionally, a specific workflow or tool) using the environments API. This is where you declare your mode (shadow, active, or hybrid).

curl -X PUT -s https://api.njira.ai/v1/environments/prod/attachments \
  -H "Authorization: Bearer $NJIRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "packId": "payments_guard",
    "version": "v1.0",
    "mode": "shadow",
    "targets": {"workflowId": "customer_support"}
  }'

To promote or roll back, send the same payload with only the changed field updated: set mode to active to promote, or set version to the previous known-good version to roll back.
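
A small helper can keep binding payloads consistent across shadow, promote, and rollback steps. This Python sketch reuses the field names from the curl example; the helper itself and the version string "v0.9" are hypothetical, and it builds payloads only, without calling the API.

```python
VALID_MODES = {"shadow", "active", "hybrid"}

def binding_payload(pack_id, version, mode, workflow_id=None):
    """Build a binding body, rejecting mode strings the guide does not list."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    payload = {"packId": pack_id, "version": version, "mode": mode}
    if workflow_id is not None:
        payload["targets"] = {"workflowId": workflow_id}
    return payload

shadow = binding_payload("payments_guard", "v1.0", "shadow", "customer_support")
promote = {**shadow, "mode": "active"}       # same payload, new mode
rollback = {**promote, "version": "v0.9"}    # "v0.9" is a hypothetical previous version

for step in (shadow, promote, rollback):
    print(step)
```

Validating the mode up front catches the common mistake of sending "enforce", which is shorthand for the behavior rather than a literal mode value.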

5) Validation workflows

NjiraAI provides endpoints to validate your draft policies safely without affecting live traffic.

5.1 Manual simulation (single case)

Run an immediate check against a proposed rule:

curl -s https://api.njira.ai/v1/sdk/simulation/manual \
  -H "Authorization: Bearer $NJIRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "inputText": "Transfer $75,000 to account 12345678",
    "toolName": "bank_transfer",
    "policyId": "payments_guard",
    "policyVersion": "v1.0"
  }'

5.2 Batch simulation (regression set)

Send up to 50 items per request to verify that your policy handles edge cases without blocking known-safe requests:

curl -s https://api.njira.ai/v1/sdk/simulation/batch \
  -H "Authorization: Bearer $NJIRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "policyId": "payments_guard",
    "policyVersion": "v1.0",
    "items": [
      { "inputText": "Transfer $500", "toolName": "bank_transfer" },
      { "inputText": "Transfer $75,000", "toolName": "bank_transfer" }
    ]
  }'
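
When a regression set grows past 50 cases, it has to be split across several requests. The Python sketch below shows one way to chunk it, assuming the 50-item limit applies per request; it builds request bodies only and the generated inputs are placeholders.

```python
MAX_ITEMS_PER_BATCH = 50  # limit stated above; assumed to apply per request

def batch_payloads(policy_id, policy_version, items, size=MAX_ITEMS_PER_BATCH):
    """Yield one request body per chunk of at most `size` items."""
    for start in range(0, len(items), size):
        yield {
            "policyId": policy_id,
            "policyVersion": policy_version,
            "items": items[start:start + size],
        }

# Placeholder regression set with 120 cases.
items = [
    {"inputText": f"Transfer ${n}", "toolName": "bank_transfer"}
    for n in range(1, 121)
]
payloads = list(batch_payloads("payments_guard", "v1.0", items))
print([len(p["items"]) for p in payloads])  # [50, 50, 20]
```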

5.3 Trace replay

You can replay a historical production trace against a new draft to see what would have happened. Find traceId values in your Console under the Traces tab.

Replay evaluates the trace against the specified version of the trace's original policy pack.

# TRACE_ID holds a traceId copied from the Traces tab
curl -X POST -s "https://api.njira.ai/v1/sdk/traces/${TRACE_ID}/replay" \
  -H "Authorization: Bearer $NJIRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "policyId": "payments_guard",
    "policyVersion": "v1.0"
  }'

6) Rollout playbook: Shadow → Active

When shipping a new policy or a significant change to existing rules, use this sequence to prevent incidents.

Test in Simulation: Run your new rules against batch regressions and manual tests.
Deploy to Shadow: Use the UI or API bindings to attach the new version to prod with mode shadow.
Observe: Leave the policy in shadow for a full business cycle, or until it has seen enough traffic to be representative.
Review Metrics: Filter your trace logs for the new version and observe the would-have-blocked rate. Check for anomalous spikes.
Promote to Active: If the metrics look good, re-bind the policy in prod to active mode and monitor live enforcement.
Rollback: If you discover a critical issue after enforcing, immediately re-bind the previous known-good version to prod.

What to monitor during rollout

Keep an eye on these operational signals when promoting a policy:

Overall BLOCK rate, and BLOCK rate segmented per tool.
Top fired reason codes (are you seeing the block reasons you expect?).
p95 latency added by policy evaluation.
would-have-blocked trace fields in shadow mode versus actual live errors.
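
If you export trace records for offline review, these signals are straightforward to compute. The sketch below is a minimal Python example; the record fields (tool, action, reason_code, policy_latency_ms) and the lookup_balance tool are hypothetical and will need to be mapped to whatever your trace export actually contains.

```python
import math
from collections import Counter

def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]

def rollout_signals(traces):
    """Summarize block rates, top reason codes, and p95 policy latency."""
    blocked = [t for t in traces if t["action"] == "BLOCK"]
    per_tool_total = Counter(t["tool"] for t in traces)
    per_tool_blocked = Counter(t["tool"] for t in blocked)
    return {
        "block_rate": len(blocked) / len(traces),
        "block_rate_by_tool": {
            tool: per_tool_blocked[tool] / count
            for tool, count in per_tool_total.items()
        },
        "top_reason_codes": Counter(t["reason_code"] for t in blocked).most_common(3),
        "p95_latency_ms": p95([t["policy_latency_ms"] for t in traces]),
    }

# Hypothetical trace records.
traces = [
    {"tool": "bank_transfer", "action": "BLOCK", "reason_code": "EXCEEDS_LIMIT", "policy_latency_ms": 12},
    {"tool": "bank_transfer", "action": "ALLOW", "reason_code": None, "policy_latency_ms": 8},
    {"tool": "lookup_balance", "action": "ALLOW", "reason_code": None, "policy_latency_ms": 5},
    {"tool": "bank_transfer", "action": "BLOCK", "reason_code": "EXCEEDS_LIMIT", "policy_latency_ms": 40},
]
print(rollout_signals(traces))
```

Running the same summary on shadow-mode traces before promotion and on enforced traffic afterwards gives a like-for-like comparison.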

7) Troubleshooting: False positive triage

When a safe request gets blocked (a false positive), execute this standard triage runbook:

Verify scope: Check the trace to confirm whether the rule fired for a toolName or event_type it was never meant to cover. Tightening the scope is the safest fix.
Loosen the detector: If a regex is catching safe terms, or an LLM classifier threshold is too sensitive, adjust the detector config.
Change the action: If you're unsure whether a pattern is truly harmful, downgrade the rule action from BLOCK to REQUIRE_APPROVAL or MODIFY.
Update regression set: Add the false-positive payload to your batch simulation regression set so the same request isn't blocked again.
Redeploy: Cut a new draft version, validate it, and roll it out.
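
The regression step of this runbook can be sketched as a local check that compares expected actions with the actions a draft produced. In this Python example, the expected field is a local annotation (not part of the API payload, so strip it before sending), the third case is an invented false positive, and the list of actual actions stands in for simulation results whose response format this guide does not describe.

```python
def find_regressions(cases, actual_actions):
    """Return the cases whose simulated action differs from the expected one."""
    if len(cases) != len(actual_actions):
        raise ValueError("expected exactly one action per case")
    return [
        {**case, "actual": actual}
        for case, actual in zip(cases, actual_actions)
        if actual != case["expected"]
    ]

regression_set = [
    {"inputText": "Transfer $500", "toolName": "bank_transfer", "expected": "ALLOW"},
    {"inputText": "Transfer $75,000", "toolName": "bank_transfer", "expected": "BLOCK"},
]

# Newly triaged false positive (illustrative): a question, not a transfer request.
regression_set.append({
    "inputText": "Explain why Transfer $60,000 was declined",
    "toolName": "bank_transfer",
    "expected": "ALLOW",
})

# Stand-in for the actions returned by a batch simulation of the draft.
actual_actions = ["ALLOW", "BLOCK", "BLOCK"]

for failure in find_regressions(regression_set, actual_actions):
    print(f"{failure['inputText']!r}: expected {failure['expected']}, got {failure['actual']}")
```

An empty result means the draft handles every recorded case as intended, including the false positive that triggered the triage.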