"Do you sit between us and the model provider?"

"Only as policy and audit. BYO-key is standard — your request goes to your provider under your key. We don't meter or resell tokens."

"Which models can it route to?"

"Anthropic, OpenAI, Mistral and locally-run models. The gateway abstracts them, so you swap vendors by config, not by re-integration."

"How is DLP enforced?"

"In policy code at the boundary. PII, secrets and regulated data are checked before a request reaches any model; violations are denied and logged."

"What exactly is logged?"

"Every prompt, response, tool call, model choice and policy decision — the audit record regulators ask for. You control retention."

"Can we run it fully offline?"

"Yes. On-prem and air-gapped deployments run against locally-hosted models with no external connectivity."

"Can we take it in-house?"

"Yes. Config and policies are code; the clone handover gives your team a running gateway from day 30."

Back to Products

PRODUCTS · NOVA9 AGENTOS

Every model request, through one control point you own.

TL;DR. The LLM Gateway sits in front of every model your agents and people use. It routes by intent, enforces DLP and policy in code, caps budgets, scores quality with reinforcement learning, and logs every token. Multi-LLM, BYO-key, EU-sovereign or fully on-prem.

Multi-LLM routing · Starlark DLP policy · Every token audited · BYO-key


What this is about

The moment AI enters an organisation, a question follows: which model saw what, under whose key, at what cost, and can you prove it? Without a control point, the answer is "we don't know" — which is unacceptable in a regulated business. The LLM Gateway is that control point. Every request from every agent, copilot or integration passes through it, governed by policy you can read and audit.

How we run it

The gateway runs in your EU cloud or on-prem. Clients connect with a bearer token. For each request the gateway picks a model by intent, cost and availability, applies DLP and policy in code, enforces per-tenant and per-squad budgets, and falls back across models on failure. Reinforcement-learning scoring tracks output quality so routing improves over time. Provider keys are managed centrally, so none are scattered across repos or workflows. Every prompt, response, tool call and model choice is logged for audit.
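The routing decision described above can be sketched as a small selection function. This is a minimal Python illustration under stated assumptions, not the gateway's implementation: the `ModelInfo` fields, the intent labels, the prices and the model names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    name: str
    intents: frozenset          # intents this model is considered suitable for (assumed labels)
    cost_per_1k_tokens: float   # illustrative price, not a real tariff
    available: bool             # e.g. the result of a health check


def select_model(intent, candidates):
    """Return the cheapest available model that supports the intent, or None."""
    eligible = [m for m in candidates if m.available and intent in m.intents]
    if not eligible:
        return None
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)


CATALOGUE = [
    ModelInfo("large-hosted", frozenset({"code", "analysis"}), 3.0, True),
    ModelInfo("small-hosted", frozenset({"chat", "summarise"}), 0.5, True),
    ModelInfo("local-llama", frozenset({"chat", "summarise", "code"}), 0.1, False),
]

choice = select_model("code", CATALOGUE)
print(choice.name if choice else "no model available")
```

In this sketch an unavailable model is never chosen even if it is the cheapest, and an intent that no model supports yields `None`, which a caller could turn into a denial or a fallback.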

When it fits

Any organisation running AI in production that needs governance: which model, which data, which cost, provable. Finance, healthcare, public sector and KRITIS operators where "we don't know which model generated this" is not an answer. Teams running multiple agents and copilots that need one policy plane over all of them.

What we don't do

We don't sit between you and your model provider as a paid router — BYO-key is standard, you pay the provider directly. We don't persist your prompts beyond the audit log you control. We don't lock model choice to one vendor — the gateway abstracts them.

POLICY AS CODE

Your AI rules, versioned and diffable.

DLP, budgets, model whitelists, role gates — all expressed in code, not a console. One source of truth, auditable, and the rule travels with the request.

DLP at the boundary

Block PII, secrets and regulated data before they ever reach a model.

Budget caps

Per tenant, per squad, per day. Overflow routes to a cheaper fallback, not a surprise bill.

Model whitelist

Decide which models are allowed for which workloads. Swap vendors by config.

Role gates

Some requests require a role (e.g. DPO approval for PII). Enforced, logged, provable.

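# LLM gateway policy (Starlark)
def on_request(req):
    # Role gate: PII may pass only with DPO approval.
    if req.contains_pii() and not req.user.has_role("dpo"):
        return deny("PII without DPO approval")
    # Budget cap: over the tenant's daily remainder, route to the fallback.
    if req.tokens > budget.daily_remaining(req.tenant):
        return route("fallback-model")
    # Model whitelist: reject models the tenant has not approved.
    if req.model not in policy.whitelist(req.tenant):
        return deny("model not permitted for tenant")
    return allow()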
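
Because Starlark is syntactically close to Python, a policy like the one above can be exercised in plain Python against stubs before it is deployed. The sketch below is an assumption about how such a test might look: `deny`, `route`, `allow`, `budget` and `policy` are hand-written stand-ins for whatever the gateway actually provides, and the tenant and model names are invented.

```python
from types import SimpleNamespace


# Stand-ins for the built-ins the policy expects (assumed shapes, not the gateway's API).
def deny(reason):
    return ("deny", reason)


def route(model):
    return ("route", model)


def allow():
    return ("allow", None)


budget = SimpleNamespace(daily_remaining=lambda tenant: {"acme": 1000}.get(tenant, 0))
policy = SimpleNamespace(whitelist=lambda tenant: {"acme": ["model-a", "model-b"]}.get(tenant, []))


def on_request(req):
    # Same rules as the Starlark policy above.
    if req.contains_pii() and not req.user.has_role("dpo"):
        return deny("PII without DPO approval")
    if req.tokens > budget.daily_remaining(req.tenant):
        return route("fallback-model")
    if req.model not in policy.whitelist(req.tenant):
        return deny("model not permitted for tenant")
    return allow()


def make_request(pii=False, roles=(), tokens=100, tenant="acme", model="model-a"):
    """Build a fake request exposing only the attributes the policy reads."""
    user = SimpleNamespace(has_role=lambda role: role in roles)
    return SimpleNamespace(
        contains_pii=lambda: pii, user=user, tokens=tokens, tenant=tenant, model=model
    )


print(on_request(make_request()))
print(on_request(make_request(pii=True)))
print(on_request(make_request(pii=True, roles=("dpo",))))
print(on_request(make_request(tokens=5000)))
print(on_request(make_request(model="model-z")))
```

Each call covers one branch of the policy, so a change to rule order or wording shows up as a changed decision in review.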

Concrete Deliverables

What you can hand off

Gateway deployment

In your EU cloud or on-prem. Clients connect by bearer token; no keys in repos.

Policy as code

DLP, budgets, model whitelist and role gates expressed in Starlark, diffable and auditable.

Multi-LLM routing & fallback

Intent-based routing across Anthropic, OpenAI, Mistral and locally-run models, with fallback chains.

RL quality scoring

Reinforcement-learning scoring of outputs so routing improves and weak paths are flagged.

Full audit log

Every prompt, response, tool call and model choice logged: the record an auditor asks for.

Clone-ready config

Gateway config and policies as code, yours to take in-house from day 30.
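
The RL quality scoring deliverable does not name an algorithm on this page. As a rough illustration of the idea only, the sketch below uses an epsilon-greedy bandit, one of the simplest reinforcement-learning schemes: it keeps a running mean quality score per model, usually routes to the best one, and flags models whose mean drops below a cut-off. The class name, the 0.0 to 1.0 score scale and the threshold are assumptions made for the example.

```python
import random


class QualityRouter:
    """Epsilon-greedy bandit: track a running mean quality score per model."""

    def __init__(self, models, epsilon=0.1, weak_threshold=0.5, seed=None):
        self.epsilon = epsilon
        self.weak_threshold = weak_threshold   # hypothetical cut-off for flagging
        self.counts = {m: 0 for m in models}
        self.means = {m: 0.0 for m in models}
        self.rng = random.Random(seed)

    def choose(self):
        """Explore with probability epsilon, otherwise pick the best-scoring model."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.means))
        return max(self.means, key=self.means.get)

    def record(self, model, score):
        """Fold a new quality score (0.0 to 1.0) into the model's running mean."""
        self.counts[model] += 1
        self.means[model] += (score - self.means[model]) / self.counts[model]

    def weak_paths(self):
        """Models that have been scored and whose mean falls below the threshold."""
        return [m for m, n in self.counts.items()
                if n > 0 and self.means[m] < self.weak_threshold]


router = QualityRouter(["model-a", "model-b"], epsilon=0.0)
for score in (0.9, 0.8):
    router.record("model-a", score)
for score in (0.3, 0.4):
    router.record("model-b", score)

print(router.choose())       # model with the best mean so far
print(router.weak_paths())   # scored models below the threshold
```

With `epsilon=0.0` the router never explores, which keeps the example deterministic; a non-zero value lets lower-ranked models be retried occasionally so their scores can recover.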

A REQUEST, GOVERNED

What happens to one model request.

From client call to audited response — every decision the gateway makes is policy-driven and logged.

Governed Request

Step 1 Client (agent/copilot) calls the gateway with a bearer token.
Step 2 DLP scan: PII, secrets and regulated data checked against policy.
Step 3 Budget check against tenant/squad daily remaining; overflow routes to fallback.
Step 4 Model selected by intent, cost and availability from the tenant whitelist.
Step 5 Request sent under your key (BYO-key); response scored by the RL layer.
Step 6 Prompt, response, tool calls, model choice and policy decisions written to the audit log.
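
The steps above can be sketched end to end in a few lines of Python. This is a toy model under stated assumptions, not the gateway's code: the two regex detectors, the whitespace-based token estimate, the model names and the shape of the audit record are all invented for illustration, and model selection is reduced to a primary or fallback choice.

```python
import json
import re
from datetime import datetime, timezone

# Toy detectors; real DLP needs far broader coverage than two regexes.
DLP_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "secret": re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),
}


def dlp_findings(prompt):
    """Return the names of all patterns that match the prompt."""
    return [name for name, pattern in DLP_PATTERNS.items() if pattern.search(prompt)]


def handle(prompt, tenant, remaining_tokens, call_model, audit_log):
    """Run one request through DLP, budget check, model call and audit logging."""
    record = {"ts": datetime.now(timezone.utc).isoformat(), "tenant": tenant, "prompt": prompt}
    findings = dlp_findings(prompt)
    if findings:
        record.update(decision="deny", reason="dlp:" + ",".join(findings))
    else:
        # Crude token estimate (whitespace split), purely for illustration.
        over_budget = len(prompt.split()) > remaining_tokens
        model = "fallback-model" if over_budget else "primary-model"
        record.update(decision="allow", model=model, response=call_model(model, prompt))
    audit_log.append(json.dumps(record))   # every outcome is logged, including denials
    return record


log = []
echo = lambda model, prompt: f"[{model}] ok"   # stand-in for the provider call
print(handle("Summarise the Q3 report", "acme", 1000, echo, log)["decision"])
print(handle("Mail jane.doe@example.com the key", "acme", 1000, echo, log)["reason"])
```

The point of the structure is that the audit write sits on the single exit path, so a denied request leaves a record just as an allowed one does.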

Fallback Chain

Step 1 Primary model times out or errors.
Step 2 Gateway retries on the next model in the tenant's fallback chain.
Step 3 Degradation logged; quality delta tracked for the routing model.
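
A fallback chain like the one above can be sketched as a loop over an ordered list of models. The example is a minimal Python illustration, assuming a hypothetical `ModelError` exception for timeouts and provider errors and invented model names; how the gateway actually detects failure is not specified here.

```python
class ModelError(Exception):
    """Raised by a provider call that timed out or failed (hypothetical)."""


def call_with_fallback(chain, prompt, call_model, degradations):
    """Try each model in order; record every failure; return (model, response)."""
    for position, model in enumerate(chain):
        try:
            response = call_model(model, prompt)
        except ModelError as exc:
            degradations.append({"model": model, "position": position, "error": str(exc)})
            continue
        return model, response
    raise ModelError("all models in the fallback chain failed")


def flaky_provider(model, prompt):
    # Simulate an outage on the primary model only.
    if model == "primary-model":
        raise ModelError("timeout")
    return f"{model} answered"


events = []
used, answer = call_with_fallback(
    ["primary-model", "secondary-model", "local-model"], "hello", flaky_provider, events
)
print(used, answer, events)
```

The `degradations` list plays the role of the degradation log in Step 3: each entry says which model failed, where it sat in the chain, and why.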

Product facts

Date: 2026-05-27 · Source: dynexo Operations
Models	Anthropic, OpenAI, Mistral, locally-run (Llama family, GPT-OSS family)
API keys	BYO-key standard · you pay the provider directly
Policy	Starlark — DLP, budgets, whitelist, role gates · diffable
Routing	Intent-based + RL quality scoring + fallback chains
Audit	Every token, response, tool call and model choice logged
Deployment	EU cloud, on-premise or air-gapped
Persistence	Logs for audit; prompts not retained beyond your log
Clone handover	Config and policies as code, from day 30

Asked often

Asked before the briefing

Do you sit between us and the model provider?

Only as policy and audit. BYO-key is standard: your request goes to your provider under your key. We don't meter or resell tokens.

Which models can it route to?

Anthropic, OpenAI, Mistral and locally-run models. The gateway abstracts them, so you swap vendors by config, not by re-integration.

How is DLP enforced?

In policy code at the boundary. PII, secrets and regulated data are checked before a request reaches any model; violations are denied and logged.

What exactly is logged?

Every prompt, response, tool call, model choice and policy decision: the audit record regulators ask for. You control retention.

Can we run it fully offline?

Yes. On-prem and air-gapped deployments run against locally-hosted models with no external connectivity.

Can we take it in-house?

Yes. Config and policies are code; the clone handover gives your team a running gateway from day 30.

Next step

One control point in front of every model.

We deploy the gateway against your stack, write a first policy set, and show the audit log an auditor would actually accept.
