v0.2.0 · Apache 2.0
LLM agents, reconciled.
The Kubernetes operator for controlling LLM agents at scale.
Every guardrail enforced at the infrastructure level.
```yaml
# research-analyst.yaml
spec:
  model: claude-sonnet-4-20250514
  prompt:
    inline: "You are a senior research analyst."
  tools:
    mcp:
      - name: web-search
        url: https://search.mcp.internal/sse
        auth:
          bearer:
            secretKeyRef: { name: mcp-tokens, key: search }
  guardrails:
    tools:
      allow: ["web-search/*"]
      deny: ["shell/*"]
    budgetRef:
      name: monthly-10k
  runtime:
    replicas: 3
  observability:
    healthCheck:
      type: semantic
      prompt: "Reply OK if ready."
```

AI agents are everywhere. Control over them is nowhere.
- Teams burn through LLM budgets with no enforcement. You find out when the invoice arrives.
- Agents call arbitrary tools, access any API, and produce unvalidated output. No audit trail.
- API keys in env vars, no namespace boundaries, no network policies. Every agent is a blast radius.
- Every team deploys agents differently. No unified health checks, scaling, or incident response.
Full control. Every layer.
Ten concerns your platform team needs to manage. All declarative. All enforced by the operator.
- Define (SwarmAgent): Agents as Kubernetes resources. RBAC, GitOps and namespace isolation built in.
- Connect (SwarmAgent): MCP servers with auth enforcement. Dynamic tool discovery at runtime.
- Orchestrate (SwarmTeam): Pipeline DAGs with validation gates. Output verified before it flows downstream.
- Govern (SwarmPolicy): Audit, warn, or enforce. Test policies before hard-enforcing. No agent can opt out.
- Control cost (SwarmBudget): Rolling budgets with warning thresholds. Hard stop at admission. Thinking token caps per call.
- Remember (SwarmMemory): Persistent memory across runs. Redis, Qdrant, or pgvector. Agents learn, not just execute.
- Optimize (SwarmAgent): Tool result sandboxing cuts tokens by 53%. Context compression when the window fills.
- Scale (SwarmAgent): Demand-driven scaling on queue depth. Scale to zero when idle.
- Audit (SwarmRun): Causal chain tracing with W3C context. Structured JSON, OTel-native, redaction rules.
- Extend (SwarmRegistry): gRPC plugin escape hatches. Bring your own LLM provider or queue backend.

Three orchestration modes. One resource.
Pipeline DAGs for deterministic chains, routed dispatch when an LLM picks the specialist, and dynamic delegation when agents call each other at runtime. All three are declared as a SwarmTeam in YAML.
Deterministic chains with validation gates
Ordered steps with dependsOn.
Parallel branches, quality gates and a revision loop - all declared in YAML.
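As a minimal sketch of how such a pipeline could be declared: apart from `dependsOn`, which the text names explicitly, the step and validation field names below are illustrative assumptions, not confirmed CRD schema.

```yaml
# Hypothetical pipeline-mode SwarmTeam (field names beyond dependsOn are assumptions)
apiVersion: kubeswarm.io/v1alpha1
kind: SwarmTeam
metadata:
  name: research-pipeline
spec:
  mode: pipeline
  steps:
    - name: gather
      agentRef: { name: research-analyst }
    - name: draft
      agentRef: { name: writer }
      dependsOn: [gather]
    - name: review                     # quality gate before output flows downstream
      agentRef: { name: editor }
      dependsOn: [draft]
      validation:
        type: semantic
        onFail: { reviseStep: draft }  # revision loop back to the draft step
```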
One request, one specialist
A router LLM classifies each request and picks the single best agent. One hop. Like a load balancer with brains.
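A hedged sketch of how routed dispatch might be declared; the `mode`, `router`, and `agents` field names are assumptions based on the description above, not confirmed CRD fields.

```yaml
# Hypothetical routed-mode SwarmTeam (field names are assumptions)
apiVersion: kubeswarm.io/v1alpha1
kind: SwarmTeam
metadata:
  name: support-router
spec:
  mode: routed
  router:
    model: claude-sonnet-4-20250514    # the classifying LLM
  agents:
    - name: billing-agent
      description: "Invoices, refunds, payment disputes"
    - name: tech-agent
      description: "API errors, SDK and integration issues"
```

The agent descriptions double as the router's classification labels: one request in, one specialist out.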
Agents delegate in chains
No predefined order. Each agent decides at runtime who to call next. Multi-hop delegation - the path emerges from the task.
Need a single entrypoint?
One agent becomes the front door for external requests. It wraps any of the three modes above behind a single endpoint. Optional - add it when your organization needs one URL for the swarm.
# SwarmAgent as gateway entrypoint
spec:
gateway:
registryRef: { name: platform-registry }
dispatchMode: enabled
maxDispatchDepth: 3
fallback: { mode: answer-directly }Built for compliance, not demos.
Agent frameworks give you building blocks. kubeswarm gives your organization control.
Security & governance
- Network policies auto-generated: Every agent pod gets a NetworkPolicy scoped to its declared MCP servers and queue backend. No manual YAML.
- Pod security hardened: Non-root user, read-only root filesystem, all capabilities dropped, RuntimeDefault seccomp profile. Matches CIS benchmarks.
- Tool allow/deny with trust levels: Glob patterns on server/tool paths. Trust levels (internal, external, sandbox) control validation depth.
- Prompt injection defense: Pipeline step outputs are wrapped in structural delimiters. Downstream agents are instructed to treat them as untrusted data.
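A sketch of how tool permissions with trust levels might look on an agent; only `allow` and `deny` appear in the example spec at the top of the page, so the `trustLevels` key and its syntax are assumptions.

```yaml
# Hypothetical guardrails block - the trustLevels syntax is assumed
guardrails:
  tools:
    allow: ["web-search/*", "internal-docs/*"]
    deny: ["shell/*"]
    trustLevels:
      web-search: external      # deeper output validation
      internal-docs: internal   # lighter validation
```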
Operations & cost
- Set a budget. Kubernetes enforces it: Rolling daily, weekly, or monthly budgets with configurable warning thresholds. Scoped by namespace, team, or label. Hard stop at admission - not a dashboard alert, an API-level block.
- Circuit breakers and retry budgets: After 5 consecutive LLM failures the circuit opens. Cooldown, then half-open probes. Retries have a per-task cap. No infinite loops.
- Audit trail with causal chain: Every tool call, delegation and LLM turn is recorded with parentEventID linking. Structured JSON, OTel-native, redaction rules included.
- OTel-native observability: Traces and metrics export via OpenTelemetry. W3C TraceContext propagation across queue boundaries. Works with Jaeger, Grafana, and Datadog out of the box.
- 53% fewer tokens per task: Tool result sandboxing replaces large outputs with compact digests. Context compression auto-summarizes when the window fills. Measured 53% reduction on 7KB tool results.
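To make the redaction claim concrete, a rule set might be configured along these lines; every field name here is illustrative, since the page only states that redaction rules are included.

```yaml
# Hypothetical audit/redaction config - all field names are assumptions
observability:
  audit:
    exporter: otlp                       # OTel-native export
    redact:
      - jsonPath: "$.request.headers.authorization"
      - pattern: "sk-[A-Za-z0-9]{20,}"   # provider API keys in payloads
```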
```yaml
# Auto-generated by the operator
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: research-agent-egress
spec:
  podSelector:
    matchLabels:
      kubeswarm.io/agent: research-agent
  egress:
    - to: [{ podSelector: { matchLabels: { app: mcp-search } } }]
      ports: [{ port: 8080 }]
```

```yaml
# SwarmBudget - hard stop at $10/day
apiVersion: kubeswarm.io/v1alpha1
kind: SwarmBudget
metadata:
  name: daily-cap
spec:
  period: daily
  limit: 10
  currency: USD
  warnAt: 80
  hardStop: true
```

Semantic health checks prompt the model and evaluate the response - not just HTTP 200. The operator knows when an agent is loaded but broken. Falls back to ping mode for cost-sensitive deployments.
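A sketch of how that fallback might appear in the agent spec; only `type` and `prompt` are shown in the example at the top of the page, so the `fallbackMode` field name is an assumption.

```yaml
# Hypothetical - the fallbackMode field name is assumed, not confirmed
observability:
  healthCheck:
    type: semantic
    prompt: "Reply OK if ready."
    fallbackMode: ping   # cheaper liveness probe when LLM calls are rationed
```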
Runs with your stack
Kubernetes-native. No vendor lock-in.
Use the runtime, model providers, memory stores and operators your platform team already trusts.
Framework vs. infrastructure.
Frameworks help you build agents. kubeswarm helps your organization control them.
| What your org needs | Agent frameworks | kubeswarm |
|---|---|---|
| Governance | | |
| Agents as managed resources | Objects in application code | RBAC, GitOps, namespace isolation |
| Spend enforcement | Left to application code | Hard stop at admission - before tokens are spent |
| Namespace-wide policy | Varies by framework | Budgets, model restrictions, mandatory audit |
| Security | | |
| Tool permissions | Per-call code review | Allow/deny lists with trust levels |
| Output validation | Library helpers, opt-in | Regex, schema, semantic, injection detection |
| Audit trail | Print statements | Full trace with automatic redaction |
| Operations | | |
| Multi-mode orchestration | DAG in code; routed/dynamic hand-rolled | Pipeline, routed, dynamic - all in YAML |
| Agent discovery | Hardcoded handoffs | Auto-indexed registry, runtime resolution |
| Demand-driven scaling | Left to application code | Queue-depth scaling, scale to zero |
| Portability | | |
| Vendor lock-in | Framework SDK required in every agent | None. Delete the operator, agents keep running. |
Operator footprint and scope.
The questions your platform team will ask before installing anything new in their cluster.
What it installs
- Single Deployment, leader-elected for HA.
- Default: 100m / 128Mi request, 500m / 512Mi limit.
- If the operator goes down, agents keep serving from their last reconciled state.
- helm uninstall removes the operator. Agent pods stay running.
What it is not
- Not an inference server. Bring your own provider.
- Not a model gateway or router proxy.
- Not a vector database. Wires existing ones.
- Not an agent framework. Your code stays in your container.
- Not a prompt library or eval harness.
v1alpha1 CRDs. Pre-1.0. Apache 2.0. Conversion webhooks ship with every version bump. Get in touch if you're running LLM workloads on Kubernetes.
Your agents. Your cluster.
Your rules.
Open source. Apache 2.0. No vendor lock-in. Deploy on your infrastructure.
```shell
helm repo add kubeswarm https://kubeswarm.github.io/helm-charts && helm install kubeswarm kubeswarm/kubeswarm
```

Get in touch
Running LLM workloads on Kubernetes? We'd like to hear about your use case.