CLI Reference

All AgentSpec CLI commands. Run them as agentspec <command>. To install the CLI globally: npm i -g @agentspec/cli.

LLM requirements at a glance

| Command | Requires LLM API key? |
| --- | --- |
| `init`, `validate`, `health`, `audit` | ✅ No — fully local |
| `export`, `diff`, `migrate` | ✅ No — fully local |
| `generate` | 🔑 Yes |
| `scan` | 🔑 Yes |
| `evaluate` | ✅ No (calls your agent's own endpoint) |
| `generate --deploy k8s` | ✅ No — deterministic templates |

`agentspec init`

Interactive wizard to create agent.yaml.

bash

agentspec init [dir]

Options:

--yes — skip prompts, create a minimal manifest

`agentspec validate`

Validates the manifest against the schema. Fully local, with no calls to external services, so it is safe for pre-commit hooks.

bash

agentspec validate <file>
agentspec validate agent.yaml --json

Options:

--json — output validation result as JSON

Exit codes: 0 = valid, 1 = invalid
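
Because the result is reported through the exit code alone, validate can gate a commit. The following is a minimal sketch of such a check, not part of the CLI: the function name validate_manifests and the hook wiring in the trailing comment are assumptions for illustration.

```shell
# Hypothetical helper: validate every manifest passed as an argument.
# Relies only on the documented exit codes (0 = valid, 1 = invalid).
validate_manifests() {
  failed=0
  for manifest in "$@"; do
    if agentspec validate "$manifest" >/dev/null 2>&1; then
      echo "ok      $manifest"
    else
      echo "invalid $manifest"
      failed=1
    fi
  done
  return "$failed"
}

# Possible use inside .git/hooks/pre-commit (assumes manifests are named agent.yaml):
#   validate_manifests $(git diff --cached --name-only --diff-filter=ACM | grep 'agent\.yaml$') || exit 1
```

The helper keeps going after the first invalid file so that one commit attempt reports every broken manifest.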

`agentspec health`

Runtime health checks — calls external services.

bash

agentspec health <file>
agentspec health agent.yaml --json
agentspec health agent.yaml --fail-on warning    # exit 1 on warnings
agentspec health agent.yaml --no-model           # skip model API check
agentspec health agent.yaml --no-mcp             # skip MCP checks
agentspec health agent.yaml --no-memory          # skip memory checks

Options:

--json — output as JSON
--format json|table — output format (default: table)
--fail-on error|warning|info — exit 1 threshold (default: error)
--no-model — skip model API reachability
--no-mcp — skip MCP server checks
--no-memory — skip memory backend checks

Exit codes: 0 = healthy/degraded (by default), 1 = fails --fail-on threshold
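
Since health calls external services, a deploy script may want to retry while dependencies start up. The sketch below polls the command until it exits 0. The function name wait_until_healthy and its attempts/delay arguments are hypothetical. Only the documented --fail-on flag and exit codes are taken from the CLI.

```shell
# Hypothetical helper: wait_until_healthy <manifest> <attempts> <delay-seconds>
# Retries `agentspec health` until it exits 0 or the attempts run out.
wait_until_healthy() {
  manifest=$1
  attempts=$2
  delay=$3
  i=1
  while [ "$i" -le "$attempts" ]; do
    if agentspec health "$manifest" --fail-on error >/dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    # Sleep only when another attempt will follow.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  echo "still failing after $attempts attempt(s)" >&2
  return 1
}

# Example: wait_until_healthy agent.yaml 5 10
```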

`agentspec audit`

Compliance audit against configured packs.

bash

agentspec audit <file>
agentspec audit agent.yaml --pack owasp-llm-top10
agentspec audit agent.yaml --json --output report.json
agentspec audit agent.yaml --fail-below 70

# Dual score: fetch proof records from sidecar and compute proved score
agentspec audit agent.yaml --url http://localhost:4001
agentspec audit agent.yaml --url http://localhost:4001 --json

Options:

--pack <pack> — run only this pack
--url <url> — sidecar base URL; fetches GET /proof and merges proof records to compute provedScore
--json — output as JSON
--output <file> — write JSON report to file
--fail-below <score> — exit 1 if declared score < threshold

With --url, the audit report includes two scores:

  Declared score : D  65/100  — what your spec says
  Proved score   : F  35/100  — what has been verified
  Pending proof  : 4 rules — run external tools and POST to http://localhost:4001/proof/rule/:ruleId

JSON output with --url:

json

{
  "overallScore": 65,
  "grade": "D",
  "provedScore": 35,
  "provedGrade": "F",
  "pendingProofCount": 4,
  "violations": [...]
}

The provedScore counts only rules verified by:

[P] probed rules that pass their health check
[B] behavioral rules observed via EventPush
[X] external rules with a proof record submitted via POST /proof/rule/:ruleId

Exit codes: 0 = audit complete (check score), 1 = below threshold

See Proof Integration Guide for how to submit proof records from external tools.
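
The --fail-below option gates on the declared score only. A pipeline that should also gate on the proved score can read provedScore from the JSON report. The sketch below does this with sed instead of a JSON parser, so it assumes the field is a plain integer as in the sample report. Both function names are hypothetical.

```shell
# Hypothetical helper: read an audit JSON report on stdin and print provedScore.
# Pattern matching stands in for a real JSON parser here.
proved_score() {
  sed -n 's/.*"provedScore"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical gate: succeed only when provedScore is present and >= the minimum.
proved_score_at_least() {
  min=$1
  score=$(proved_score)
  [ -n "$score" ] && [ "$score" -ge "$min" ]
}

# Example:
#   agentspec audit agent.yaml --url http://localhost:4001 --json | proved_score_at_least 50 || exit 1
```

A report without provedScore (for example, one produced without --url) fails the gate rather than passing silently.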

`agentspec generate`

🔑 Requires LLM access (an API key or a Claude subscription). Generates framework-specific agent code using a codegen provider.

bash

agentspec generate <file> --framework <fw> --output <dir>
agentspec generate agent.yaml --framework langgraph --output ./generated/
agentspec generate agent.yaml --framework crewai --output ./generated/
agentspec generate agent.yaml --framework langgraph --dry-run

Options:

--framework <fw> — required: langgraph | crewai | mastra | autogen
--output <dir> — output directory (default: ./generated)
--dry-run — print files without writing
--deploy <target> — also generate deployment manifests: k8s | helm
--push — write .env.agentspec with push mode env var placeholders (AGENTSPEC_URL, AGENTSPEC_KEY)
--provider <name> — override codegen provider: claude-sub | anthropic-api | openai-compatible

Requires a codegen provider. Generation uses an LLM to reason over every manifest field and produce complete, production-ready code. Three providers are supported (auto-detected):

bash

# Option A: Claude subscription (Pro / Max), no API key needed
claude auth login
agentspec generate agent.yaml --framework langgraph

# Option B: any OpenAI-compatible endpoint (OpenRouter, Groq, Together, Ollama, OpenAI, ...)
export AGENTSPEC_LLM_API_KEY=sk-or-v1-...
export AGENTSPEC_LLM_MODEL=qwen/qwen3-235b-a22b
export AGENTSPEC_LLM_BASE_URL=https://openrouter.ai/api/v1
agentspec generate agent.yaml --framework langgraph

# Option C: Anthropic API key
export ANTHROPIC_API_KEY=sk-ant-...
agentspec generate agent.yaml --framework langgraph

Check which method is active: agentspec provider-status

Optional env vars:

| Variable | Default | Description |
| --- | --- | --- |
| `AGENTSPEC_CODEGEN_PROVIDER` | `auto` | Force provider: `claude-sub`, `anthropic-api`, or `openai-compatible` |
| `ANTHROPIC_MODEL` | `claude-opus-4-6` (API), `claude-sonnet-4-6` (subscription) | Model used for generation (Anthropic providers) |
| `ANTHROPIC_BASE_URL` | Anthropic API | Custom proxy or private endpoint (API mode only) |
| `AGENTSPEC_LLM_API_KEY` | (none) | API key for OpenAI-compatible provider. Use a dummy string for local Ollama. |
| `AGENTSPEC_LLM_MODEL` | (none, required when API key is set) | Model ID on the OpenAI-compatible endpoint |
| `AGENTSPEC_LLM_BASE_URL` | `https://api.openai.com/v1` | Endpoint root for OpenAI-compatible provider. Include `/v1`. |

bash

# Use a faster/cheaper model
export ANTHROPIC_MODEL=claude-sonnet-4-6
# Force API mode in CI
export AGENTSPEC_CODEGEN_PROVIDER=anthropic-api

agentspec generate agent.yaml --framework langgraph
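
The OpenAI-compatible variables have two constraints that are easy to miss: the model is required once the key is set, and the base URL should include /v1. The sketch below checks both before a run. The function name check_openai_compatible_env is hypothetical, and treating "include /v1" as "ends with /v1" is an assumption.

```shell
# Hypothetical preflight for the OpenAI-compatible provider variables.
# Prints a message and returns 1 when a documented constraint is not met.
check_openai_compatible_env() {
  if [ -z "${AGENTSPEC_LLM_API_KEY:-}" ]; then
    echo "AGENTSPEC_LLM_API_KEY is not set"
    return 1
  fi
  if [ -z "${AGENTSPEC_LLM_MODEL:-}" ]; then
    echo "AGENTSPEC_LLM_MODEL is required when AGENTSPEC_LLM_API_KEY is set"
    return 1
  fi
  # The base URL is optional; fall back to the documented default when unset.
  case "${AGENTSPEC_LLM_BASE_URL:-https://api.openai.com/v1}" in
    */v1|*/v1/) ;;
    *) echo "AGENTSPEC_LLM_BASE_URL should include /v1"; return 1 ;;
  esac
  echo "openai-compatible settings look complete"
}

# Example: check_openai_compatible_env && agentspec generate agent.yaml --framework langgraph
```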

`--deploy k8s`

Generates plain Kubernetes manifests alongside (or instead of) framework code. Does not require an LLM API key: the output comes from deterministic templates.

bash

# Framework code + k8s manifests in one pass
agentspec generate agent.yaml --framework langgraph --deploy k8s

# k8s only (no framework code)
agentspec generate agent.yaml --framework langgraph --deploy k8s --output ./k8s-out/

Files written under <output>/k8s/:

| File | Contents |
| --- | --- |
| `k8s/deployment.yaml` | Agent container + `agentspec-sidecar` sidecar (ports 4000/4001) |
| `k8s/service.yaml` | ClusterIP exposing agent, proxy (4000), and control (4001) ports |
| `k8s/configmap.yaml` | Non-secret config: `AGENT_NAME`, `MODEL_PROVIDER`, `MODEL_ID` |
| `k8s/secret.yaml.example` | Template listing every `$env:` ref — fill with real values and apply separately |

bash

# Apply to a cluster
kubectl apply -f ./generated/k8s/configmap.yaml
# Fill in real values first:
cp ./generated/k8s/secret.yaml.example ./generated/k8s/secret.yaml
# Edit secret.yaml with base64-encoded values, then:
kubectl apply -f ./generated/k8s/secret.yaml
kubectl apply -f ./generated/k8s/deployment.yaml
kubectl apply -f ./generated/k8s/service.yaml
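
When filling in secret.yaml, a common slip is encoding a trailing newline along with the value. A small sketch of a helper that avoids it follows; the name encode_secret_value is hypothetical.

```shell
# Hypothetical helper: base64-encode one value for secret.yaml.
# printf avoids the trailing newline that echo would add to the encoded data,
# and tr removes the line wrapping some base64 implementations apply to long input.
encode_secret_value() {
  printf '%s' "$1" | base64 | tr -d '\n'
  echo
}

# Example: encode_secret_value "$ANTHROPIC_API_KEY"
```

Paste the printed string as the value of the matching key in secret.yaml before applying it.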

`--deploy helm`

Generates a full Helm chart using a codegen provider.

bash

agentspec generate agent.yaml --framework langgraph --deploy helm

Writes a complete Helm chart (Chart.yaml, values.yaml, templates/, _helpers.tpl, README.md) alongside the framework code. The chart always includes agentspec-sidecar as a sidecar container.

bash

helm install my-agent ./generated/ -f generated/values.yaml

Exit codes: 0 = files written, 1 = unknown framework/deploy target, missing API key, or generation error.

`agentspec export`

Export manifest to other formats.

bash

agentspec export <file> --format agentcard
agentspec export <file> --format agents-md-block

Options:

--format agentcard — Google A2A/AgentCard JSON
--format agents-md-block — AGENTS.md reference block (markdown)

`agentspec scan`

🔑 Requires LLM access (an API key or a Claude subscription). Scans a source directory and generates an agent.yaml manifest using a codegen provider.

bash

agentspec scan --dir ./src/
agentspec scan --dir ./src/ --out agent.yaml        # explicit output path
agentspec scan --dir ./src/ --update                # overwrite existing agent.yaml
agentspec scan --dir ./src/ --dry-run               # print to stdout, don't write

Options:

--dir <path> — required: source directory to scan
--out <path> — explicit output path (default: ./agent.yaml or ./agent.yaml.new)
--update — overwrite existing agent.yaml in place (default: writes agent.yaml.new)
--dry-run — print generated YAML to stdout without writing any file
--provider <name> — override codegen provider: claude-sub | anthropic-api | openai-compatible

Output path logic:

| Situation | File written |
| --- | --- |
| No existing `agent.yaml` | `agent.yaml` |
| Existing `agent.yaml`, no `--update` | `agent.yaml.new` (original untouched) |
| Existing `agent.yaml` + `--update` | `agent.yaml` (overwritten) |
| `--out <path>` | that path, always |
| `--dry-run` | stdout only |
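
The same decision table can be written out as a small function, which may help when scripting around scan. This is a restatement for illustration, not the CLI's code: the function name scan_target and its argument convention are hypothetical, and the precedence of --dry-run over --out is an assumption the table does not settle.

```shell
# Hypothetical restatement of the output path rules. Arguments:
#   $1 = "yes" if agent.yaml already exists, otherwise "no"
#   $2 = "yes" if --update was passed
#   $3 = value of --out, or empty
#   $4 = "yes" if --dry-run was passed
scan_target() {
  if [ "$4" = "yes" ]; then
    echo "stdout"            # assumption: --dry-run wins over --out
  elif [ -n "$3" ]; then
    echo "$3"                # --out <path>: that path, always
  elif [ "$1" = "yes" ] && [ "$2" != "yes" ]; then
    echo "agent.yaml.new"    # existing manifest is left untouched
  else
    echo "agent.yaml"        # new file, or overwrite with --update
  fi
}

# Example: scan_target yes no "" no
```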

What the LLM detects:

| Pattern in source | Manifest field |
| --- | --- |
| `import openai` / `ChatOpenAI(model=…)` | `spec.model.provider`, `spec.model.name` |
| `os.getenv("OPENAI_API_KEY")` | `spec.model.apiKey: $env:OPENAI_API_KEY` |
| `@tool` decorator, `Tool(name=…)` | `spec.tools[]` |
| `MCPClient(…)` config | `spec.mcp[]` |
| Content filter / rate limiter import | `spec.guardrails.*` |
| `import deepeval` / `import pytest` | `spec.eval.hooks[]` |
| Redis / Postgres / vector store import | `spec.memory.backend` |

Scans .py, .ts, .js, .mjs, .cjs files only. Excludes node_modules/, .git/, dist/, .venv/ and other non-user directories. Caps at 50 files and 200 KB of source content per scan.
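
To estimate whether a directory fits within these caps before spending an LLM call, the file selection can be approximated with find. This is a rough sketch: the function names are hypothetical, only the four excluded directories named above are pruned (the CLI's full exclusion list is not reproduced here), and 200 KB is taken as 204800 bytes, which is an assumption.

```shell
# Hypothetical pre-check: list files with the scanned extensions,
# skipping the excluded directories named in this section.
scannable_files() {
  find "$1" \
    \( -name node_modules -o -name .git -o -name dist -o -name .venv \) -prune -o \
    -type f \( -name '*.py' -o -name '*.ts' -o -name '*.js' -o -name '*.mjs' -o -name '*.cjs' \) -print
}

# Print the file count and total bytes; succeed only when both are within the caps.
scan_budget() {
  count=$(scannable_files "$1" | wc -l | tr -d ' ')
  bytes=$(scannable_files "$1" | while IFS= read -r f; do cat "$f"; done | wc -c | tr -d ' ')
  echo "files=$count bytes=$bytes"
  [ "$count" -le 50 ] && [ "$bytes" -le 204800 ]
}

# Example: scan_budget ./src/ || echo "consider scanning a narrower directory"
```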

Requires a codegen provider. Uses the same auto-detection as generate.

bash

# Option A: Claude subscription
claude auth login
agentspec scan --dir ./src/ --dry-run   # preview before writing
agentspec scan --dir ./src/             # write agent.yaml

# Option B: any OpenAI-compatible endpoint
export AGENTSPEC_LLM_API_KEY=sk-or-v1-...
export AGENTSPEC_LLM_MODEL=qwen/qwen3-235b-a22b
export AGENTSPEC_LLM_BASE_URL=https://openrouter.ai/api/v1
agentspec scan --dir ./src/

# Option C: Anthropic API key
export ANTHROPIC_API_KEY=sk-ant-...
agentspec scan --dir ./src/

Check which method is active: agentspec provider-status

Exit codes: 0 = manifest written, 1 = auth missing or generation error.

`agentspec provider-status`

Show codegen provider status — which provider is active, account details, API key validity, and which provider generate / scan would use right now.

bash

agentspec provider-status
agentspec provider-status --json

Options:

--json — machine-readable output (useful in CI to inspect auth state)

Example output:

  AgentSpec — Provider Status
  ─────────────────────────────

Claude subscription
  ✓ Installed              yes
    Version                2.1.81 (Claude Code)
  ✓ Authenticated          yes
  ✓ Account                you@example.com
  ✓ Plan                   Claude Pro

Anthropic API
  ✗ ANTHROPIC_API_KEY      not set
  – ANTHROPIC_BASE_URL     not set (using default)

Environment & resolution
  – Provider override      not set (auto-detect)
  – Model override         not set (default: claude-opus-4-6)

  ✓ Would use: Claude subscription

──────────────────────────────────────────────────
✓ Ready — Claude subscription (Claude Pro) · you@example.com
  agentspec generate and scan will use the claude-subscription provider

What it checks:

| Section | What is probed |
| --- | --- |
| Claude subscription | `claude --version`, `claude auth status` — version, login state, account email, plan |
| Anthropic API | `ANTHROPIC_API_KEY` presence + live HTTP probe to `/v1/models`, `ANTHROPIC_BASE_URL` |
| Environment | `AGENTSPEC_CODEGEN_PROVIDER`, `ANTHROPIC_MODEL` overrides, resolved provider |

Exit codes: 0 = at least one provider is ready, 1 = no provider available.
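
In CI, the exit code makes provider-status usable as a guard in front of generate or scan. A minimal sketch, assuming a hypothetical wrapper named with_provider:

```shell
# Hypothetical CI guard: run the given command only when provider-status
# exits 0 (at least one provider is ready); otherwise report and fail.
with_provider() {
  if agentspec provider-status >/dev/null 2>&1; then
    "$@"
  else
    echo "no codegen provider available; skipping: $*" >&2
    return 1
  fi
}

# Example: with_provider agentspec generate agent.yaml --framework langgraph
```

Failing early this way keeps an authentication problem from surfacing as a less specific generation error later in the job.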

`agentspec diff`

Detect compliance drift between two agent.yaml manifests. Deterministic — no LLM required.

bash

agentspec diff agent.yaml agent.yaml.new
agentspec diff agent.yaml agent.yaml.new --json        # machine-readable output
agentspec diff agent.yaml agent.yaml.new --exit-code   # exit 1 if drift detected

Options:

--json — output diff result as JSON (useful for CI)
--exit-code — exit with code 1 if any changes are detected

Human-readable output:

agentspec diff — compliance drift analysis
══════════════════════════════════════════════════════
  Comparing: agent.yaml → agent.yaml.new

  REMOVED  spec.guardrails.content_filter         [-15 score]  HIGH
           Content filtering removed — user input reaches model unfiltered

  ADDED    spec.tools.0.name                      [+0 score]   LOW
           New tool added — verify it does not expose sensitive data

  Net score change:  -15  (100 → 85, A → B)

  Recommendation: restore spec.guardrails.content_filter before deploying
══════════════════════════════════════════════════════

JSON output schema (--json):

json

{
  "from": "agent.yaml",
  "to": "agent.yaml.new",
  "scoreFrom": 100,
  "scoreTo": 85,
  "gradeFrom": "A",
  "gradeTo": "B",
  "netScoreChange": -15,
  "changes": [
    {
      "type": "removed",
      "property": "spec.guardrails.content_filter",
      "severity": "HIGH",
      "scoreImpact": -15,
      "description": "Content filtering removed — user input reaches model unfiltered"
    }
  ]
}

Score note: scoreFrom is always 100 (relative baseline). The diff measures drift magnitude, not absolute compliance. Run agentspec audit on each file for absolute scores.

Severity levels:

| Severity | Examples | Score impact |
| --- | --- | --- |
| `HIGH` | Guardrail removed, API key reference removed | −10 to −15 |
| `MEDIUM` | Model name/provider changed, eval hooks removed | −5 to −8 |
| `LOW` | New tool added, observability removed | 0 to −3 |

Exit codes: 0 = no drift (or drift without --exit-code), 1 = drift detected with --exit-code
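
The --exit-code flag fails on any change. A pipeline that should block only on HIGH-severity drift can count those entries in the JSON output instead. The sketch below uses grep rather than a JSON parser, so it assumes the severity key is emitted as in the schema above. The function name count_high_changes is hypothetical.

```shell
# Hypothetical helper: count HIGH-severity changes in `agentspec diff --json`
# output read from stdin. Pattern matching stands in for a JSON parser.
count_high_changes() {
  grep -o '"severity"[[:space:]]*:[[:space:]]*"HIGH"' | wc -l | tr -d ' '
}

# Example:
#   high=$(agentspec diff agent.yaml agent.yaml.new --json | count_high_changes)
#   [ "$high" -eq 0 ] || { echo "$high HIGH-severity change(s)"; exit 1; }
```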

`agentspec evaluate`

Run a declared JSONL evaluation dataset against a live agent and score actual outputs.

bash

agentspec evaluate <file> --url <url> --dataset <name>
agentspec evaluate agent.yaml --url http://localhost:4000 --dataset golden-qa
agentspec evaluate agent.yaml --url http://localhost:4000 --dataset golden-qa --json
agentspec evaluate agent.yaml --url http://localhost:4000 --dataset golden-qa --sample-size 20
agentspec evaluate agent.yaml --url http://localhost:4000 --dataset golden-qa --tag planning

Options:

--url <url> — required: agent base URL (e.g. http://localhost:4000)
--dataset <name> — required: dataset name from spec.evaluation.datasets[]
--sample-size <n> — run only N randomly-selected samples (default: all)
--tag <tag> — filter samples to those with a matching tag
--timeout <ms> — per-request timeout in milliseconds (default: 10000)
--json — output JSON instead of human-readable table

Dataset format (JSONL), one sample per line:

jsonl

{"input": "What exercises for bad knees?", "expected": "low-impact"}
{"input": "Design a 5-day plan", "expected": "rest day", "tags": ["planning"]}

input — sent as the user message to POST /v1/chat (or spec.api.chatEndpoint.path)
expected — substring that must appear in the response (case-insensitive)
tags — optional; used for --tag filtering

Scoring:

| Metric | Description |
| --- | --- |
| `pass_rate` | Fraction of samples where the expected substring is found in the response (case-insensitive) |

CI gate: if spec.evaluation.ciGate: true and spec.evaluation.thresholds.pass_rate is set, the command exits 1 when the measured pass rate is below the threshold.
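
The scoring rule and the gate are simple enough to restate in a few lines, which can help when debugging a failing sample by hand. This sketch is not the CLI's implementation. All three function names are hypothetical, thresholds are written as whole-number percentages for integer arithmetic, and the case folding covers ASCII letters only.

```shell
# Hypothetical restatement of the match rule: a sample passes when the
# expected substring occurs in the response, ignoring case.
contains_ci() {
  haystack=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  needle=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  case "$haystack" in
    *"$needle"*) return 0 ;;
    *) return 1 ;;
  esac
}

# pass_rate as a percentage rounded to the nearest integer: <passed> <total>
pass_rate_percent() {
  echo $(( ($1 * 100 + $2 / 2) / $2 ))
}

# Gate: <passed> <total> <threshold-percent>; compares without rounding.
gate() {
  if [ $(( $1 * 100 )) -ge $(( $3 * $2 )) ]; then echo PASS; else echo FAIL; fi
}

# Example with hypothetical counts: 36 of 42 samples passed, threshold 80%
#   pass_rate_percent 36 42
#   gate 36 42 80
```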

Sample output:

  AgentSpec Evaluate — golden-qa
  ─────────────────────────────────────────────────────────────────
  Evaluating: golden-qa  42 samples  agent: http://localhost:4000

  ✓   1  "What exercises for bad knees?" → found "low-impact" [0.12s]
  ✓   2  "Design a 5-day plan" → found "rest day" [0.09s]
  ✗   7  "Can I train every day?" → expected "recovery" not found [0.14s]

  Results
    pass_rate     86%  (threshold: 80%)  PASS

  ciGate: PASS
  Exit code: 0

JSON output (--json):

json

{
  "dataset": "golden-qa",
  "agentUrl": "http://localhost:4000",
  "totalSamples": 42,
  "metrics": { "pass_rate": 0.86 },
  "threshold": 0.8,
  "ciGateResult": "PASS",
  "samples": [...]
}

Exit codes: 0 = evaluation complete (or ciGate not configured), 1 = ciGate threshold not met

See Probe Coverage for how agentspec evaluate fits into the evidence tier system.

`agentspec migrate`

Migrate an agent.yaml manifest to the latest schema version.

bash

agentspec migrate agent.yaml              # migrate in-place
agentspec migrate agent.yaml --dry-run   # preview changes, no files written
agentspec migrate agent.yaml -o out.yaml # write result to a different file

Options:

--dry-run — print the migrated manifest without writing any files
-o, --output <file> — write the result to a different file (default: overwrites input)

If the manifest is already at the latest version the command prints a success message and exits 0 without modifying any file.

Exit codes: 0 = already up-to-date or migrated successfully, 1 = no migration path found
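
Because migrate overwrites its input by default, a script may want to keep a copy first. A minimal sketch, assuming a hypothetical wrapper named migrate_with_backup and a .bak suffix chosen for illustration:

```shell
# Hypothetical wrapper: copy the manifest before the in-place migration and
# restore the copy if the command exits non-zero (1 = no migration path found).
migrate_with_backup() {
  manifest=$1
  cp "$manifest" "$manifest.bak" || return 1
  if agentspec migrate "$manifest"; then
    echo "migrated $manifest (backup: $manifest.bak)"
  else
    mv "$manifest.bak" "$manifest"
    echo "migration failed; restored $manifest" >&2
    return 1
  fi
}

# Example: migrate_with_backup agent.yaml
```

Running agentspec migrate agent.yaml --dry-run beforehand remains the simpler way to preview changes; the wrapper only adds a way back afterwards.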
