
CLI Reference

All AgentSpec CLI commands. Run via agentspec or install globally: npm i -g @agentspec/cli.

agentspec init

Interactive wizard to create agent.yaml.

bash
agentspec init [dir]

Options:

  • --yes — skip prompts, create a minimal manifest

agentspec validate

Validate manifest schema. No I/O — safe for pre-commit hooks.

bash
agentspec validate <file>
agentspec validate agent.yaml --json

Options:

  • --json — output validation result as JSON

Exit codes: 0 = valid, 1 = invalid
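
Because validate does no I/O, it can run in a Git pre-commit hook; a minimal hook sketch (hook path and manifest name assume repository defaults, and the agentspec CLI must be on PATH):

```shell
#!/bin/sh
# .git/hooks/pre-commit (sketch): abort the commit if the manifest is invalid.
# Relies on the documented exit codes: 0 = valid, 1 = invalid.
agentspec validate agent.yaml || {
  echo "agent.yaml failed validation; commit aborted" >&2
  exit 1
}
```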

agentspec health

Runtime health checks — calls external services.

bash
agentspec health <file>
agentspec health agent.yaml --json
agentspec health agent.yaml --fail-on warning    # exit 1 on warnings
agentspec health agent.yaml --no-model           # skip model API check
agentspec health agent.yaml --no-mcp             # skip MCP checks
agentspec health agent.yaml --no-memory          # skip memory checks

Options:

  • --json — output as JSON
  • --format json|table — output format (default: table)
  • --fail-on error|warning|info — exit 1 threshold (default: error)
  • --no-model — skip model API reachability
  • --no-mcp — skip MCP server checks
  • --no-memory — skip memory backend checks

Exit codes: 0 = healthy or degraded (with the default --fail-on error), 1 = findings meet the --fail-on threshold

agentspec audit

Compliance audit against configured packs.

bash
agentspec audit <file>
agentspec audit agent.yaml --pack owasp-llm-top10
agentspec audit agent.yaml --json --output report.json
agentspec audit agent.yaml --fail-below 70

# Dual score: fetch proof records from sidecar and compute proved score
agentspec audit agent.yaml --url http://localhost:4001
agentspec audit agent.yaml --url http://localhost:4001 --json

Options:

  • --pack <pack> — run only this pack
  • --url <url> — sidecar base URL; fetches GET /proof and merges proof records to compute provedScore
  • --json — output as JSON
  • --output <file> — write JSON report to file
  • --fail-below <score> — exit 1 if declared score < threshold

With --url, the audit report includes two scores:

  Declared score : D  65/100  — what your spec says
  Proved score   : F  35/100  — what has been verified
  Pending proof  : 4 rules — run external tools and POST to http://localhost:4001/proof/rule/:ruleId

JSON output with --url:

json
{
  "overallScore": 65,
  "grade": "D",
  "provedScore": 35,
  "provedGrade": "F",
  "pendingProofCount": 4,
  "violations": [...]
}

The provedScore counts only rules verified by:

  • [P] probed rules that pass their health check
  • [B] behavioral rules observed via EventPush
  • [X] external rules with a proof record submitted via POST /proof/rule/:ruleId

Exit codes: 0 = audit complete (inspect the score), 1 = declared score below the --fail-below threshold
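
In CI you may want to gate on provedScore rather than the declared score; a minimal sketch using jq (field names come from the --json schema above; the sample report and the threshold of 50 are arbitrary examples):

```shell
# Gate on the proved score from a saved `agentspec audit --json --url ...` report.
report='{"overallScore":65,"grade":"D","provedScore":35,"provedGrade":"F","pendingProofCount":4,"violations":[]}'
proved=$(echo "$report" | jq -r '.provedScore')
if [ "$proved" -lt 50 ]; then
  echo "proved score $proved is below 50" >&2
fi
```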

See Proof Integration Guide for how to submit proof records from external tools.

agentspec generate

Generate framework-specific agent code using Claude.

bash
agentspec generate <file> --framework <fw> --output <dir>
agentspec generate agent.yaml --framework langgraph --output ./generated/
agentspec generate agent.yaml --framework crewai --output ./generated/
agentspec generate agent.yaml --framework langgraph --dry-run

Options:

  • --framework <fw> — required: langgraph | crewai | mastra
  • --output <dir> — output directory (default: ./generated)
  • --dry-run — print files without writing
  • --deploy <target> — also generate deployment manifests: k8s | helm
  • --push — write .env.agentspec with push mode env var placeholders (AGENTSPEC_URL, AGENTSPEC_KEY)

Requires ANTHROPIC_API_KEY — generation uses Claude to reason over every manifest field and produce complete, production-ready code. Get a key at console.anthropic.com.

bash
export ANTHROPIC_API_KEY=your-api-key-here
agentspec generate agent.yaml --framework langgraph

Optional env vars:

| Variable | Default | Description |
| --- | --- | --- |
| ANTHROPIC_MODEL | claude-opus-4-6 | Claude model used for generation |
| ANTHROPIC_BASE_URL | Anthropic API | Custom proxy or private endpoint |

bash
# Use a faster/cheaper model
export ANTHROPIC_MODEL=claude-sonnet-4-6
# Route through a proxy
export ANTHROPIC_BASE_URL=https://my-proxy.example.com

agentspec generate agent.yaml --framework langgraph

--deploy k8s

Generates plain Kubernetes manifests alongside (or instead of) framework code. Does not require ANTHROPIC_API_KEY — output is deterministic.

bash
# Framework code + k8s manifests in one pass
agentspec generate agent.yaml --framework langgraph --deploy k8s

# k8s only (no framework code)
agentspec generate agent.yaml --framework langgraph --deploy k8s --output ./k8s-out/

Files written under <output>/k8s/:

| File | Contents |
| --- | --- |
| k8s/deployment.yaml | Agent container + agentspec-sidecar sidecar (ports 4000/4001) |
| k8s/service.yaml | ClusterIP exposing agent, proxy (4000), and control (4001) ports |
| k8s/configmap.yaml | Non-secret config: AGENT_NAME, MODEL_PROVIDER, MODEL_ID |
| k8s/secret.yaml.example | Template listing every $env: ref — fill with real values and apply separately |

bash
# Apply to a cluster
kubectl apply -f ./generated/k8s/configmap.yaml
# Fill in real values first:
cp ./generated/k8s/secret.yaml.example ./generated/k8s/secret.yaml
# Edit secret.yaml with base64-encoded values, then:
kubectl apply -f ./generated/k8s/secret.yaml
kubectl apply -f ./generated/k8s/deployment.yaml
kubectl apply -f ./generated/k8s/service.yaml
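
Kubernetes Secret values must be base64-encoded; one way to encode a value before pasting it into secret.yaml ('my-api-key' is a placeholder):

```shell
# printf avoids the trailing newline that echo would add to the encoded value.
encoded=$(printf '%s' 'my-api-key' | base64)
echo "$encoded"   # -> bXktYXBpLWtleQ==
```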

--deploy helm

Generates a full Helm chart using Claude. Requires ANTHROPIC_API_KEY.

bash
agentspec generate agent.yaml --framework langgraph --deploy helm

Writes a complete Helm chart (Chart.yaml, values.yaml, templates/, _helpers.tpl, README.md) alongside the framework code. The chart always includes agentspec-sidecar as a sidecar container.

bash
helm install my-agent ./generated/ -f generated/values.yaml

Exit codes: 0 = files written, 1 = unknown framework/deploy target, missing API key, or generation error.

agentspec export

Export manifest to other formats.

bash
agentspec export <file> --format agentcard
agentspec export <file> --format agents-md-block

Options:

  • --format agentcard — Google A2A/AgentCard JSON
  • --format agents-md-block — AGENTS.md reference block (markdown)

agentspec scan

Scan a source directory and generate an agent.yaml manifest using Claude.

bash
agentspec scan --dir ./src/
agentspec scan --dir ./src/ --out agent.yaml        # explicit output path
agentspec scan --dir ./src/ --update                # overwrite existing agent.yaml
agentspec scan --dir ./src/ --dry-run               # print to stdout, don't write

Options:

  • --dir <path> — required: source directory to scan
  • --out <path> — explicit output path (default: ./agent.yaml or ./agent.yaml.new)
  • --update — overwrite existing agent.yaml in place (default: writes agent.yaml.new)
  • --dry-run — print generated YAML to stdout without writing any file

Output path logic:

| Situation | File written |
| --- | --- |
| No existing agent.yaml | agent.yaml |
| Existing agent.yaml, no --update | agent.yaml.new (original untouched) |
| Existing agent.yaml + --update | agent.yaml (overwritten) |
| --out <path> | that path, always |
| --dry-run | stdout only |

What Claude detects:

| Pattern in source | Manifest field |
| --- | --- |
| import openai / ChatOpenAI(model=…) | spec.model.provider, spec.model.name |
| os.getenv("OPENAI_API_KEY") | spec.model.apiKey: $env:OPENAI_API_KEY |
| @tool decorator, Tool(name=…) | spec.tools[] |
| MCPClient(…) config | spec.mcp[] |
| Content filter / rate limiter import | spec.guardrails.* |
| import deepeval / import pytest | spec.eval.hooks[] |
| Redis / Postgres / vector store import | spec.memory.backend |

Scans .py, .ts, .js, .mjs, .cjs files only. Excludes node_modules/, .git/, dist/, .venv/ and other non-user directories. Caps at 50 files and 200 KB of source content per scan.

Requires ANTHROPIC_API_KEY.

bash
export ANTHROPIC_API_KEY=your-api-key-here
agentspec scan --dir ./src/ --dry-run   # preview before writing
agentspec scan --dir ./src/             # write agent.yaml

Exit codes: 0 = manifest written, 1 = API key missing or generation error.

agentspec diff

Detect compliance drift between two agent.yaml manifests. Deterministic — no LLM required.

bash
agentspec diff agent.yaml agent.yaml.new
agentspec diff agent.yaml agent.yaml.new --json        # machine-readable output
agentspec diff agent.yaml agent.yaml.new --exit-code   # exit 1 if drift detected

Options:

  • --json — output diff result as JSON (useful for CI)
  • --exit-code — exit with code 1 if any changes are detected

Human-readable output:

agentspec diff — compliance drift analysis
══════════════════════════════════════════════════════
  Comparing: agent.yaml → agent.yaml.new

  REMOVED  spec.guardrails.content_filter         [-15 score]  HIGH
           Content filtering removed — user input reaches model unfiltered

  ADDED    spec.tools.0.name                      [+0 score]   LOW
           New tool added — verify it does not expose sensitive data

  Net score change:  -15  (100 → 85, A → B)

  Recommendation: restore spec.guardrails.content_filter before deploying
══════════════════════════════════════════════════════

JSON output schema (--json):

json
{
  "from": "agent.yaml",
  "to": "agent.yaml.new",
  "scoreFrom": 100,
  "scoreTo": 85,
  "gradeFrom": "A",
  "gradeTo": "B",
  "netScoreChange": -15,
  "changes": [
    {
      "type": "removed",
      "property": "spec.guardrails.content_filter",
      "severity": "HIGH",
      "scoreImpact": -15,
      "description": "Content filtering removed — user input reaches model unfiltered"
    }
  ]
}
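
A CI job can act on the JSON report directly; for example, failing the build when any HIGH-severity change is present (the sample report below mirrors the schema above):

```shell
# Count HIGH-severity changes in a saved `agentspec diff --json` report.
report='{"netScoreChange":-15,"changes":[{"type":"removed","property":"spec.guardrails.content_filter","severity":"HIGH","scoreImpact":-15}]}'
high=$(echo "$report" | jq '[.changes[] | select(.severity == "HIGH")] | length')
echo "high-severity changes: $high"   # -> high-severity changes: 1
```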

Score note: scoreFrom is always 100 (relative baseline). The diff measures drift magnitude, not absolute compliance. Run agentspec audit on each file for absolute scores.

Severity levels:

| Severity | Examples | Score impact |
| --- | --- | --- |
| HIGH | Guardrail removed, API key reference removed | −10 to −15 |
| MEDIUM | Model name/provider changed, eval hooks removed | −5 to −8 |
| LOW | New tool added, observability removed | 0 to −3 |

Exit codes: 0 = no drift (or drift without --exit-code), 1 = drift detected with --exit-code

agentspec evaluate

Run a declared JSONL evaluation dataset against a live agent and score actual outputs.

bash
agentspec evaluate <file> --url <url> --dataset <name>
agentspec evaluate agent.yaml --url http://localhost:4000 --dataset golden-qa
agentspec evaluate agent.yaml --url http://localhost:4000 --dataset golden-qa --json
agentspec evaluate agent.yaml --url http://localhost:4000 --dataset golden-qa --sample-size 20
agentspec evaluate agent.yaml --url http://localhost:4000 --dataset golden-qa --tag planning

Options:

  • --url <url> — required: agent base URL (e.g. http://localhost:4000)
  • --dataset <name> — required: dataset name from spec.evaluation.datasets[]
  • --sample-size <n> — run only N randomly-selected samples (default: all)
  • --tag <tag> — filter samples to those with a matching tag
  • --timeout <ms> — per-request timeout in milliseconds (default: 10000)
  • --json — output JSON instead of human-readable table

Dataset format (JSONL): one sample per line:

jsonl
{"input": "What exercises for bad knees?", "expected": "low-impact"}
{"input": "Design a 5-day plan", "expected": "rest day", "tags": ["planning"]}
  • input — sent as the user message to POST /v1/chat (or spec.api.chatEndpoint.path)
  • expected — substring that must appear in the response (case-insensitive)
  • tags — optional; used for --tag filtering
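
A quick jq sanity check catches malformed dataset lines before a run; every line must be valid JSON with input and expected keys (the file path and contents below are examples):

```shell
# Write a tiny sample dataset, then count lines missing required keys.
cat > /tmp/golden-qa.jsonl <<'EOF'
{"input": "What exercises for bad knees?", "expected": "low-impact"}
{"input": "Design a 5-day plan", "expected": "rest day", "tags": ["planning"]}
EOF
bad=$(jq -c 'select((.input and .expected) | not)' /tmp/golden-qa.jsonl | wc -l | tr -d ' ')
echo "lines missing required keys: $bad"   # -> lines missing required keys: 0
```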

Scoring:

| Metric | Description |
| --- | --- |
| pass_rate | Fraction of samples where the expected substring is found in the response (case-insensitive) |

CI gate: if spec.evaluation.ciGate: true and spec.evaluation.thresholds.pass_rate is set, the command exits 1 when the measured pass rate is below the threshold.

Sample output:

  AgentSpec Evaluate — golden-qa
  ─────────────────────────────────────────────────────────────────
  Evaluating: golden-qa  42 samples  agent: http://localhost:4000

  ✓   1  "What exercises for bad knees?" → found "low-impact" [0.12s]
  ✓   2  "Design a 5-day plan" → found "rest day" [0.09s]
  ✗   7  "Can I train every day?" → expected "recovery" not found [0.14s]

  Results
    pass_rate     86%  (threshold: 80%)  PASS

  ciGate: PASS
  Exit code: 0

JSON output (--json):

json
{
  "dataset": "golden-qa",
  "agentUrl": "http://localhost:4000",
  "totalSamples": 42,
  "metrics": { "pass_rate": 0.86 },
  "threshold": 0.8,
  "ciGateResult": "PASS",
  "samples": [...]
}

Exit codes: 0 = evaluation complete (or ciGate not configured), 1 = ciGate threshold not met
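
The same gate check can be reproduced from a saved --json report, e.g. to post the result elsewhere in a pipeline (the sample report mirrors the schema above):

```shell
# Compare the measured pass rate to the declared threshold with jq.
report='{"dataset":"golden-qa","metrics":{"pass_rate":0.86},"threshold":0.8,"ciGateResult":"PASS"}'
ok=$(echo "$report" | jq '.metrics.pass_rate >= .threshold')
echo "gate passed: $ok"   # -> gate passed: true
```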

See Probe Coverage for how agentspec evaluate fits into the evidence tier system.

agentspec migrate

Migrate an agent.yaml manifest to the latest schema version.

bash
agentspec migrate agent.yaml              # migrate in-place
agentspec migrate agent.yaml --dry-run   # preview changes, no files written
agentspec migrate agent.yaml -o out.yaml # write result to a different file

Options:

  • --dry-run — print the migrated manifest without writing any files
  • -o, --output <file> — write the result to a different file (default: overwrites input)

If the manifest is already at the latest version, the command prints a success message and exits 0 without modifying any file.

Exit codes: 0 = already up-to-date or migrated successfully, 1 = no migration path found

Released under the Apache 2.0 License.