Documentation
Use the CLI locally and the GitHub Action in CI. Node.js/TypeScript is supported today; Python and Go are coming soon.
Quick Start
1. Install PromptProof
```shell
npm install -g promptproof-cli
```
Or use npx (no global install):
```shell
npx promptproof-cli@latest eval -c promptproof.yaml --out report
```
2. Initialize Your Project
```shell
promptproof init
```
This creates a .promptproof.yml config file with example tests.
3. Write Your First Contract
```yaml
schema_version: pp.v1
fixtures: fixtures/support-replies/outputs.jsonl
checks:
  - id: no_pii
    type: regex_forbidden
    target: output.text
    patterns:
      - '[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}'
```
4. Run Evaluation
```shell
promptproof eval -c promptproof.yaml --out report --format html
```
Test Structure
Tests are defined in promptproof.yaml and executed against recorded fixtures.
```yaml
schema_version: pp.v1
fixtures: fixtures/support-replies/outputs.jsonl
checks:
  - id: response_schema
    type: json_schema
    target: output.json
    schema:
      type: object
      required: [status, message]
```
Assertions (Checks)
Built-in checks: json_schema, regex_forbidden, regex_required, numeric_bounds, string_contains, string_equals, list_equality, set_equality, file_diff, custom_fn.
Fixtures
Fixtures are sanitized JSONL records produced by the SDK (or your ingestion) under fixtures/&lt;suite&gt;/outputs.jsonl. CI replays these offline for deterministic evaluation.
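For orientation, a single fixture record could look something like the line below, one JSON object per line. The field names shown (id, input, output, meta) are illustrative assumptions; the exact schema is whatever the SDK emits:

```json
{"id": "rec_001", "input": {"prompt": "How can I contact the agent directly?"}, "output": {"text": "We cannot share personal contact information.", "json": {"status": "success"}}, "meta": {"model": "gpt-4o-mini", "latency_ms": 812, "cost_usd": 0.0004}}
```

Checks address fields inside each record via targets such as output.text or output.json.status.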
Policies (promptproof.yaml)
```yaml
budgets:
  cost_usd_per_run_max: 0.50
  latency_ms_p95_max: 2000
  mode: fail
```
JSON Schema
```yaml
checks:
  - id: response_schema
    type: json_schema
    target: output.json
    schema:
      type: object
      required: [status, data]
      properties:
        status: { type: string, enum: [success, error] }
```
Regex (Forbidden / Required)
```yaml
checks:
  - id: no_pii
    type: regex_forbidden
    target: output.text
    patterns:
      - '[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}'
  - id: disclaimer_en
    type: regex_required
    target: output.text
    pattern: 'We cannot share personal contact information\.'
```
Numeric Bounds
```yaml
checks:
  - id: confidence_bounds
    type: numeric_bounds
    target: output.json.confidence
    min: 0
    max: 1
```
String Contains / Equals
```yaml
checks:
  - id: has_disclaimer
    type: string_contains
    target: output.text
    expected: "We cannot guarantee"
  - id: exact_title
    type: string_equals
    target: output.json.title
    expected: "Summary"
```
List / Set Equality
```yaml
checks:
  - id: response_list_exact
    type: list_equality
    target: output.json.items
    expected: ["step1", "step2", "step3"]
    order_sensitive: true
  - id: tags_any_order
    type: set_equality
    target: output.json.tags
    expected: ["a", "b", "c"]
```
File Diff
```yaml
checks:
  - id: file_diff
    type: file_diff
    expected_file: fixtures/before.txt
    actual_file: fixtures/after.txt
```
Custom Function
```yaml
checks:
  - id: valid_tool_use
    type: custom_fn
    target: output.json
    fn: |
      const allowed = ['send_email', 'check_calendar']
      return allowed.includes(value.tool)
```
CLI Commands
```shell
# Evaluate fixtures
promptproof eval -c promptproof.yaml

# Regression comparison against baseline
promptproof eval -c promptproof.yaml --regress

# Control flakiness
promptproof eval -c promptproof.yaml --seed 42 --runs 3

# Create/promote snapshot baseline
promptproof snapshot promptproof.yaml --promote

# Initialize project
promptproof init
```
Examples
See the demo project and fixtures in the main repository for real-world cases.
CI/CD Setup (GitHub Actions)
Use the marketplace action to evaluate fixtures on every PR. No live model calls in CI.
```yaml
name: PromptProof
on: [pull_request]
jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: geminimir/promptproof-action@v0
        with:
          config: promptproof.yaml
          format: html
          mode: gate
```
Node.js SDK (Recording)
Automatically record LLM calls to fixtures for deterministic CI replay.
```shell
npm install promptproof-sdk-node@beta
```

```typescript
// OpenAI
import OpenAI from 'openai'
import { withPromptProofOpenAI } from 'promptproof-sdk-node/openai'

// Wrap the client so calls are recorded to the named fixture suite
const ai = withPromptProofOpenAI(
  new OpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  { suite: 'support-replies' }
)
```
Demo Project
A complete example with the SDK, CLI, and Action wired together.
Open demo repository