Documentation
Use the CLI locally and the GitHub Action in CI. Node/TypeScript is supported today; Python and Go are coming soon.
Quick Start
1. Install PromptProof
```shell
npm install -g promptproof-cli
```

Or use npx (no global install):

```shell
npx promptproof-cli@latest eval -c promptproof.yaml --out report
```

2. Initialize Your Project

```shell
promptproof init
```

This creates a promptproof.yaml config file with example tests.
3. Write Your First Contract
```yaml
schema_version: pp.v1
fixtures: fixtures/support-replies/outputs.jsonl
checks:
  - id: no_pii
    type: regex_forbidden
    target: output.text
    patterns:
      - '[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}'
```

4. Run Evaluation

```shell
promptproof eval -c promptproof.yaml --out report --format html
```

Test Structure
Tests are defined in promptproof.yaml and executed against recorded fixtures.
```yaml
schema_version: pp.v1
fixtures: fixtures/support-replies/outputs.jsonl
checks:
  - id: response_schema
    type: json_schema
    target: output.json
    schema:
      type: object
      required: [status, message]
```

Assertions (Checks)
Built-in checks: json_schema, regex_forbidden, regex_required, numeric_bounds, string_contains, string_equals, list_equality, set_equality, file_diff, custom_fn.
Fixtures
Fixtures are sanitized JSONL records produced by the SDK (or your ingestion) under fixtures/<suite>/outputs.jsonl. CI replays these offline for deterministic evaluation.
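The exact record layout is whatever the SDK sanitizes and writes. Purely as an illustration (every field name below is an assumption inferred from the target paths such as output.text and output.json used by the checks, not a documented schema), one JSONL line might look like:

```json
{"input": {"prompt": "How do I contact support?"}, "output": {"text": "We cannot share personal contact information.", "json": {"status": "success", "message": "ok"}}}
```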
Policies (promptproof.yaml)
```yaml
budgets:
  cost_usd_per_run_max: 0.50
  latency_ms_p95_max: 2000
mode: fail
```
JSON Schema
```yaml
checks:
  - id: response_schema
    type: json_schema
    target: output.json
    schema:
      type: object
      required: [status, data]
      properties:
        status: { type: string, enum: [success, error] }
```

Regex (Forbidden / Required)
```yaml
checks:
  - id: no_pii
    type: regex_forbidden
    target: output.text
    patterns:
      - '[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}'
  - id: disclaimer_en
    type: regex_required
    target: output.text
    pattern: 'We cannot share personal contact information\.'
```

Numeric Bounds
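Outside of PromptProof, the forbidden pattern can be sanity-checked directly in Node. This is only a standalone sketch; the case-insensitive flag here is an assumption, since this doc does not state PromptProof's matching semantics:

```typescript
// Standalone sanity check of the forbidden-email pattern used above.
// NOTE: the `i` flag is an assumption; PromptProof's own matching
// semantics (case sensitivity, anchoring) are not specified here.
const emailPattern = /[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}/i

console.log(emailPattern.test('Reach me at jane.doe@example.com'))          // true
console.log(emailPattern.test('We cannot share personal contact details.')) // false
```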
```yaml
checks:
  - id: confidence_bounds
    type: numeric_bounds
    target: output.json.confidence
    min: 0
    max: 1
```

String Contains / Equals
```yaml
checks:
  - id: has_disclaimer
    type: string_contains
    target: output.text
    expected: "We cannot guarantee"
  - id: exact_title
    type: string_equals
    target: output.json.title
    expected: "Summary"
```

List / Set Equality
```yaml
checks:
  - id: response_list_exact
    type: list_equality
    target: output.json.items
    expected: ["step1", "step2", "step3"]
    order_sensitive: true
  - id: tags_any_order
    type: set_equality
    target: output.json.tags
    expected: ["a", "b", "c"]
```

File Diff
```yaml
checks:
  - id: file_diff
    type: file_diff
    expected_file: fixtures/before.txt
    actual_file: fixtures/after.txt
```

Custom Function
```yaml
checks:
  - id: valid_tool_use
    type: custom_fn
    target: output.json
    fn: |
      const allowed = ['send_email', 'check_calendar']
      return allowed.includes(value.tool)
```

CLI Commands
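The fn body above runs against the resolved target, exposed as value. Its logic can be exercised standalone; this sketch simply mirrors the snippet in plain TypeScript and is not the PromptProof runtime:

```typescript
// Standalone version of the custom_fn body above, for quick local testing.
// Inside PromptProof the snippet receives the resolved target as `value`;
// here we model that argument explicitly.
type ToolCall = { tool: string }

const allowed = ['send_email', 'check_calendar']
const isValidToolUse = (value: ToolCall): boolean => allowed.includes(value.tool)

console.log(isValidToolUse({ tool: 'send_email' }))   // true
console.log(isValidToolUse({ tool: 'delete_files' })) // false
```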
```shell
# Evaluate fixtures
promptproof eval -c promptproof.yaml

# Regression comparison against baseline
promptproof eval -c promptproof.yaml --regress

# Control flakiness
promptproof eval -c promptproof.yaml --seed 42 --runs 3

# Create/promote snapshot baseline
promptproof snapshot promptproof.yaml --promote

# Initialize project
promptproof init
```
Examples
See the demo project and fixtures in the main repository for real-world cases.
CI/CD Setup (GitHub Actions)
Use the marketplace action to evaluate fixtures on every PR. No live model calls in CI.
```yaml
name: PromptProof
on: [pull_request]
jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: geminimir/promptproof-action@v0
        with:
          config: promptproof.yaml
          format: html
          mode: gate
```

Node.js SDK (Recording)
Automatically record LLM calls to fixtures for deterministic CI replay.
```shell
npm install promptproof-sdk-node@beta
```

```typescript
// OpenAI
import OpenAI from 'openai'
import { withPromptProofOpenAI } from 'promptproof-sdk-node/openai'

const ai = withPromptProofOpenAI(
  new OpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  { suite: 'support-replies' }
)
```

Demo Project
A complete example with SDK, CLI, and Action wired.
Open demo repository