AI agent harness for automated code generation and complexity estimation. Multiple SOTA AI models each build a deployable codebase from the same specs. When their code complexity scores align, it indicates the specs are complete.
SpecFlow is an AI agent harness that automates code generation, deployment, and testing through parallel AI agents in isolated, sandboxed execution environments.
Validator agents continuously assess, resume, and refine work until delivery standards are met.
| Requirement | Notes |
|---|---|
| Docker | Container runtime for the harness sandbox. Install Docker |
| uv | Python package manager. Install via `brew install uv` or see docs |
| IDE | SpecFlow is used as an MCP server in an IDE with agentic AI enabled: Cursor, Claude Code, Copilot, Gemini, etc. This is the user's project. |
| Key | Name in .env | Notes |
|---|---|---|
| GitHub Personal Access Token | `GITHUB_TOKEN` | For disposable workspace repos. Scope: repo + read:user + workflow repo,read:user. Advan |
| P10Y API key | `P10Y_API_KEY` | Code complexity scoring. Setup guide |
| LLM provider key | `OPENROUTER_API_KEY` or `ANTHROPIC_API_KEY` | One key required. Get OpenRouter key (default) or Get Anthropic key |
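
Before starting the harness, it can help to confirm that `.env` contains the keys listed above. The following is a minimal sketch, not part of SpecFlow: it assumes `.env` sits in the repository root and uses plain `KEY=VALUE` lines (no `export` prefix), and the helper names `parse_env` and `missing_keys` are made up for illustration.

```python
from pathlib import Path

# Key names taken from the table above.
REQUIRED_KEYS = ("GITHUB_TOKEN", "P10Y_API_KEY")
LLM_KEYS = ("OPENROUTER_API_KEY", "ANTHROPIC_API_KEY")  # at least one is required


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(env: dict[str, str]) -> list[str]:
    """Return descriptions of keys that are absent or empty."""
    problems = [key for key in REQUIRED_KEYS if not env.get(key)]
    if not any(env.get(key) for key in LLM_KEYS):
        problems.append(" or ".join(LLM_KEYS))
    return problems


if __name__ == "__main__":
    env_path = Path(".env")  # assumed location: repository root
    env = parse_env(env_path.read_text()) if env_path.exists() else {}
    problems = missing_keys(env)
    print("All keys present" if not problems else f"Missing: {', '.join(problems)}")
```

The check only verifies that values are non-empty; it does not validate the tokens or their scopes.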
A few simple steps to get you going:
- Clone the repo: `git clone https://github.com/griddynamics/specflow.git && cd specflow`
- Install SpecFlow (includes the Terminal UI that guides you through onboarding): `uv tool install --editable ./mcp_server`
- Start the SpecFlow app and follow the instructions: `specflow tui`
> [!IMPORTANT]
> The SpecFlow Harness Sandbox is now running locally.
> Access it via MCP in your favourite agentic client: copy and paste the content of `.specflow-local/mcp-config.json` into the client's MCP configuration.
Supported clients include Cursor, Claude Code, Claude Desktop, Copilot, Gemini CLI, and any other IDE or client that supports the Model Context Protocol.
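
If you want to confirm that the generated config is valid JSON before pasting it into a client, a small script along these lines may help. It is only a sketch: the path comes from the note above, the structure of the file is not assumed, and `load_mcp_config` is a hypothetical helper name rather than a SpecFlow function.

```python
import json
from pathlib import Path


def load_mcp_config(path: Path) -> str:
    """Return the config pretty-printed; raises ValueError if it is not valid JSON."""
    data = json.loads(path.read_text(encoding="utf-8"))
    return json.dumps(data, indent=2)


if __name__ == "__main__":
    config_path = Path(".specflow-local/mcp-config.json")  # path from the note above
    if config_path.exists():
        print(load_mcp_config(config_path))
    else:
        # Assumption: the file appears once the harness has been started.
        print(f"{config_path} not found; is the harness running?")
```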
MCP is now ready to use in any project. Prompt your IDE agent to talk to the harness.
Assuming your specification files are in the `specs` directory, follow these steps:

1. Start a new project in your IDE and put your specification files into the `specs` directory, for example:
   - `specs/product-requirements.md`
   - `specs/user-flows.pdf`
   - `specs/acceptance-criteria.md`
2. Check your specification completeness using the `check_specification_completeness` tool. Example prompt: `Use SpecFlow MCP to check specification completeness in specs directory`
3. Create a detailed plan using the `run_planning` tool. Example prompt: `Create implementation plan using SpecFlow MCP`
4. When you are happy with the plan, run generation using `run_generation` in the same way. Example prompt: `Run generation with SpecFlow MCP`
5. Generation usually takes many hours. Use the TUI to monitor progress and receive desktop notifications by running `specflow tui` in any terminal.
6. When the generation has completed, retrieve the results and P10Y reports from the harness. Example prompt: `Download outputs using Specflow MCP`. The rule of thumb is: if the P10Y score spread is low, your specification is ready!
7. Use the built-in prompt to compare the variants and identify their strengths and weaknesses, together with a plan to automatically assemble the best variant. Example prompt: `use SpecFlow MCP prompt: specflow-compare-variants`
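
The "low score spread" rule of thumb can be made concrete once you have one complexity score per variant. The sketch below is an illustration only: the README does not define how spread is measured, so the coefficient of variation and the 10% threshold used here are assumptions, and the example scores are made up rather than real P10Y output.

```python
from statistics import mean, pstdev


def score_spread(scores: dict[str, float]) -> float:
    """Relative spread: population standard deviation divided by the mean score."""
    values = list(scores.values())
    if len(values) < 2:
        raise ValueError("need at least two variants to measure spread")
    average = mean(values)
    if average == 0:
        raise ValueError("mean score is zero; relative spread is undefined")
    return pstdev(values) / average


def specs_look_ready(scores: dict[str, float], threshold: float = 0.10) -> bool:
    """Apply the rule of thumb with an assumed threshold (10% by default)."""
    return score_spread(scores) <= threshold


if __name__ == "__main__":
    # Made-up scores for three variants; substitute the values from your reports.
    example = {"variant-a": 412.0, "variant-b": 398.0, "variant-c": 430.0}
    print(f"spread={score_spread(example):.3f} ready={specs_look_ready(example)}")
```

A relative measure is used so that the same threshold applies to small and large projects; pick whatever threshold matches your own tolerance.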
| Tool | Description |
|---|---|
| `check_specification_completeness` | Analyze specs for gaps and contradictions (local) |
| `run_planning` | Generate a phased implementation plan (local) |
| `read_document` | Extract PDF/DOCX/PPTX/XLSX/CSV to markdown (local) |
| `run_generation` | Upload and launch parallel codegen on the backend (2-8 hrs) |
| `check_status` | Poll generation progress |
| `download_outputs` | Download archived artifacts from a completed run |
| `retry_generation` | Retry a failed generation |
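
Because a generation run can take hours, status checks are best spaced out rather than issued in a tight loop. The following sketch shows one way to structure such a polling loop. It does not call SpecFlow: `fetch_status` is a hypothetical stand-in for whatever performs the status check, and the status names `completed` and `failed` are assumptions, not documented values.

```python
import time
from typing import Callable

# Hypothetical status names; the real values returned by the harness may differ.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_generation(
    fetch_status: Callable[[], str],
    interval_s: float = 300.0,
    timeout_s: float = 8 * 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until a terminal status is seen or the timeout would be exceeded."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        # Stop before sleeping past the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError(f"still '{status}' after {timeout_s} seconds")
        sleep(interval_s)
```

The `sleep` and `clock` parameters are injectable so the loop can be tested without waiting. The default timeout mirrors the upper end of the 2-8 hour range in the table.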
- Full MCP config and usage: MCP_USER.md
- Full MCP API reference: docs/mcp/API_REFERENCE.md
- Detailed SpecFlow harness instructions: QUICKSTART.md
> [!IMPORTANT]
> AI agents work in scratchpad repos that are reset before each run; we create them for you. **Do not point SpecFlow at repositories with code or history you want to keep.**
>
> The managed SpecFlow service is for Grid Dynamics employees only. Open-source users should run the local quickstart.
| Document | Description |
|---|---|
| QUICKSTART.md | Local setup and first run |
| CLAUDE.md | Development protocol and STEEL commandments |
| docs/ARCHITECTURE.md | System design and data flow |
| docs/mcp/API_REFERENCE.md | MCP tool reference |
| docs/backend/DEVELOPMENT.md | Backend development guide |
| docs/backend/API_REFERENCE.md | REST API reference |
| docs/operations/TROUBLESHOOTING.md | Troubleshooting guide |
| docs/IDE-SETUP.md | IDE configuration (Cursor + Claude Code) |
MIT — Copyright (c) 2024 Grid Dynamics International, Inc.

