GitHub - griddynamics/specflow: Agentic Harness for Large-Scale Code Generation

Agentic Harness for Large-Scale Code Generation

AI agent harness for automated code generation and complexity estimation. It builds multiple deployable codebases with multiple SOTA AI models; when their code complexity scores align, that is evidence that the specs are complete.

SpecFlow is an AI agent harness that automates code generation, deployment, and testing through parallel AI agents in isolated, sandboxed execution environments.

Validator agents continuously assess, resume, and refine work until delivery standards are met.

SpecFlow-demo.mp4

Getting Started

Software

Requirement	Notes
Docker	Container runtime for the harness sandbox. Install Docker
`uv`	Python package manager. Install via `brew install uv` or see docs
IDE	SpecFlow is used as an MCP server in an IDE with agentic AI enabled: Cursor, Claude Code, Copilot, Gemini, etc. This is where the user's project lives.
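
As a quick sanity check before installing, the two command-line prerequisites from the table can be looked up on `PATH`. This is a minimal sketch and not part of SpecFlow; the helper name `missing_tools` is hypothetical, and the check only confirms that the executables are on `PATH`, not that the Docker daemon is running.

```python
import shutil
from typing import Callable, Iterable, Optional


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    # "docker" and "uv" are the executables named in the Software table.
    missing = missing_tools(["docker", "uv"])
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("docker and uv are both on PATH")
```

The `which` parameter exists only so the lookup can be swapped out in tests.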

Keys and Tokens

Key	Name in `.env`	Notes
GitHub Personal Access Token	`GITHUB_TOKEN`	For disposable workspace repos. Scope: `repo` + `read:user` + `workflow` `repo,read:user`. Advan
P10Y API key	`P10Y_API_KEY`	Code complexity scoring. Setup guide
LLM provider key	`OPENROUTER_API_KEY` or `ANTHROPIC_API_KEY`	One key required. Get OpenRouter key (default) or Get Anthropic key
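
The key requirements above (two mandatory keys plus at least one LLM provider key) can be checked with a short script before starting the harness. This is a sketch under assumptions: it reads a `.env` file from the current directory, handles only plain `KEY=VALUE` lines (no `export` prefix, no multi-line values), and the function names are hypothetical rather than part of SpecFlow.

```python
from pathlib import Path

# Variable names taken from the Keys and Tokens table.
REQUIRED = ("GITHUB_TOKEN", "P10Y_API_KEY")
LLM_KEYS = ("OPENROUTER_API_KEY", "ANTHROPIC_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def env_problems(values: dict[str, str]) -> list[str]:
    """List what is missing: each required key, then the LLM provider key."""
    problems = [f"{key} is missing or empty" for key in REQUIRED if not values.get(key)]
    if not any(values.get(key) for key in LLM_KEYS):
        problems.append("set one of " + " or ".join(LLM_KEYS))
    return problems


if __name__ == "__main__":
    env_file = Path(".env")  # assumption: .env sits in the current directory
    text = env_file.read_text() if env_file.exists() else ""
    for problem in env_problems(parse_env(text)):
        print(problem)
```

An empty output means all three requirements from the table are satisfied; the script does not verify that the keys are valid or carry the right scopes.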

Installation

A few simple steps to get you going:

Clone the repo:

git clone https://github.com/griddynamics/specflow.git && cd specflow

Install SpecFlow (includes the Terminal UI that guides you through onboarding):
```
uv tool install --editable ./mcp_server
```
Start the SpecFlow app and follow the instructions:
```
specflow tui
```

Important

The SpecFlow Harness Sandbox is now running locally. Access it via MCP in your favourite agentic client: copy and paste the content of `.specflow-local/mcp-config.json` into that client's MCP configuration.

Cursor

Claude Code

Claude Desktop

Copilot

Gemini CLI

...and any other IDE or client that supports the Model Context Protocol.
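
Instead of pasting by hand, the generated config can be merged into a client's existing MCP configuration. The sketch below rests on an assumption that is not confirmed here: that `.specflow-local/mcp-config.json` uses the top-level `mcpServers` key that clients such as Cursor and Claude Desktop use in their own config files. Check the generated file first; the function names are hypothetical.

```python
import json
from pathlib import Path


def merge_mcp_servers(source: dict, target: dict) -> dict:
    """Return a copy of `target` with the `mcpServers` entries of `source` added.

    Servers already in `target` are kept; a server with the same name
    as one in `source` is overwritten by the `source` entry.
    """
    merged = dict(target)
    servers = dict(target.get("mcpServers", {}))
    servers.update(source.get("mcpServers", {}))
    merged["mcpServers"] = servers
    return merged


def merge_config_files(source_path: Path, target_path: Path) -> None:
    """Merge the JSON at `source_path` into `target_path`, creating it if needed."""
    source = json.loads(source_path.read_text())
    target = json.loads(target_path.read_text()) if target_path.exists() else {}
    target_path.parent.mkdir(parents=True, exist_ok=True)
    target_path.write_text(json.dumps(merge_mcp_servers(source, target), indent=2) + "\n")
```

For a project-level Cursor setup this could be called as `merge_config_files(Path(".specflow-local/mcp-config.json"), Path(".cursor/mcp.json"))`; other clients keep their configuration elsewhere.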

Usage

MCP is now ready to use in any project. Prompt your IDE agent to talk to the harness.

Assuming your specification files live in a `specs` directory, follow these steps:

Start a new project in your IDE and put your specification files into the `specs` directory:

specs/
|-- product-requirements.md
|-- user-flows.pdf
\-- acceptance-criteria.md

Check the completeness of your specification with the `check_specification_completeness` tool:
```
Use SpecFlow MCP to check specification completeness in specs directory
```
Create a detailed plan with the `run_planning` tool:
```
Create implementation plan using SpecFlow MCP
```
When you are happy with the plan, start generation with the `run_generation` tool:
```
Run generation with SpecFlow MCP
```
Generation usually takes many hours. Use the TUI to monitor progress and receive desktop notifications:
```
# Any terminal
specflow tui
```

When generation has completed, you can retrieve the results and P10Y reports from the harness:
```
Download outputs using SpecFlow MCP
```
The rule of thumb: if the P10Y score spread across the generated variants is low, your specification is ready.
Use the built-in prompt to compare the variants, identify their strengths and weaknesses, and produce a plan to automatically assemble the best variant.
```
use SpecFlow MCP prompt: specflow-compare-variants
```
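
To make the score-spread rule of thumb concrete, here is one way to quantify "spread" once you have one P10Y score per variant. This is an illustrative sketch only: the relative-spread formula, the 0.15 threshold, the variant names, and the example scores are all assumptions, not values defined by SpecFlow or P10Y.

```python
def score_spread(scores: dict[str, float]) -> float:
    """Relative spread of variant scores: (max - min) / mean."""
    if len(scores) < 2:
        raise ValueError("need at least two variants to compare")
    values = list(scores.values())
    mean = sum(values) / len(values)
    if mean == 0:
        raise ValueError("mean score is zero; relative spread is undefined")
    return (max(values) - min(values)) / mean


def specs_look_ready(scores: dict[str, float], max_spread: float = 0.15) -> bool:
    """True when the variants agree within `max_spread` (an arbitrary cutoff)."""
    return score_spread(scores) <= max_spread


if __name__ == "__main__":
    # Made-up scores for three hypothetical variants.
    scores = {"variant-a": 100.0, "variant-b": 110.0, "variant-c": 105.0}
    print(f"spread = {score_spread(scores):.3f}, ready = {specs_look_ready(scores)}")
```

A relative measure is used so that the same threshold applies to small and large projects; an absolute difference or a standard deviation would work just as well if that matches how you read the P10Y reports.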

MCP Tools

Tool	Description
`check_specification_completeness`	Analyze specs for gaps and contradictions (local)
`run_planning`	Generate a phased implementation plan (local)
`read_document`	Extract PDF/DOCX/PPTX/XLSX/CSV to markdown (local)
`run_generation`	Upload and launch parallel codegen on the backend (2-8 hrs)
`check_status`	Poll generation progress
`download_outputs`	Download archived artifacts from a completed run
`retry_generation`	Retry a failed generation
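
Because `run_generation` runs for hours and `check_status` is a polling tool, the interaction follows a standard poll-until-done pattern. The sketch below shows that pattern in isolation. The tools are normally invoked by your IDE agent, so `check_status` here is a stand-in callable you would supply, and the status strings `"completed"` and `"failed"` are hypothetical; see docs/mcp/API_REFERENCE.md for what the tool actually returns.

```python
import time
from typing import Callable

# Assumption: hypothetical names for the states in which polling should stop.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_generation(
    check_status: Callable[[], str],
    interval_s: float = 300.0,
    timeout_s: float = 8 * 3600.0,  # upper end of the 2-8 hour estimate
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll `check_status` until it reports a terminal state or time runs out."""
    waited = 0.0
    while True:
        status = check_status()
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"still '{status}' after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

In this sketch a `"completed"` result would be followed by `download_outputs`, and a `"failed"` result by `retry_generation`.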

If you want to go deeper

SpecFlow Detailed Overview

SpecFlow.Detailed.Look.mp4

Full MCP config and usage: MCP_USER.md

Full MCP API reference: docs/mcp/API_REFERENCE.md

Detailed SpecFlow harness instructions: QUICKSTART.md

Important

AI agents work in scratchpad repos that are reset before each run; we create them for you. **Do not point SpecFlow at repositories with code or history you want to keep.** The managed SpecFlow service is for Grid Dynamics employees only. Open-source users should run the local quickstart.

Documentation

Document	Description
QUICKSTART.md	Local setup and first run
CLAUDE.md	Development protocol and STEEL commandments
docs/ARCHITECTURE.md	System design and data flow
docs/mcp/API_REFERENCE.md	MCP tool reference
docs/backend/DEVELOPMENT.md	Backend development guide
docs/backend/API_REFERENCE.md	REST API reference
docs/operations/TROUBLESHOOTING.md	Troubleshooting guide
docs/IDE-SETUP.md	IDE configuration (Cursor + Claude Code)

License
