griddynamics/specflow

Agentic Harness for Large-Scale Code Generation

License: MIT PyPI version Python 3.13 Publish to PyPI


AI agent harness for automated code generation and complexity estimation. It produces multiple deployable codebases built by multiple SOTA AI models. When their code complexity scores align, that is strong evidence the specs are complete.

SpecFlow is an AI agent harness that automates code generation, deployment, and testing through parallel AI agents in isolated, sandboxed execution environments.

Validator agents continuously assess, resume, and refine work until delivery standards are met.

Getting Started

Software

| Requirement | Notes |
|---|---|
| Docker | Container runtime for the harness sandbox. Install Docker |
| uv | Python package manager. Install via brew install uv or see docs |
| IDE | SpecFlow is used as an MCP server in an IDE with agentic AI enabled: Cursor, Claude Code, Copilot, Gemini, etc. This is the user's project. |

Keys and Tokens

| Key | Name in .env | Notes |
|---|---|---|
| GitHub Personal Access Token | GITHUB_TOKEN | For disposable workspace repos. Scope: repo + read:user + workflow repo,read:user. Advan |
| P10Y API key | P10Y_API_KEY | Code complexity scoring. Setup guide |
| LLM provider key | OPENROUTER_API_KEY or ANTHROPIC_API_KEY | One key required. Get OpenRouter key (default) or Get Anthropic key |
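
As a quick sanity check before starting the harness, the key requirements above can be verified with a small script. This is a minimal sketch, not part of SpecFlow: the key names come from the table, while the location of the .env file (current directory), the simplified KEY=VALUE parsing, and the helper names `parse_env` and `missing_keys` are assumptions made for illustration.

```python
from pathlib import Path

# Key names are taken from the table above.
REQUIRED = ("GITHUB_TOKEN", "P10Y_API_KEY")
# At least one of these LLM provider keys must be set.
ONE_OF = ("OPENROUTER_API_KEY", "ANTHROPIC_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments.

    Simplified on purpose: no multi-line values, no variable expansion.
    """
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str]) -> list[str]:
    """Return the names of keys that are absent or empty."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if not any(env.get(key) for key in ONE_OF):
        missing.append(" or ".join(ONE_OF))
    return missing


if __name__ == "__main__":
    env_file = Path(".env")  # assumed location: current working directory
    text = env_file.read_text() if env_file.exists() else ""
    problems = missing_keys(parse_env(text))
    print("OK" if not problems else f"Missing: {', '.join(problems)}")
```

The script only checks that the keys are present; it does not validate token scopes or contact any provider.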

Installation

A few simple steps to get you going:

  • Clone the repo:
    git clone https://github.com/griddynamics/specflow.git && cd specflow
  • Install SpecFlow (includes the Terminal UI that guides you through onboarding):
    uv tool install --editable ./mcp_server
  • Start the SpecFlow app and follow the instructions:
    specflow tui
    

Important

The SpecFlow Harness Sandbox is now running locally. Access it via MCP in your favourite agentic client by copying the content of .specflow-local/mcp-config.json into that client's MCP configuration.

  • Cursor
  • Claude Code
  • Claude Desktop
  • GitHub Copilot
  • Gemini CLI

...and any other IDE or client that supports the Model Context Protocol.

Usage

MCP is now ready to use in any project. Prompt your IDE agent to talk to the harness.

Assuming your specification files live in a specs directory, follow these steps:

  1. Start a new project in your IDE and put your specification files into the specs directory

    specs/
    |-- product-requirements.md
    |-- user-flows.pdf
    \-- acceptance-criteria.md
    
  2. Check specification completeness using the check_specification_completeness tool

    Use SpecFlow MCP to check specification completeness in specs directory
    
  3. Create a detailed plan using our run_planning tool

    Create implementation plan using SpecFlow MCP
    
  4. When you are happy with the plan, run generation using run_generation as above

    Run generation with SpecFlow MCP
    
  5. Generation usually takes several hours (run_generation is listed at 2-8 hrs); use the TUI to monitor progress and receive desktop notifications:

    # Any terminal
    specflow tui

  6. When generation has completed, you can retrieve the results and P10Y reports from the harness:

    Download outputs using SpecFlow MCP
    

    The rule of thumb is: if the P10Y score spread is low, then your specification is ready!

  7. Use the built-in prompt to compare the variants, identify their strengths and weaknesses, and produce a plan to automatically assemble the best variant.

    use SpecFlow MCP prompt: specflow-compare-variants
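
The rule of thumb about P10Y score spread in the steps above can be made concrete with a small calculation. The sketch below is illustrative only: it assumes each variant yields a single numeric complexity score, it defines "spread" as (max - min) / mean, and it uses a made-up threshold of 0.15. None of these choices come from SpecFlow or P10Y, and the variant names are hypothetical.

```python
from statistics import mean


def relative_spread(scores: dict[str, float]) -> float:
    """Return (max - min) / mean over the per-variant scores.

    This definition of spread is an assumption chosen for illustration.
    """
    values = list(scores.values())
    if len(values) < 2:
        raise ValueError("need at least two variants to measure spread")
    average = mean(values)
    if average == 0:
        raise ValueError("mean score is zero; relative spread is undefined")
    return (max(values) - min(values)) / average


def spec_looks_ready(scores: dict[str, float], threshold: float = 0.15) -> bool:
    """Apply the rule of thumb: a low spread suggests the spec is ready.

    The default threshold is hypothetical; tune it to your own projects.
    """
    return relative_spread(scores) <= threshold


if __name__ == "__main__":
    # Hypothetical scores for three generated variants.
    example = {"variant-a": 100.0, "variant-b": 110.0, "variant-c": 90.0}
    print(f"spread={relative_spread(example):.2f}", spec_looks_ready(example))
```

A relative measure is used so that the same threshold can apply to small and large codebases; an absolute difference would scale with project size.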
    

MCP Tools

| Tool | Description |
|---|---|
| check_specification_completeness | Analyze specs for gaps and contradictions (local) |
| run_planning | Generate a phased implementation plan (local) |
| read_document | Extract PDF/DOCX/PPTX/XLSX/CSV to markdown (local) |
| run_generation | Upload and launch parallel codegen on the backend (2-8 hrs) |
| check_status | Poll generation progress |
| download_outputs | Download archived artifacts from a completed run |
| retry_generation | Retry a failed generation |
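
Because a run can take hours, polling check_status is best done with a growing delay rather than a tight loop. The following is a generic sketch of that pattern, not SpecFlow's API: the `check_status` argument is a hypothetical callable that wraps however your client invokes the MCP tool, and the status strings "completed" and "failed" are assumed values. See docs/mcp/API_REFERENCE.md for the real response shape.

```python
import time
from typing import Callable

# Assumed terminal status values; the real tool may report different ones.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_generation(
    check_status: Callable[[], str],
    interval_s: float = 60.0,
    max_interval_s: float = 900.0,
    timeout_s: float = 8 * 3600,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until a terminal status is seen, doubling the delay each time.

    `check_status` is a hypothetical wrapper around the MCP tool call.
    `sleep` and `clock` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout_s
    delay = interval_s
    while True:
        status = check_status()
        if status in TERMINAL_STATES:
            return status
        if clock() + delay > deadline:
            raise TimeoutError(f"still '{status}' after {timeout_s} seconds")
        sleep(delay)
        delay = min(delay * 2, max_interval_s)
```

A caller would branch on the returned status, for example downloading outputs on "completed" and considering retry_generation on "failed".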

If you want to go deeper

Full MCP config and usage: MCP_USER.md

Full MCP API reference: docs/mcp/API_REFERENCE.md

Detailed SpecFlow harness instructions: QUICKSTART.md

Important

AI agents work in scratchpad repos that are reset before each run; we create them for you. **Do not point SpecFlow at repositories with code or history you want to keep.**

The managed SpecFlow service is for Grid Dynamics employees only. Open-source users should run the local quickstart.

Documentation

| Document | Description |
|---|---|
| QUICKSTART.md | Local setup and first run |
| CLAUDE.md | Development protocol and STEEL commandments |
| docs/ARCHITECTURE.md | System design and data flow |
| docs/mcp/API_REFERENCE.md | MCP tool reference |
| docs/backend/DEVELOPMENT.md | Backend development guide |
| docs/backend/API_REFERENCE.md | REST API reference |
| docs/operations/TROUBLESHOOTING.md | Troubleshooting guide |
| docs/IDE-SETUP.md | IDE configuration (Cursor + Claude Code) |

License

MIT — Copyright (c) 2024 Grid Dynamics International, Inc.
