feat: add repeat until workflow stages #773
Fern preview: https://nvidia-preview-pr-773.docs.buildwithfern.com/nemo/datadesigner
3e1e30f to 022cac6
Greptile Summary
This PR introduces a bounded
| Filename | Overview |
|---|---|
| packages/data-designer/src/data_designer/interface/composite_workflow.py | Core change: adds RepeatUntil policy, _run_stage_until_append/discard methods, exhaustion handling, and trim logic. Logic for append/discard modes, empty-partial paths, fingerprinting, and metadata is correct and consistent. |
| packages/data-designer/src/data_designer/interface/__init__.py | Adds RepeatUntil, RepeatUntilMode, and RepeatUntilExhaustion to the public lazy-import registry and TYPE_CHECKING block. Straightforward and correct. |
| packages/data-designer/tests/interface/test_composite_workflow.py | Adds seven integration tests covering append accumulation and trim, non-empty partial, empty partial, raised exhaustion, discard retry, discard resume warning, and processor-output selection. All major code paths are exercised. |
| fern/versions/latest/pages/concepts/workflow-chaining.mdx | Adds a new "Repeating until a filtered count" section with a working code example and accurate prose describing append/discard mode semantics and exhaustion options. |
Reviews (1): Last reviewed commit: "refactor: simplify repeat until append r..."
Thanks for putting this together, @andreatgretel. This is a clean, well-bounded addition and the test coverage is genuinely thorough.
Summary
This PR adds a bounded
Findings
Warnings: Worth addressing
Suggestions: Take it or leave it
What Looks Good
Structural Impact
Reviewer interpretation: The reported HIGH risk / 103 import-direction violations are AST false positives: nearly all are
Backward compatibility:
Raw graphify analysis
Structural Impact (graphify, 2.2s)
Risk: HIGH (103 import direction violation(s))
Import Direction Violations (103)
Legal direction: interface -> engine -> config
High-Connectivity Changes
Cross-Package Dependencies
Verdict
Needs changes: only one Warning (the missing
This review was generated by an AI assistant.
📋 Summary
This PR adds a bounded `RepeatUntil` stage policy so workflow chaining can keep generating candidates until filtered output reaches a target count. It covers the exact-N rejection-sampling case from the Slack thread while keeping runs bounded, resumable in append mode, and visible in workflow metadata.
🔗 Related Issue
N/A
🔄 Changes
- `RepeatUntil`, `RepeatUntilMode`, and `RepeatUntilExhaustion` API exports.
- `CompositeWorkflow.add_stage` to accept `repeat_until` with append and discard modes.
- `output_records` when the repeat target is satisfied.
- `on_exhausted="return_partial"` complete empty when no rows pass, matching `allow_empty=True` downstream skip behavior.
- `workflow-chaining.mdx`, including append accumulation and cap semantics.
🔍 Attention Areas
- `composite_workflow.py` - Adds a public stage orchestration policy and changes stage metadata/fingerprint behavior for repeat runs.
🎬 E2E demo: exact-N filtering with workflow chaining
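To make the append-mode semantics of this demo concrete, here is a minimal standalone sketch. It does not use the Data Designer API and is not the PR's implementation: `repeat_until_append`, `generate`, `keep`, `max_attempts`, and the `"raise"` exhaustion value are hypothetical names chosen for illustration. Only `num_records`, `output_records`, and `on_exhausted="return_partial"` are taken from the description.

```python
from typing import Callable, Literal

Row = dict


def repeat_until_append(
    generate: Callable[[int], list[Row]],  # hypothetical: returns the first n stage rows
    keep: Callable[[Row], bool],           # hypothetical: the filter hook
    *,
    num_records: int,                      # per-attempt growth size
    output_records: int,                   # target count of rows that pass the filter
    max_attempts: int,                     # hypothetical bound on attempts
    on_exhausted: Literal["raise", "return_partial"] = "raise",
) -> list[Row]:
    selected: list[Row] = []
    for attempt in range(1, max_attempts + 1):
        # Append mode: request the cumulative stage size, not only the new rows.
        accumulated = generate(attempt * num_records)
        # Rerun the filter over the whole accumulated output.
        selected = [row for row in accumulated if keep(row)]
        if len(selected) >= output_records:
            # Target reached: trim to exactly output_records rows.
            return selected[:output_records]
    if on_exhausted == "return_partial":
        # May be empty; a caller would then skip downstream stages.
        return selected
    raise RuntimeError(
        f"only {len(selected)} of {output_records} rows after {max_attempts} attempts"
    )


def generate(n: int) -> list[Row]:
    # Deterministic stand-in for a generation stage.
    return [{"id": i} for i in range(n)]


rows = repeat_until_append(
    generate,
    lambda row: row["id"] % 3 == 0,
    num_records=4,
    output_records=3,
    max_attempts=5,
)
print([row["id"] for row in rows])  # [0, 3, 6]
```

In this sketch the first attempt produces 4 rows and only 2 pass the filter, so a second attempt grows the stage to 8 rows, at which point exactly 3 pass and are returned.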
- `num_records` is the per-attempt growth size.
- In `append` mode, each iteration requests the cumulative stage size and reruns the filter hook over the accumulated stage output.
- `output_records` selected rows once the target is reached, or the best partial output when bounded exhaustion is configured with `return_partial`.
- `return_partial`, the stage completes empty and downstream stages are skipped.
🧪 Testing
- `.venv/bin/ruff check --fix .`
- `.venv/bin/ruff format .`
- `.venv/bin/pytest packages/data-designer/tests/interface/test_composite_workflow.py -q`
- `PATH=/Users/amanoel/.local/bin:$PATH make check-fern-docs-locally`
✅ Checklist