Understanding MOAR Results

What MOAR outputs and how to interpret the results.

Python API Results

frame.optimize() returns an optimized Frame, ready to run with .collect() or .write_json(). The full search results are available on it as .search_results, a MOARResult object.

MOARResult

optimized = frame.optimize(eval_fn=evaluate, metric_key="score")
rows = optimized.collect()   # run the optimized pipeline

results = optimized.search_results
results.best()      # OptimizedPipeline with highest accuracy on the frontier
results.cheapest()  # OptimizedPipeline with lowest cost on the frontier
results.frontier    # list[OptimizedPipeline] — all Pareto-optimal solutions
results.to_df()     # pandas DataFrame of all explored plans

Method / Property	Return Type	Description
`best()`	`OptimizedPipeline`	The frontier solution with the highest accuracy
`cheapest()`	`OptimizedPipeline`	The frontier solution with the lowest cost
`frontier`	`list[OptimizedPipeline]`	All Pareto-optimal solutions, sorted by cost
`to_df()`	`pandas.DataFrame`	DataFrame of all explored plans with cost, accuracy, and metadata
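
To make "Pareto-optimal" concrete: a plan is on the frontier when no other explored plan is at least as cheap and at least as accurate while being strictly better on one of the two. The sketch below illustrates that definition on made-up numbers. The `Plan` tuple and the `pareto_frontier` function are hypothetical names used only for illustration; they are not part of MOAR, and this is not necessarily how MOAR computes its frontier internally.

```python
from typing import NamedTuple


class Plan(NamedTuple):
    """Hypothetical stand-in for an explored plan; not a MOAR class."""
    name: str
    cost: float
    accuracy: float


def pareto_frontier(plans: list[Plan]) -> list[Plan]:
    """Return the plans that no other plan dominates, sorted by cost."""
    frontier: list[Plan] = []
    best_accuracy = float("-inf")
    # Walk from cheapest to most expensive; on cost ties, visit the most accurate first.
    for plan in sorted(plans, key=lambda p: (p.cost, -p.accuracy)):
        # A plan stays only if it is more accurate than every plan visited so far.
        # Exact duplicates (same cost and accuracy) are therefore kept once.
        if plan.accuracy > best_accuracy:
            frontier.append(plan)
            best_accuracy = plan.accuracy
    return frontier


# Made-up numbers for illustration only.
plans = [
    Plan("a", cost=0.05, accuracy=0.92),
    Plan("b", cost=0.06, accuracy=0.90),  # dominated by "a": costs more, scores lower
    Plan("c", cost=0.08, accuracy=0.95),
    Plan("d", cost=0.10, accuracy=0.95),  # dominated by "c": same accuracy, costs more
]

frontier = pareto_frontier(plans)
print([p.name for p in frontier])
```

Because the result is sorted by cost and every later entry must beat the earlier ones on accuracy, the first entry is the cheapest plan and the last is the most accurate, which mirrors what `cheapest()` and `best()` are described as returning.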

OptimizedPipeline

Each result on the frontier is an OptimizedPipeline that you can inspect and run directly:

best = optimized.search_results.best()

# Inspect
print(best.cost)        # Estimated cost per run
print(best.accuracy)    # Evaluation metric score
print(best.yaml_path)   # Path to the optimized YAML file
print(best.on_frontier) # True if on the Pareto frontier

# Run
best.run()              # Execute the optimized pipeline

# Access the underlying DSLRunner
best.pipeline           # DSLRunner instance

Property / Method	Type	Description
`pipeline`	`DSLRunner`	The underlying pipeline runner
`cost`	`float`	Estimated cost per run
`accuracy`	`float`	Evaluation metric score
`yaml_path`	`str`	Path to the optimized YAML configuration
`on_frontier`	`bool`	Whether this plan is on the Pareto frontier
`run()`	`float`	Execute the pipeline; returns execution cost

Working with Results

# Choose based on your priorities
optimized = frame.optimize(eval_fn=evaluate, metric_key="score")
results = optimized.search_results

# Highest accuracy
best = results.best()
print(f"Best accuracy: {best.accuracy}, cost: ${best.cost:.4f}")
best.run()

# Lowest cost
cheap = results.cheapest()
print(f"Cheapest cost: ${cheap.cost:.4f}, accuracy: {cheap.accuracy}")

# Explore the full frontier
for plan in results.frontier:
    print(f"Cost: ${plan.cost:.4f}, Accuracy: {plan.accuracy}")

# Analyze all explored configurations as a DataFrame
df = results.to_df()
print(df[["cost", "accuracy", "on_frontier"]].sort_values("accuracy", ascending=False))
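
A common middle ground between best() and cheapest() is "the most accurate plan I can afford". The sketch below shows one way to express that. It assumes only that each frontier entry exposes the cost and accuracy attributes documented above. `FrontierPlan` and `best_within_budget` are hypothetical names introduced for this example, not part of the MOAR API, and the sample values are illustrative.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class FrontierPlan:
    """Hypothetical stand-in exposing the same `cost` and `accuracy` attributes."""
    cost: float
    accuracy: float


def best_within_budget(
    frontier: Sequence[FrontierPlan], max_cost: float
) -> Optional[FrontierPlan]:
    """Return the most accurate plan costing at most `max_cost`, or None."""
    affordable = [plan for plan in frontier if plan.cost <= max_cost]
    if not affordable:
        return None  # nothing on the frontier fits the budget
    return max(affordable, key=lambda plan: plan.accuracy)


# Illustrative values; with real results you would pass `results.frontier`.
frontier = [
    FrontierPlan(cost=0.05, accuracy=0.92),
    FrontierPlan(cost=0.08, accuracy=0.95),
]

choice = best_within_budget(frontier, max_cost=0.06)
if choice is None:
    print("No plan fits the budget")
else:
    print(f"Cost: ${choice.cost:.4f}, Accuracy: {choice.accuracy}")
```

Returning None instead of raising keeps the "no affordable plan" case explicit, so the caller can decide whether to raise the budget or fall back to the cheapest plan.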

CLI Output Files

After running docetl build pipeline.yaml, you'll find several files in your save_dir:

experiment_summary.json — High-level summary
pareto_frontier.json — Optimal solutions
evaluation_metrics.json — Detailed evaluation results
pipeline_*.yaml — Optimized pipeline configurations

experiment_summary.json

High-level summary of the optimization run:

{
  "optimizer": "moar",
  "input_pipeline": "pipeline.yaml",
  "rewrite_agent_model": "gpt-5.1",
  "max_iterations": 40,
  "save_dir": "results/moar_optimization",
  "dataset": "transcripts",
  "start_time": "2024-01-15T10:30:00",
  "end_time": "2024-01-15T11:15:00",
  "duration_seconds": 2700,
  "num_best_nodes": 5,
  "total_nodes_explored": 120,
  "total_search_cost": 15.50
}

Key Metrics

num_best_nodes: Number of solutions on the Pareto frontier
total_nodes_explored: Total configurations tested
total_search_cost: Total cost of the optimization search
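
These fields combine into a few quick sanity checks on a run. The sketch below parses the example summary shown above from an inline string and derives three ratios. The derived names (`avg_cost_per_node`, `frontier_share`, `duration_minutes`) are introduced here for illustration and do not appear in the file itself.

```python
import json

# The example summary from above, inlined so the snippet runs on its own.
# In practice you would read experiment_summary.json from your save_dir.
summary_text = """
{
  "optimizer": "moar",
  "input_pipeline": "pipeline.yaml",
  "rewrite_agent_model": "gpt-5.1",
  "max_iterations": 40,
  "save_dir": "results/moar_optimization",
  "dataset": "transcripts",
  "start_time": "2024-01-15T10:30:00",
  "end_time": "2024-01-15T11:15:00",
  "duration_seconds": 2700,
  "num_best_nodes": 5,
  "total_nodes_explored": 120,
  "total_search_cost": 15.50
}
"""
summary = json.loads(summary_text)

# Average search spend per explored configuration.
avg_cost_per_node = summary["total_search_cost"] / summary["total_nodes_explored"]
# Fraction of explored configurations that ended up on the Pareto frontier.
frontier_share = summary["num_best_nodes"] / summary["total_nodes_explored"]
# Wall-clock duration of the search in minutes.
duration_minutes = summary["duration_seconds"] / 60

print(f"Average search cost per node: {avg_cost_per_node:.4f}")
print(f"Share of nodes on the frontier: {frontier_share:.1%}")
print(f"Search duration: {duration_minutes:.0f} minutes")
```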

pareto_frontier.json

List of Pareto-optimal solutions (the cost-accuracy frontier):

[
  {
    "node_id": 5,
    "yaml_path": "results/moar_optimization/pipeline_5.yaml",
    "cost": 0.05,
    "accuracy": 0.92
  },
  {
    "node_id": 12,
    "yaml_path": "results/moar_optimization/pipeline_12.yaml",
    "cost": 0.08,
    "accuracy": 0.95
  }
]

Choosing a Solution

Review the Pareto frontier to find solutions that match your priorities:

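Low cost priority: Choose the cheapest solution whose accuracy is still acceptable for your task
High accuracy priority: Choose the most accurate solution whose cost fits your budget
Balanced: Choose a mid-frontier solution, at the point where moving to the next more expensive plan buys little additional accuracy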

Each solution includes a yaml_path pointing to the optimized pipeline configuration.
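
When weighing neighbouring solutions, it can help to look at how much extra cost each step along the frontier buys in accuracy. The sketch below does this for the example pareto_frontier.json shown above, inlined as a string. The `steps` list and its keys are hypothetical names for this example, and the ratio is only a rough heuristic, not something MOAR reports.

```python
import json

# The example frontier from above, inlined so the snippet runs on its own.
# In practice you would read pareto_frontier.json from your save_dir.
frontier_text = """
[
  {"node_id": 5, "yaml_path": "results/moar_optimization/pipeline_5.yaml",
   "cost": 0.05, "accuracy": 0.92},
  {"node_id": 12, "yaml_path": "results/moar_optimization/pipeline_12.yaml",
   "cost": 0.08, "accuracy": 0.95}
]
"""
frontier = sorted(json.loads(frontier_text), key=lambda entry: entry["cost"])

# Compare each solution with the next more expensive one.
steps = []
for cheaper, pricier in zip(frontier, frontier[1:]):
    extra_cost = pricier["cost"] - cheaper["cost"]
    extra_accuracy = pricier["accuracy"] - cheaper["accuracy"]
    steps.append({
        "from_node": cheaper["node_id"],
        "to_node": pricier["node_id"],
        "extra_cost": extra_cost,
        "extra_accuracy": extra_accuracy,
        # Extra cost paid per unit of accuracy gained; guard against a zero gain.
        "cost_per_accuracy": extra_cost / extra_accuracy if extra_accuracy > 0 else float("inf"),
    })

for step in steps:
    print(
        f"node {step['from_node']} -> node {step['to_node']}: "
        f"+{step['extra_cost']:.4f} cost for +{step['extra_accuracy']:.4f} accuracy"
    )
```

A step whose ratio is much larger than the previous one is a natural place to stop if you want a balanced solution.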

evaluation_metrics.json

Detailed evaluation metrics for every pipeline configuration explored during optimization, not only those on the Pareto frontier.

Pipeline Configurations

Each solution on the Pareto frontier has a corresponding YAML file (e.g., pipeline_5.yaml) containing the optimized pipeline configuration. You can:

Review the changes MOAR made
Test the pipeline on your full dataset
Use it in production
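
One simple way to review the changes is a text diff between your input pipeline and an optimized one. The sketch below uses Python's standard difflib module on two inline strings. The YAML content, including the operation name and the model-a / model-b values, is made up for illustration and does not reflect a real MOAR rewrite; in practice you would read pipeline.yaml and the optimized file from disk.

```python
import difflib

# Made-up pipeline snippets, standing in for the contents of the two files.
original_yaml = """\
operations:
  - name: summarize
    type: map
    model: model-a
"""
optimized_yaml = """\
operations:
  - name: summarize
    type: map
    model: model-b
"""

# unified_diff yields header lines, then lines prefixed with "-", "+", or " ".
diff = list(
    difflib.unified_diff(
        original_yaml.splitlines(),
        optimized_yaml.splitlines(),
        fromfile="pipeline.yaml",
        tofile="pipeline_5.yaml",
        lineterm="",
    )
)
print("\n".join(diff))
```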

Next Steps

After reviewing the results:

Choose a solution: Use optimized.search_results.best() or .cheapest() in Python, or review pareto_frontier.json after a CLI run
Run the chosen pipeline: Call .run() on the OptimizedPipeline, or run its YAML file with docetl run
Integrate into production: Use the chosen pipeline_*.yaml as your production configuration

Success

You now have multiple optimized pipeline options to choose from, each representing a different point on the cost-accuracy trade-off curve.