Understanding MOAR Results

What MOAR outputs and how to interpret the results.

Python API Results

frame.optimize() returns an optimized Frame, ready to run with .collect() or .write_json(). The full search results are available on it as .search_results, a MOARResult object.

MOARResult

optimized = frame.optimize(eval_fn=evaluate, metric_key="score")
rows = optimized.collect()   # run the optimized pipeline

results = optimized.search_results
results.best()      # OptimizedPipeline with highest accuracy on the frontier
results.cheapest()  # OptimizedPipeline with lowest cost on the frontier
results.frontier    # list[OptimizedPipeline] — all Pareto-optimal solutions
results.to_df()     # pandas DataFrame of all explored plans

Method / Property   Return Type               Description
best()              OptimizedPipeline         The frontier solution with the highest accuracy
cheapest()          OptimizedPipeline         The frontier solution with the lowest cost
frontier            list[OptimizedPipeline]   All Pareto-optimal solutions, sorted by cost
to_df()             pandas.DataFrame          All explored plans with cost, accuracy, and metadata
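
To make "Pareto-optimal" concrete, here is a minimal sketch of how a cost-accuracy frontier can be derived from a set of explored plans. The `Plan` class and `pareto_frontier` function are hypothetical stand-ins written for this illustration; they are not part of DocETL, and MOAR's own frontier computation may differ.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    """Hypothetical stand-in for an explored plan; not a DocETL class."""
    cost: float
    accuracy: float


def pareto_frontier(plans):
    """Return the plans that no other plan dominates, sorted by cost.

    A plan is dominated when another plan is at least as cheap and at
    least as accurate, and strictly better on one of the two.
    """
    frontier = []
    best_accuracy = float("-inf")
    # Visit plans from cheapest to most expensive; on a cost tie,
    # visit the more accurate plan first.
    for plan in sorted(plans, key=lambda p: (p.cost, -p.accuracy)):
        # A plan is kept only if it beats every cheaper plan on accuracy.
        if plan.accuracy > best_accuracy:
            frontier.append(plan)
            best_accuracy = plan.accuracy
    return frontier


# Illustrative numbers only.
plans = [Plan(0.05, 0.92), Plan(0.06, 0.90), Plan(0.08, 0.95), Plan(0.10, 0.95)]
frontier = pareto_frontier(plans)
for plan in frontier:
    print(f"cost=${plan.cost:.2f} accuracy={plan.accuracy:.2f}")
```

In this example the plan at cost 0.06 is dropped because a cheaper plan is more accurate, and the plan at cost 0.10 is dropped because a cheaper plan matches its accuracy.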

OptimizedPipeline

Each result on the frontier is an OptimizedPipeline that you can inspect and run directly:

best = optimized.search_results.best()

# Inspect
print(best.cost)        # Estimated cost per run
print(best.accuracy)    # Evaluation metric score
print(best.yaml_path)   # Path to the optimized YAML file
print(best.on_frontier) # True if on the Pareto frontier

# Run
best.run()              # Execute the optimized pipeline

# Access the underlying DSLRunner
best.pipeline           # DSLRunner instance

Property / Method   Type        Description
pipeline            DSLRunner   The underlying pipeline runner
cost                float       Estimated cost per run
accuracy            float       Evaluation metric score
yaml_path           str         Path to the optimized YAML configuration
on_frontier         bool        Whether this plan is on the Pareto frontier
run()               float       Execute the pipeline; returns execution cost

Working with Results

# Choose based on your priorities
optimized = frame.optimize(eval_fn=evaluate, metric_key="score")
results = optimized.search_results

# Highest accuracy
best = results.best()
print(f"Best accuracy: {best.accuracy}, cost: ${best.cost:.4f}")
best.run()

# Lowest cost
cheap = results.cheapest()
print(f"Cheapest cost: ${cheap.cost:.4f}, accuracy: {cheap.accuracy}")

# Explore the full frontier
for plan in results.frontier:
    print(f"Cost: ${plan.cost:.4f}, Accuracy: {plan.accuracy}")

# Analyze all explored configurations as a DataFrame
df = results.to_df()
print(df[["cost", "accuracy", "on_frontier"]].sort_values("accuracy", ascending=False))
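
When neither best() nor cheapest() fits, a common middle ground is to pick the most accurate frontier plan that stays under a cost budget. The sketch below assumes only that each frontier entry exposes `.cost` and `.accuracy`, as described above. `best_within_budget` is a hypothetical helper, and `SimpleNamespace` objects stand in for real OptimizedPipeline instances so the example runs on its own.

```python
from types import SimpleNamespace


def best_within_budget(frontier, max_cost):
    """Return the most accurate plan whose cost is <= max_cost, or None."""
    affordable = [plan for plan in frontier if plan.cost <= max_cost]
    if not affordable:
        return None
    return max(affordable, key=lambda plan: plan.accuracy)


# Stand-ins for results.frontier; the numbers are illustrative only.
frontier = [
    SimpleNamespace(cost=0.05, accuracy=0.92),
    SimpleNamespace(cost=0.08, accuracy=0.95),
]

choice = best_within_budget(frontier, max_cost=0.06)
if choice is None:
    print("No frontier plan fits the budget")
else:
    print(f"Chosen plan: cost ${choice.cost:.4f}, accuracy {choice.accuracy}")
```

With real results, passing `results.frontier` in place of the stand-in list should work the same way, provided the attributes match.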

CLI Output Files

After running docetl build pipeline.yaml, you'll find several files in your save_dir:

  • experiment_summary.json — High-level summary
  • pareto_frontier.json — Optimal solutions
  • evaluation_metrics.json — Detailed evaluation results
  • pipeline_*.yaml — Optimized pipeline configurations

experiment_summary.json

High-level summary of the optimization run:

{
  "optimizer": "moar",
  "input_pipeline": "pipeline.yaml",
  "rewrite_agent_model": "gpt-5.1",
  "max_iterations": 40,
  "save_dir": "results/moar_optimization",
  "dataset": "transcripts",
  "start_time": "2024-01-15T10:30:00",
  "end_time": "2024-01-15T11:15:00",
  "duration_seconds": 2700,
  "num_best_nodes": 5,
  "total_nodes_explored": 120,
  "total_search_cost": 15.50
}

Key Metrics

  • num_best_nodes: Number of solutions on the Pareto frontier
  • total_nodes_explored: Total configurations tested
  • total_search_cost: Total cost of the optimization search
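
A few derived figures can help put these metrics in context. This is a minimal sketch that uses only the field names and values from the sample summary above; the derived quantities (`minutes`, `cost_per_node`, `frontier_share`) are names chosen for this example and are not written by MOAR.

```python
import json

# In practice, load the file from your save_dir, for example:
#   summary = json.load(open("results/moar_optimization/experiment_summary.json"))
# Here the relevant fields of the sample above are inlined so the sketch runs on its own.
summary = json.loads("""
{
  "duration_seconds": 2700,
  "num_best_nodes": 5,
  "total_nodes_explored": 120,
  "total_search_cost": 15.50
}
""")

minutes = summary["duration_seconds"] / 60
cost_per_node = summary["total_search_cost"] / summary["total_nodes_explored"]
frontier_share = summary["num_best_nodes"] / summary["total_nodes_explored"]

print(f"Search took {minutes:.0f} minutes")
print(f"Average search cost per explored configuration: ${cost_per_node:.4f}")
print(f"Share of explored configurations on the frontier: {frontier_share:.1%}")
```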

pareto_frontier.json

List of Pareto-optimal solutions (the cost-accuracy frontier):

[
  {
    "node_id": 5,
    "yaml_path": "results/moar_optimization/pipeline_5.yaml",
    "cost": 0.05,
    "accuracy": 0.92
  },
  {
    "node_id": 12,
    "yaml_path": "results/moar_optimization/pipeline_12.yaml",
    "cost": 0.08,
    "accuracy": 0.95
  }
]

Choosing a Solution

Review the Pareto frontier to find solutions that match your priorities:

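  • Low cost priority: Choose the lowest-cost solution whose accuracy is still acceptable
  • High accuracy priority: Choose the highest-accuracy solution that fits your budget
  • Balanced: Choose a solution from the middle of the frontier, trading some accuracy for lower cost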

Each solution includes a yaml_path pointing to the optimized pipeline configuration.
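
The selection can also be scripted against pareto_frontier.json. The sketch below picks the cheapest entry that meets an accuracy floor, using the field names from the sample file above. `cheapest_meeting` is a hypothetical helper, and the 0.93 threshold is an arbitrary example value.

```python
import json

# In practice, read the file from your save_dir; the sample content
# shown above is inlined here so the sketch runs on its own.
frontier_json = """
[
  {"node_id": 5, "yaml_path": "results/moar_optimization/pipeline_5.yaml",
   "cost": 0.05, "accuracy": 0.92},
  {"node_id": 12, "yaml_path": "results/moar_optimization/pipeline_12.yaml",
   "cost": 0.08, "accuracy": 0.95}
]
"""


def cheapest_meeting(entries, min_accuracy):
    """Return the lowest-cost entry with accuracy >= min_accuracy, or None."""
    candidates = [e for e in entries if e["accuracy"] >= min_accuracy]
    if not candidates:
        return None
    return min(candidates, key=lambda e: e["cost"])


entries = json.loads(frontier_json)
pick = cheapest_meeting(entries, min_accuracy=0.93)
if pick is not None:
    print(f"Node {pick['node_id']}: {pick['yaml_path']}")
```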

evaluation_metrics.json

Detailed evaluation metrics for every pipeline configuration tested during optimization, not only those on the Pareto frontier.

Pipeline Configurations

Each solution on the Pareto frontier has a corresponding YAML file (e.g., pipeline_5.yaml) containing the optimized pipeline configuration. You can:

  1. Review the changes MOAR made
  2. Test the pipeline on your full dataset
  3. Use it in production
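
One way to review the changes is a plain text diff between your original pipeline file and an optimized one. The sketch below uses only Python's standard library. `diff_pipelines` is a hypothetical helper, and the file contents in the demo are placeholders, not real MOAR output.

```python
import difflib
import tempfile
from pathlib import Path


def diff_pipelines(original_path, optimized_path):
    """Return a unified diff between two pipeline YAML files as a string."""
    original = Path(original_path).read_text().splitlines(keepends=True)
    optimized = Path(optimized_path).read_text().splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            original,
            optimized,
            fromfile=str(original_path),
            tofile=str(optimized_path),
        )
    )


# Demo with placeholder files; point the function at your own
# pipeline.yaml and pipeline_*.yaml in practice.
with tempfile.TemporaryDirectory() as tmp:
    before = Path(tmp) / "pipeline.yaml"
    after = Path(tmp) / "pipeline_5.yaml"
    before.write_text("placeholder_key: original\n")
    after.write_text("placeholder_key: optimized\n")
    diff_text = diff_pipelines(before, after)

print(diff_text)
```

A line-based diff is format-agnostic, so it also flags reordered keys and whitespace changes that do not affect behavior.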

Next Steps

After reviewing the results:

  1. Choose a solution — Use optimized.search_results.best() / .cheapest() in Python, or review pareto_frontier.json from the CLI
  2. Run the chosen pipeline — Call .run() on the OptimizedPipeline, or run the YAML with docetl run
  3. Integrate into production — Use the optimized configuration

Success

You now have multiple optimized pipeline options to choose from, each representing a different point on the cost-accuracy trade-off curve.