# Fallback Models
Fallback models provide automatic failover to alternative models when API errors or content errors occur with your primary model. This feature improves pipeline reliability and reduces failures due to temporary API issues or model unavailability.
## Overview
When configured, DocETL will automatically try fallback models in sequence if:
- The primary model encounters an API error (rate limits, service unavailability, content warning errors, etc.)
- The primary model returns invalid content that cannot be parsed
- The primary model fails to respond within expected timeframes
This ensures your pipelines continue running even when individual models experience issues.
## Configuration
Fallback models are configured at the global level in your pipeline YAML file. You can configure separate fallback models for:
- **Completion/chat operations**: used by `map`, `reduce`, `resolve`, `filter`, and other LLM-powered operations
- **Embedding operations**: used by operations that generate embeddings (e.g., `cluster`, `rank`)
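Conceptually, the two fallback lists are kept separate, and the one that applies depends on the kind of call an operation makes. A rough sketch of that routing (the operation groupings and the `pick_fallbacks` helper are illustrative, not part of DocETL's API):

```python
# Illustrative groupings; real operation-to-call-type mapping lives inside DocETL.
COMPLETION_OPS = {"map", "reduce", "resolve", "filter"}
EMBEDDING_OPS = {"cluster", "rank"}

def pick_fallbacks(op_type, fallback_models, fallback_embedding_models):
    """Choose which fallback list applies to an operation type."""
    if op_type in EMBEDDING_OPS:
        return fallback_embedding_models
    return fallback_models

fb = pick_fallbacks("cluster", ["gpt-3.5-turbo"], ["text-embedding-3-small"])
print(fb)  # ['text-embedding-3-small']
```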
### Basic Configuration
The simplest way to configure fallback models is to provide a list of model names:
```yaml
# Default language model for all operations
default_model: gpt-4o-mini

# Fallback models for completion/chat operations
fallback_models:
  - gpt-3.5-turbo
  - claude-3-haiku-20240307

# Fallback models for embedding operations
fallback_embedding_models:
  - text-embedding-3-small
  - text-embedding-ada-002
```
Models will be tried in the order specified. If the primary model fails, DocETL will automatically try the first fallback model, then the second, and so on.
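The resulting try order is simply the primary model followed by the fallbacks in the order they appear. A minimal sketch (the `try_order` helper is illustrative, not a DocETL function; model names mirror the configuration above):

```python
def try_order(primary: str, fallbacks: list[str]) -> list[str]:
    """Return the sequence of models that would be attempted, in order."""
    return [primary] + fallbacks

# Mirrors the basic configuration above.
order = try_order("gpt-4o-mini", ["gpt-3.5-turbo", "claude-3-haiku-20240307"])
print(order)  # ['gpt-4o-mini', 'gpt-3.5-turbo', 'claude-3-haiku-20240307']
```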
### Advanced Configuration
For more control, you can specify additional LiteLLM parameters for each fallback model:
```yaml
default_model: gpt-4o-mini

# Fallback models with custom parameters
fallback_models:
  - model_name: gpt-3.5-turbo
    litellm_params:
      temperature: 0.0
      max_tokens: 2000
  - model_name: claude-3-haiku-20240307
    litellm_params:
      temperature: 0.0

# Fallback embedding models
fallback_embedding_models:
  - model_name: text-embedding-3-small
    litellm_params: {}
  - model_name: text-embedding-ada-002
    litellm_params: {}
```
## How It Works
When an operation uses a model (either the `default_model` or an operation-specific model), DocETL will:
1. **Try the primary model first**: the operation's specified model (or `default_model`) is attempted first
2. **Fall back on error**: if an API error or content parsing error occurs, DocETL automatically tries the first fallback model
3. **Continue through fallbacks**: if the first fallback also fails, the next fallback model in the sequence is tried
4. **Fail only if all models fail**: the operation fails only when the primary model and every fallback have failed
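The four steps above amount to a sequential retry loop. A minimal sketch of that behavior, assuming a generic `call_model` callable that raises on API or parsing errors (this is an illustration, not DocETL's actual implementation):

```python
class AllModelsFailedError(Exception):
    """Raised when the primary model and every fallback have failed."""

def call_with_fallbacks(call_model, primary, fallbacks, **kwargs):
    """Try the primary model, then each fallback in order.

    Succeeds on the first model whose call returns; fails only if
    every model in the sequence raises.
    """
    errors = []
    for model in [primary, *fallbacks]:
        try:
            return call_model(model, **kwargs)
        except Exception as exc:  # rate limit, content error, timeout, ...
            errors.append((model, exc))
    raise AllModelsFailedError(errors)

# Example: the primary model fails, the first fallback succeeds.
def flaky(model, prompt=None):
    if model == "gpt-4o-mini":
        raise RuntimeError("rate limited")
    return f"{model} ok"

result = call_with_fallbacks(flaky, "gpt-4o-mini", ["gpt-3.5-turbo"], prompt="hi")
print(result)  # gpt-3.5-turbo ok
```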
## Example: Complete Pipeline with Fallback Models
Here's a complete example showing how to use fallback models in a pipeline:
```yaml
datasets:
  example_dataset:
    type: file
    path: example_data/example.json

# Default language model for all operations unless overridden
default_model: gpt-4o-mini

# Fallback models for completion/chat operations
# Models will be tried in order when API errors or content errors occur
fallback_models:
  # First fallback model
  - model_name: gpt-3.5-turbo
    litellm_params:
      temperature: 0.0
  # Second fallback model
  - model_name: claude-3-haiku-20240307
    litellm_params:
      temperature: 0.0

# Fallback models for embedding operations
# Separate configuration for embedding model fallbacks
fallback_embedding_models:
  - model_name: text-embedding-3-small
    litellm_params: {}
  - model_name: text-embedding-ada-002
    litellm_params: {}

operations:
  - name: example_map
    type: map
    prompt: "Extract key information from: {{ input.contents }}"
    output:
      schema:
        extracted_info: "str"

pipeline:
  steps:
    - name: process_data
      input: example_dataset
      operations:
        - example_map
  output:
    type: file
    path: example_output.json
```