Parallel Map Operation
The Parallel Map operation applies multiple independent transformations to each item concurrently, maintaining a 1:1 input-to-output ratio. It differs from Map in that you define multiple prompts that run concurrently, without explicitly creating a DAG.
flowchart LR
d["doc"] --> p1["prompt A"] --> o["doc + field a + field b"]
d --> p2["prompt B"] --> o
Configuration
- Each prompt generates specific fields of the output; prompts run concurrently.
- The output schema must include all fields generated by the individual prompts; results are combined into a single output item per input.
Required Parameters
| Parameter | Description |
|---|---|
name |
A unique name for the operation |
type |
Must be set to "parallel_map" |
prompts |
A list of prompt configurations (see below) |
output |
Schema definition for the combined output from all prompts |
Each prompt configuration in the prompts list should contain:
prompt: The prompt template to use for the transformationoutput_keys: List of keys that this prompt will generatemodel(optional): The language model to use for this specific promptgleaning(optional): Advanced validation settings for this prompt (see Per-Prompt Gleaning section below)
Optional Parameters
| Parameter | Description | Default |
|---|---|---|
model |
The default language model to use | Falls back to default_model |
optimize |
Flag to enable operation optimization | True |
recursively_optimize |
Flag to enable recursive optimization | false |
sample |
Number of samples to use for the operation | Processes all data |
timeout |
Timeout for each LLM call in seconds | 120 |
max_retries_per_timeout |
Maximum number of retries per timeout | 2 |
litellm_completion_kwargs |
Additional parameters to pass to LiteLLM completion calls. | {} |
Why use Parallel Map instead of multiple Map operations?
- Concurrency: Prompts run in parallel.
- Simplified Configuration: Multiple transformations in a single operation.
- Unified Output: Results from all prompts are combined into a single output item.
Example: Processing Job Applications
- name: process_job_application
type: parallel_map
prompts:
- name: extract_skills
prompt: "Given the following resume: '{{ input.resume }}', list the top 5 relevant skills for a software engineering position."
output_keys:
- skills
gleaning:
num_rounds: 1
validation_prompt: |
Confirm the skills list contains **exactly** 5 distinct skills and each skill is one or two words long.
model: gpt-4o-mini
- name: calculate_experience
prompt: "Based on the work history in this resume: '{{ input.resume }}', calculate the total years of relevant experience for a software engineering role."
output_keys:
- years_experience
model: gpt-4o-mini
- name: evaluate_cultural_fit
prompt: "Analyze the following cover letter: '{{ input.cover_letter }}'. Rate the candidate's potential cultural fit on a scale of 1-10, where 10 is the highest."
output_keys:
- cultural_fit_score
model: gpt-4o-mini
output:
schema:
skills: list[string]
years_experience: float
cultural_fit_score: integer
import docetl
docetl.default_model = "gpt-4o-mini"
frame = docetl.read_json("applications.json")
frame = frame.parallel_map(
name="process_job_application",
prompts=[
{
"name": "extract_skills",
"prompt": "Given the following resume: '{{ input.resume }}', list the top 5 relevant skills for a software engineering position.",
"output_keys": ["skills"],
"gleaning": {
"num_rounds": 1,
"validation_prompt": "Confirm the skills list contains **exactly** 5 distinct skills and each skill is one or two words long.",
},
"model": "gpt-4o-mini",
},
{
"name": "calculate_experience",
"prompt": "Based on the work history in this resume: '{{ input.resume }}', calculate the total years of relevant experience for a software engineering role.",
"output_keys": ["years_experience"],
"model": "gpt-4o-mini",
},
{
"name": "evaluate_cultural_fit",
"prompt": "Analyze the following cover letter: '{{ input.cover_letter }}'. Rate the candidate's potential cultural fit on a scale of 1-10, where 10 is the highest.",
"output_keys": ["cultural_fit_score"],
"model": "gpt-4o-mini",
},
],
output={
"schema": {
"skills": "list[string]",
"years_experience": "float",
"cultural_fit_score": "integer",
}
},
)
rows = frame.collect()
Advanced Validation: Per-Prompt Gleaning
Each prompt can include its own gleaning configuration. Gleaning works as described in the operators overview but is scoped to the individual LLM call for that prompt, so each transformation can have its own validation logic and validator model. The structure of the gleaning block is identical:
gleaning:
num_rounds: 1 # maximum refinement iterations
validation_prompt: | # judge prompt appended to the chat thread
Ensure the extracted skills list contains at least 5 distinct items.
model: gpt-4o-mini # (optional) model for the validator LLM
# Set on an individual prompt config inside prompts=[...]
"gleaning": {
"num_rounds": 1, # maximum refinement iterations
"validation_prompt": "Ensure the extracted skills list contains at least 5 distinct items.",
"model": "gpt-4o-mini", # (optional) model for the validator LLM
}
Example with Per-Prompt Gleaning
- name: process_job_application
type: parallel_map
prompts:
- name: extract_skills
prompt: "Given the following resume: '{{ input.resume }}', list the top 5 relevant skills for a software engineering position."
output_keys:
- skills
gleaning:
num_rounds: 1
validation_prompt: |
Confirm the skills list contains **exactly** 5 distinct skills and each skill is one or two words long.
model: gpt-4o-mini
- name: calculate_experience
prompt: "Based on the work history in this resume: '{{ input.resume }}', calculate the total years of relevant experience for a software engineering role."
output_keys:
- years_experience
gleaning:
num_rounds: 2
validation_prompt: |
Verify that the years of experience is a non-negative number and round to one decimal place if necessary.
- name: evaluate_cultural_fit
prompt: "Analyze the following cover letter: '{{ input.cover_letter }}'. Rate the candidate's potential cultural fit on a scale of 1-10, where 10 is the highest."
output_keys:
- cultural_fit_score
model: gpt-4o-mini
output:
schema:
skills: list[string]
years_experience: float
cultural_fit_score: integer
frame = frame.parallel_map(
name="process_job_application",
prompts=[
{
"name": "extract_skills",
"prompt": "Given the following resume: '{{ input.resume }}', list the top 5 relevant skills for a software engineering position.",
"output_keys": ["skills"],
"gleaning": {
"num_rounds": 1,
"validation_prompt": "Confirm the skills list contains **exactly** 5 distinct skills and each skill is one or two words long.",
},
"model": "gpt-4o-mini",
},
{
"name": "calculate_experience",
"prompt": "Based on the work history in this resume: '{{ input.resume }}', calculate the total years of relevant experience for a software engineering role.",
"output_keys": ["years_experience"],
"gleaning": {
"num_rounds": 2,
"validation_prompt": "Verify that the years of experience is a non-negative number and round to one decimal place if necessary.",
},
},
{
"name": "evaluate_cultural_fit",
"prompt": "Analyze the following cover letter: '{{ input.cover_letter }}'. Rate the candidate's potential cultural fit on a scale of 1-10, where 10 is the highest.",
"output_keys": ["cultural_fit_score"],
"model": "gpt-4o-mini",
},
],
output={
"schema": {
"skills": "list[string]",
"years_experience": "float",
"cultural_fit_score": "integer",
}
},
)
In this configuration, only the extract_skills and calculate_experience prompts use gleaning. Each prompt's validator runs immediately after its own LLM call and before the overall outputs are merged.
Best Practices
- Independent Transformations: Prompts must be truly independent of each other; they cannot reference each other's outputs.
- Output Schema Alignment: The output schema must capture all fields generated by the individual prompts.
- Lightweight Validators: When using per-prompt gleaning, keep validation prompts concise to limit cost and latency overhead.