@donbr · Created December 11, 2025
Deep Research Multi-Agent System Documentation

Based on verified PydanticAI documentation retrieved via the qdrant-docs MCP server, here's a comprehensive analysis of your deep research code:


Overview

This code implements a programmatic multi-agent workflow pattern—one of the four complexity levels supported by PydanticAI for multi-agent applications. The system orchestrates three specialized agents to perform deep research through a plan-execute-analyze pipeline.


Core Components

1. Structured Output Models

```python
from typing import Annotated

from annotated_types import MaxLen
from pydantic import BaseModel, ConfigDict


class WebSearchStep(BaseModel):
    """A step that performs a web search."""

    search_terms: str


class DeepResearchPlan(BaseModel, **ConfigDict(use_attribute_docstrings=True)):
    """A structured plan for deep research."""

    executive_summary: str
    """A summary of the research plan."""
    web_search_steps: Annotated[list[WebSearchStep], MaxLen(5)]
    """A list of web search steps to perform."""
    analysis_instructions: str
    """The analysis step to perform after all web search steps are completed."""
```

PydanticAI Documentation Context:

  • The output_type parameter on an Agent enforces structured outputs via Pydantic validation
  • ConfigDict(use_attribute_docstrings=True) extracts field descriptions from docstrings beneath each field—these become part of the schema sent to the LLM, improving output accuracy
  • Annotated[..., MaxLen(5)] uses annotated-types constraints to limit the list length, which is serialized into the JSON schema for the model
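The way constraint metadata travels inside `Annotated[...]` can be seen with the standard `typing` helpers. The `MaxLen` class below is a stdlib-only stand-in for the real one from the third-party `annotated-types` package, so this sketch only illustrates the mechanism, not Pydantic's actual validation:

```python
from typing import Annotated, get_args


# Stand-in for annotated_types.MaxLen, so this example needs only the
# standard library. The real class is what Pydantic serializes as maxItems.
class MaxLen:
    def __init__(self, max_length: int) -> None:
        self.max_length = max_length


# The constraint object rides along with the base type inside Annotated.
WebSearchSteps = Annotated[list[str], MaxLen(5)]

base_type, constraint = get_args(WebSearchSteps)
print(base_type)             # list[str]
print(constraint.max_length) # 5
```

Pydantic performs essentially this introspection when building the JSON schema, turning the `MaxLen(5)` metadata into a `maxItems: 5` constraint on the array.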

2. Planning Agent

```python
from pydantic_ai import Agent

plan_agent = Agent(
    'anthropic:claude-sonnet-4-5',
    instructions="Analyze the user's query and design a plan for deep research...",
    output_type=DeepResearchPlan,
    name='abstract_plan_agent',
)
```

Key Features:

  • Model Specification: Uses the provider:model format (anthropic:claude-sonnet-4-5)
  • instructions: Equivalent to system prompt—guides the agent's behavior
  • output_type=DeepResearchPlan: Enforces structured output; the agent must return a validated Pydantic model
  • name: Used for observability in Logfire traces

3. Search Agent with Built-in Tools

```python
search_agent = Agent(
    'google-vertex:gemini-2.5-flash',
    instructions='Perform a web search for the given terms...',
    builtin_tools=[WebSearchTool()],
    name='search_agent',
)
```

Built-in Tools Documentation:

  • WebSearchTool() is a built-in tool that leverages model-native web search capabilities
  • According to PydanticAI docs: "Built-in tools are passed to the model as part of the ModelRequestParameters" and are handled natively by supported providers
  • Unlike function tools, built-in tools are implemented by the model provider itself
  • The builtin_tools parameter accepts tools that inherit from AbstractBuiltinTool

4. Analysis Agent with Dependency Injection

```python
analysis_agent = Agent(
    'anthropic:claude-sonnet-4-5',
    deps_type=AbstractAgent,
    instructions="""Analyze the research from the previous steps...""",
    name='analysis_agent',
)
```

Dependencies System:

  • deps_type=AbstractAgent: Declares the type of dependency this agent expects
  • Dependencies are provided at runtime via agent.run(..., deps=search_agent)
  • This enables agent delegation—the analysis agent can call the search agent through a tool

5. Tool Definition with RunContext

```python
@analysis_agent.tool
async def extra_search(ctx: RunContext[AbstractAgent], query: str) -> str:
    """Perform an extra search for the given query."""
    result = await ctx.deps.run(query)
    return result.output
```

Function Tools Documentation:

  • @agent.tool decorator registers a tool that needs access to the agent context (RunContext)
  • RunContext[AbstractAgent] provides type-safe access to the injected dependency
  • ctx.deps retrieves the dependency (in this case, search_agent) passed during agent.run()
  • The tool's docstring becomes the tool description sent to the model
  • Tools can be async and return any JSON-serializable value
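The "docstring becomes the tool description" behaviour is easy to picture with the standard `inspect` module, which is roughly how frameworks harvest tool metadata. This is a sketch of the idea, not PydanticAI's actual implementation:

```python
import inspect


async def extra_search(query: str) -> str:
    """Perform an extra search for the given query."""
    ...


# A framework can build the tool spec from the function object itself:
description = inspect.getdoc(extra_search)
params = list(inspect.signature(extra_search).parameters)
print(description)  # Perform an extra search for the given query.
print(params)       # ['query']
```

The same signature inspection yields the parameter names and type hints that end up in the JSON schema the model sees for each tool.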

6. Observability with Logfire

```python
import logfire

logfire.configure()
logfire.instrument_pydantic_ai()

@logfire.instrument
async def deep_research(query: str) -> str:
    ...
```

Logfire Integration:

  • logfire.configure(): Initializes the Logfire client
  • logfire.instrument_pydantic_ai(): Automatically instruments all PydanticAI agents for tracing
  • @logfire.instrument: Decorator to create custom spans around functions
  • Provides visibility into agent runs, tool calls, token usage, and latency
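What a span-creating decorator does can be approximated in a few lines. The `instrument` function below is a toy stand-in that only times the call and prints a line; the real `@logfire.instrument` creates a proper tracing span with arguments, context propagation, and export to Logfire:

```python
import functools
import time


def instrument(fn):
    """Toy stand-in for @logfire.instrument: wraps the call in a timed 'span'."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f"span={fn.__name__} duration={elapsed:.4f}s")
    return wrapper


@instrument
def deep_research(query: str) -> str:
    return f"report on {query}"


print(deep_research("agents"))  # span line, then: report on agents
```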

7. format_as_xml Utility

```python
analysis_result = await analysis_agent.run(
    format_as_xml({
        'query': query,
        'search_results': search_results,
        'instructions': plan.analysis_instructions,
    }),
    deps=search_agent,
)
```

Documentation:

  • format_as_xml() formats Python objects as XML for model consumption
  • According to PydanticAI: "LLMs often find it easier to read semi-structured data (e.g. examples) as XML, rather than JSON"
  • Supports: str, bytes, bool, int, float, date, datetime, Mapping, Iterable, dataclass, and BaseModel
  • Parameters include root_tag, item_tag, indent, and include_field_info for customization
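The idea behind such a helper can be shown with a simplified stdlib-only version that handles dicts, lists, and scalars. This is a sketch of the concept, not PydanticAI's `format_as_xml`, which additionally supports dataclasses, `BaseModel`, dates, and the customization parameters listed above:

```python
def to_xml(value, tag: str = "examples", indent: str = "  ", level: int = 0) -> str:
    """Simplified format_as_xml-style helper: dicts become nested tags,
    list items get an <item> tag, scalars are inlined."""
    pad = indent * level
    if isinstance(value, dict):
        inner = "\n".join(to_xml(v, k, indent, level + 1) for k, v in value.items())
        return f"{pad}<{tag}>\n{inner}\n{pad}</{tag}>"
    if isinstance(value, (list, tuple)):
        inner = "\n".join(to_xml(v, "item", indent, level + 1) for v in value)
        return f"{pad}<{tag}>\n{inner}\n{pad}</{tag}>"
    return f"{pad}<{tag}>{value}</{tag}>"


print(to_xml({"query": "fusion energy", "results": ["a", "b"]}))
```

For the dict above this produces an `<examples>` root containing `<query>fusion energy</query>` and a `<results>` element with two `<item>` children, which is the kind of semi-structured XML the documentation says models read more easily than JSON.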

Execution Flow

```python
async def deep_research(query: str) -> str:
    # 1. Planning Phase
    result = await plan_agent.run(query)
    plan = result.output  # DeepResearchPlan instance

    # 2. Parallel Search Execution
    async with asyncio.TaskGroup() as tg:
        tasks = [
            tg.create_task(search_agent.run(step.search_terms))
            for step in plan.web_search_steps
        ]
    search_results = [task.result().output for task in tasks]

    # 3. Analysis Phase (with delegation capability)
    analysis_result = await analysis_agent.run(
        format_as_xml({...}),
        deps=search_agent,  # Enables extra_search tool
    )
    return analysis_result.output
```

Pattern: Programmatic Agent Hand-off

  • This implements what PydanticAI calls "programmatic agent hand-off"—one agent runs, then application code orchestrates calling another agent
  • The TaskGroup enables parallel execution of search steps for efficiency
  • The analysis agent can perform additional searches via the extra_search tool if needed

Architecture Summary

| Component | Model | Role | Special Features |
| --- | --- | --- | --- |
| `plan_agent` | Claude Sonnet 4.5 | Research planning | Structured output (`DeepResearchPlan`) |
| `search_agent` | Gemini 2.5 Flash | Web information retrieval | `WebSearchTool` built-in |
| `analysis_agent` | Claude Sonnet 4.5 | Synthesis & analysis | Dependency injection, delegation tool |

Best Practices Demonstrated

  1. Type Safety: Full type hints with deps_type, output_type, and RunContext[T]
  2. Observability: Logfire instrumentation for debugging and monitoring
  3. Parallel Execution: asyncio.TaskGroup for concurrent search operations
  4. Agent Delegation: Analysis agent can request additional searches through dependency injection
  5. Structured Outputs: Pydantic models with docstring-based field descriptions

Last Verified: 2025-12-10 17:44 PST
Documentation Source: qdrant-docs (FastMCP) - PydanticAI documentation collection
Verification Tools Used: qdrant-docs:search_docs, mcp-server-time:get_current_time

