Module 06: Multi-Agent Systems — References


Primary References

Anthropic Multi-Agent Guidance

URL: https://docs.anthropic.com/en/docs/build-with-claude/agents
What it covers: Anthropic’s official documentation on building agent systems with Claude. Covers tool use, agentic loops, multi-agent networks, subagent delegation, and the Claude Code Agent tool. Includes guidance on parallelization, prompt injection safety, and minimal footprint principles.
Best for: The canonical source for how Anthropic thinks about agent architecture. Read this before building any Claude-based agent system.


LangGraph

URL: https://langchain-ai.github.io/langgraph
What it covers: Full documentation for LangGraph, including tutorials on building stateful agents, multi-agent collaboration, human-in-the-loop, persistence, and streaming. The conceptual guides explain the node/edge/state model clearly.
Best for: Deep production usage of LangGraph. Start with the “Introduction” and “Concepts” sections before the tutorials. The “Multi-agent” section directly applies to this module.
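
To make the node/edge/state model concrete before opening the docs, here is a minimal plain-Python sketch. It deliberately does not use the LangGraph API: `run_graph`, `research`, and `write` are hypothetical names chosen for illustration, and the only assumption is that each node reads a shared state dict and returns a partial update.

```python
from typing import Callable, Dict, Optional

State = Dict[str, object]
Node = Callable[[State], State]


def run_graph(nodes: Dict[str, Node], edges: Dict[str, str],
              start: str, state: State) -> State:
    """Run nodes one at a time, following edges until a node has no successor."""
    current: Optional[str] = start
    while current is not None:
        update = nodes[current](state)   # a node returns a partial state update
        state = {**state, **update}      # merge the update into the shared state
        current = edges.get(current)     # no outgoing edge means the graph ends
    return state


def research(state: State) -> State:
    # Hypothetical node: stands in for an agent that gathers notes.
    return {"notes": f"notes on {state['topic']}"}


def write(state: State) -> State:
    # Hypothetical node: stands in for an agent that drafts from the notes.
    return {"draft": f"Draft based on {state['notes']}"}


nodes = {"research": research, "write": write}
edges = {"research": "write"}
final_state = run_graph(nodes, edges, "research", {"topic": "multi-agent systems"})
print(final_state["draft"])
```

The real library adds conditional edges, persistence, and streaming on top of this idea; the sketch only shows why a shared state object makes handoffs between nodes explicit.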


AutoGen

URL: https://microsoft.github.io/autogen
What it covers: Microsoft’s multi-agent conversation framework. Documentation covers GroupChat, ConversableAgent, AssistantAgent, code execution sandboxes, and the multi-agent conversation protocol.
Best for: Conversation-native multi-agent systems. The quickstart examples for GroupChat show the core pattern in ~50 lines.


CrewAI

URL: https://docs.crewai.com
What it covers: Documentation for CrewAI including agents, tasks, crews, tools, and processes (sequential vs. hierarchical). Well-organized with many cookbook-style examples.
Best for: Rapid prototyping and understanding the role-based crew metaphor. Also useful for helping non-technical stakeholders understand what’s being built.


AutoGen Research Paper

URL: https://arxiv.org/abs/2308.08155
Title: AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Authors: Wu, Qingyun, et al. (Microsoft Research)
What it covers: The original AutoGen paper, which introduces the multi-agent conversation framework and benchmarks it on coding, math, and QA tasks. Section 2 (The AutoGen Framework) and Section 3 (Applications of AutoGen) are the most practical.
Best for: Understanding the theoretical motivation and empirical validation behind conversation-based multi-agent systems. Widely cited in multi-agent surveys.


Supplementary Reading

Practices for Governing Agentic AI Systems (Anthropic)

URL: https://www.anthropic.com/research/agentic-misalignment
What it covers: Anthropic’s guidance on safety and governance for agentic systems, including minimal footprint, prompt injection defense, and when agents should pause and verify with humans.
Best for: Production safety considerations — often overlooked in implementation-focused resources.


Building Effective Agents (Anthropic Blog)

URL: https://www.anthropic.com/engineering/building-effective-agents
What it covers: Anthropic’s engineering blog post on agent design principles: when to use workflows vs. agents, the value of simplicity, augmented LLMs, and five common agentic workflow patterns (prompt chaining, routing, parallelization, orchestrator-workers, evaluator-optimizer).
Best for: High-level design principles that should precede any implementation decision. The taxonomy of workflow patterns is directly relevant to this module.
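
As a rough illustration of the orchestrator-workers pattern named above, combined with parallelization, the sketch below fans subtasks out to workers and joins the results. Everything here is an assumption made for illustration: `fake_worker` stands in for an LLM call, and the subtask list is passed in by the caller, whereas in the pattern proper the orchestrator model decides the subtasks itself.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def fake_worker(subtask: str) -> str:
    """Hypothetical worker: stands in for one LLM call handling one subtask."""
    return f"result for {subtask}"


def orchestrate(task: str, aspects: List[str]) -> str:
    """Split a task into subtasks, run workers in parallel, and join the results."""
    # Assumption: the split is supplied by the caller; a real orchestrator
    # would ask a model to produce it.
    subtasks = [f"{task}: {aspect}" for aspect in aspects]
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map returns results in the same order as the inputs.
        results = list(pool.map(fake_worker, subtasks))
    return "\n".join(results)


report = orchestrate("compare frameworks", ["LangGraph", "AutoGen", "CrewAI"])
print(report)
```

Threads are used here because agent calls are typically I/O-bound; the synthesis step is a plain join, which is the part most likely to need a model call of its own in practice.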


Generative AI Design Patterns (Chip Huyen)

URL: https://huyenchip.com/2023/05/02/rag.html
What it covers: A detailed breakdown of AI application patterns, including multi-agent architectures, RAG, and tool use.
Best for: A broad survey of patterns with concrete trade-off discussions.


The Agent Protocol (AgentProtocol.ai)

URL: https://agentprotocol.ai
What it covers: A proposed standard API for agent communication — defines how agents expose themselves as services and how orchestrators call them. Useful even if you don’t adopt the standard, as it articulates what a well-defined inter-agent interface looks like.
Best for: Thinking about agent interoperability and what a production agent communication standard should specify.


Key Papers at a Glance

Paper | Core Contribution
AutoGen (2023) | Multi-agent conversation framework with code execution
MetaGPT (2023) | Role-based multi-agent system for software development
HuggingGPT (2023) | LLM as orchestrator for hundreds of specialized AI models
CAMEL (2023) | Role-playing multi-agent framework for generating conversations
OpenAgents (2023) | Three specialized agents (data, plugin, web) with unified interface

Tools and Libraries

Tool | Purpose | URL
LangGraph | Graph-based stateful agent workflows | https://langchain-ai.github.io/langgraph
AutoGen | Conversation-based multi-agent system | https://microsoft.github.io/autogen
CrewAI | Role-based multi-agent crews | https://docs.crewai.com
Prefect | Workflow orchestration (can wrap agents) | https://docs.prefect.io
Temporal | Durable workflow execution for long-running agents | https://temporal.io/docs
Ray | Distributed Python execution for parallel agents at scale | https://docs.ray.io
Pydantic | Data validation for inter-agent handoff schemas | https://docs.pydantic.dev
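
To show what an inter-agent handoff schema buys you, here is a minimal sketch of validating a payload at the boundary between two agents. It uses a standard-library dataclass as a stand-in so it runs with no dependencies; it is not Pydantic's API. The `ResearchHandoff` fields and the `parse_handoff` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ResearchHandoff:
    """Hypothetical payload a research agent passes to a writing agent."""
    task_id: str
    summary: str
    sources: List[str] = field(default_factory=list)
    confidence: float = 0.0

    def __post_init__(self) -> None:
        # Reject malformed handoffs at the boundary instead of downstream.
        if not self.task_id:
            raise ValueError("task_id must be non-empty")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


def parse_handoff(payload: dict) -> ResearchHandoff:
    """Build a validated handoff; unexpected keys raise TypeError."""
    return ResearchHandoff(**payload)


handoff = parse_handoff({"task_id": "t1", "summary": "three sources found",
                         "confidence": 0.8})
print(handoff)
```

A validation library such as Pydantic adds type coercion, JSON parsing, and structured error reports on top of this; the point of the sketch is only that the receiving agent never sees a payload that failed the checks.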

Debugging and Observability

Tool | Purpose | URL
LangSmith | Tracing and evaluation for LangChain/LangGraph | https://docs.smith.langchain.com
Arize Phoenix | Open-source LLM observability | https://docs.arize.com/phoenix
Helicone | LLM observability proxy (logs all API calls) | https://docs.helicone.ai
OpenTelemetry | Standard tracing instrumentation | https://opentelemetry.io/docs