Module 06: Multi-Agent Systems — References
Primary References
Anthropic Multi-Agent Guidance
URL: https://docs.anthropic.com/en/docs/build-with-claude/agents
What it covers: Anthropic’s official documentation on building agent systems with Claude. Covers tool use, agentic loops, multi-agent networks, subagent delegation, and the Claude Code Agent tool. Includes guidance on parallelization, prompt injection safety, and minimal footprint principles.
Best for: The canonical source for how Anthropic thinks about agent architecture. Read this before building any Claude-based agent system.
LangGraph
URL: https://langchain-ai.github.io/langgraph
What it covers: Full documentation for LangGraph, including tutorials on building stateful agents, multi-agent collaboration, human-in-the-loop, persistence, and streaming. The conceptual guides explain the node/edge/state model clearly.
Best for: In-depth, production-oriented LangGraph work. Read the “Introduction” and “Concepts” sections before the tutorials; the “Multi-agent” section applies directly to this module.
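To make the node/edge/state model concrete before reading the conceptual guides, here is a minimal sketch in plain Python. It deliberately does not use the LangGraph API; the `State` fields, the node names, and the `run` helper are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Optional, TypedDict


class State(TypedDict):
    """Shared state passed between nodes (hypothetical fields)."""
    question: str
    draft: str
    approved: bool


def write_draft(state: State) -> dict:
    # A node reads the state and returns a partial update.
    return {"draft": f"Answer to: {state['question']}"}


def review(state: State) -> dict:
    # Stand-in for an LLM reviewer: approve any non-empty draft.
    return {"approved": bool(state["draft"])}


NODES: Dict[str, Callable[[State], dict]] = {
    "write_draft": write_draft,
    "review": review,
}

# Each edge is a function of the state that names the next node (None = stop).
EDGES: Dict[str, Callable[[State], Optional[str]]] = {
    "write_draft": lambda s: "review",
    "review": lambda s: None if s["approved"] else "write_draft",
}


def run(entry: str, state: State, max_steps: int = 10) -> State:
    """Walk the graph from `entry`, merging each node's update into the state."""
    current: Optional[str] = entry
    for _ in range(max_steps):
        if current is None:
            break
        state = {**state, **NODES[current](state)}
        current = EDGES[current](state)
    return state


final = run("write_draft", {"question": "What is a subagent?", "draft": "", "approved": False})
print(final["draft"], final["approved"])
```

The conditional edge out of `review` is what allows loops such as revise-until-approved, and the `max_steps` cap is a simple guard against a graph that never terminates.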
AutoGen
URL: https://microsoft.github.io/autogen
What it covers: Microsoft’s multi-agent conversation framework. Documentation covers GroupChat, ConversableAgent, AssistantAgent, code execution sandboxes, and the multi-agent conversation protocol.
Best for: Conversation-native multi-agent systems. The quickstart examples for GroupChat show the core pattern in ~50 lines.
CrewAI
URL: https://docs.crewai.com
What it covers: Documentation for CrewAI including agents, tasks, crews, tools, and processes (sequential vs. hierarchical). Well-organized with many cookbook-style examples.
Best for: Rapid prototyping and understanding the role-based crew metaphor. Also useful for helping non-technical stakeholders understand what is being built.
AutoGen Research Paper
URL: https://arxiv.org/abs/2308.08155
Title: AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Authors: Qingyun Wu et al. (Microsoft Research and academic collaborators)
What it covers: The original AutoGen paper, which introduces the multi-agent conversation framework and evaluates it on coding, math, and QA tasks. Section 2 (The AutoGen Framework) and Section 3 (Applications of AutoGen) are the most practical.
Best for: Understanding the theoretical motivation and empirical validation behind conversation-based multi-agent systems. Widely cited in multi-agent surveys.
Supplementary Reading
Practices for Governing Agentic AI Systems (Anthropic)
URL: https://www.anthropic.com/research/agentic-misalignment
What it covers: Anthropic’s guidance on safety and governance for agentic systems, including minimal footprint, prompt injection defense, and when agents should pause and verify with humans.
Best for: Production safety considerations — often overlooked in implementation-focused resources.
Building Effective Agents (Anthropic Blog)
URL: https://www.anthropic.com/engineering/building-effective-agents
What it covers: Anthropic’s engineering blog post on agent design principles: when to use workflows vs. agents, the value of simplicity, augmented LLMs, and five common agentic workflow patterns (prompt chaining, routing, parallelization, orchestrator-workers, evaluator-optimizer).
Best for: High-level design principles that should precede any implementation decision. The taxonomy of workflow patterns is directly relevant to this module.
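Three of the workflow patterns named above (prompt chaining, routing, and parallelization) can be sketched in a few lines. This is an illustrative sketch, not code from the blog post: `call_model` is a hypothetical stand-in for a real LLM call, and the keyword-based classifier in `route` is a toy assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def call_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; upper-cases the prompt."""
    return prompt.upper()


def prompt_chain(text: str, steps: List[str]) -> str:
    """Prompt chaining: each step's output becomes the next step's input."""
    for instruction in steps:
        text = call_model(f"{instruction}: {text}")
    return text


def route(query: str, handlers: Dict[str, Callable[[str], str]]) -> str:
    """Routing: classify the input, then dispatch to one specialized handler."""
    label = "code" if "bug" in query.lower() else "general"  # toy classifier
    return handlers[label](query)


def parallelize(subtasks: List[str]) -> List[str]:
    """Parallelization: run independent subtasks concurrently."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(call_model, subtasks))  # map preserves input order


chained = prompt_chain("hello", ["summarize", "translate"])
routed = route("Fix this bug", {
    "code": lambda q: f"code-agent got: {q}",
    "general": lambda q: f"general-agent got: {q}",
})
sections = parallelize(["intro", "methods"])
print(chained, routed, sections, sep="\n")
```

All three are fixed control flow written by the developer, which is what the post calls a workflow; an agent, by contrast, lets the model decide its own next step.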
Generative AI Design Patterns (Chip Huyen)
URL: https://huyenchip.com/2023/05/02/rag.html
What it covers: A detailed breakdown of AI application patterns including multi-agent architectures, RAG, and tool use. Well-written with concrete trade-off analyses.
Best for: A broad survey of AI application patterns.
The Agent Protocol (AgentProtocol.ai)
URL: https://agentprotocol.ai
What it covers: A proposed standard API for agent communication — defines how agents expose themselves as services and how orchestrators call them. Useful even if you don’t adopt the standard, as it articulates what a well-defined inter-agent interface looks like.
Best for: Thinking about agent interoperability and what a production agent communication standard should specify.
Key Papers at a Glance
| Paper | Core Contribution |
|---|---|
| AutoGen (2023) | Multi-agent conversation framework with code execution |
| MetaGPT (2023) | Role-based multi-agent system for software development |
| HuggingGPT (2023) | LLM as orchestrator for hundreds of specialized AI models |
| CAMEL (2023) | Role-playing multi-agent framework for generating conversations |
| OpenAgents (2023) | Three specialized agents (data, plugin, web) with unified interface |
Tools and Libraries
| Tool | Purpose | URL |
|---|---|---|
| LangGraph | Graph-based stateful agent workflows | https://langchain-ai.github.io/langgraph |
| AutoGen | Conversation-based multi-agent system | https://microsoft.github.io/autogen |
| CrewAI | Role-based multi-agent crews | https://docs.crewai.com |
| Prefect | Workflow orchestration (can wrap agents) | https://docs.prefect.io |
| Temporal | Durable workflow execution for long-running agents | https://temporal.io/docs |
| Ray | Distributed Python execution for parallel agents at scale | https://docs.ray.io |
| Pydantic | Data validation for inter-agent handoff schemas | https://docs.pydantic.dev |
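As a rough illustration of what an inter-agent handoff schema checks, the sketch below validates a JSON message at the boundary between two agents. It uses only the standard library's `dataclasses` as a stand-in; with Pydantic, a `BaseModel` subclass with field constraints would replace the manual checks. The `Handoff` fields and the `parse_handoff` helper are hypothetical.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Handoff:
    """Hypothetical message one agent passes to the next."""
    task_id: str
    summary: str
    confidence: float
    sources: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Manual validation; a schema library would declare these as constraints.
        if not self.task_id:
            raise ValueError("task_id must be non-empty")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


def parse_handoff(raw: str) -> Handoff:
    """Parse JSON from an upstream agent, rejecting malformed payloads.

    Raises ValueError for invalid JSON or out-of-range values, and
    TypeError for missing or unexpected fields.
    """
    return Handoff(**json.loads(raw))


ok = parse_handoff('{"task_id": "t1", "summary": "done", "confidence": 0.9}')
print(ok)
```

Validating at the handoff point means a malformed message fails loudly where it was produced, rather than surfacing as confusing behavior several agents downstream.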
Debugging and Observability
| Tool | Purpose | URL |
|---|---|---|
| LangSmith | Tracing and evaluation for LangChain/LangGraph | https://docs.smith.langchain.com |
| Arize Phoenix | Open-source LLM observability | https://docs.arize.com/phoenix |
| Helicone | LLM observability proxy (logs all API calls) | https://docs.helicone.ai |
| OpenTelemetry | Standard tracing instrumentation | https://opentelemetry.io/docs |