Module 06: Multi-Agent Systems — References

Primary References

Anthropic Multi-Agent Guidance

URL: https://docs.anthropic.com/en/docs/build-with-claude/agents
What it covers: Anthropic’s official documentation on building agent systems with Claude. Covers tool use, agentic loops, multi-agent networks, subagent delegation, and the Claude Code Agent tool. Includes guidance on parallelization, prompt injection safety, and minimal footprint principles.
Best for: The canonical source for how Anthropic thinks about agent architecture. Read this before building any Claude-based agent system.

LangGraph

URL: https://langchain-ai.github.io/langgraph
What it covers: Full documentation for LangGraph, including tutorials on building stateful agents, multi-agent collaboration, human-in-the-loop, persistence, and streaming. The conceptual guides explain the node/edge/state model clearly.
Best for: Deep production usage of LangGraph. Start with the “Introduction” and “Concepts” sections before the tutorials. The “Multi-agent” section directly applies to this module.
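
To make the node/edge/state model concrete before reading the conceptual guides, here is a minimal plain-Python sketch. It deliberately does not use LangGraph's API; the names `research`, `write`, `NODES`, `EDGES`, and `run_graph` are hypothetical and exist only for illustration. Each node is a function from state to state, and the edges say which node runs next.

```python
from typing import Callable, Dict, Optional

State = Dict[str, object]
Node = Callable[[State], State]


def research(state: State) -> State:
    # Hypothetical node: adds placeholder notes derived from the topic.
    return {**state, "notes": f"notes on {state['topic']}"}


def write(state: State) -> State:
    # Hypothetical node: turns the notes into a draft.
    return {**state, "draft": f"Draft based on {state['notes']}"}


# Nodes are named functions; edges map each node to its successor (None = stop).
NODES: Dict[str, Node] = {"research": research, "write": write}
EDGES: Dict[str, Optional[str]] = {"research": "write", "write": None}


def run_graph(entry: str, state: State) -> State:
    """Walk the graph from `entry`, passing the state through each node."""
    current: Optional[str] = entry
    while current is not None:
        state = NODES[current](state)
        current = EDGES[current]
    return state


final_state = run_graph("research", {"topic": "multi-agent systems"})
print(final_state["draft"])
```

A real graph framework adds conditional edges, persistence, and streaming on top of this basic loop; the sketch only shows how state flows between nodes.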

AutoGen

URL: https://microsoft.github.io/autogen
What it covers: Microsoft’s multi-agent conversation framework. Documentation covers GroupChat, ConversableAgent, AssistantAgent, code execution sandboxes, and the multi-agent conversation protocol.
Best for: Conversation-native multi-agent systems. The quickstart examples for GroupChat show the core pattern in ~50 lines.
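
The group-chat idea can be sketched without any framework: several agents share one message history and take turns replying until a stop condition is met. The sketch below is plain Python and is not AutoGen's API; `coder`, `reviewer`, `group_chat`, the round-robin turn order, and the `APPROVED` stop word are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (speaker, text)
Agent = Callable[[List[Message]], str]


def coder(history: List[Message]) -> str:
    # Hypothetical agent: always proposes the same function.
    return "def add(a, b): return a + b"


def reviewer(history: List[Message]) -> str:
    # Hypothetical agent: approves the last message if it returns a value.
    _, last_text = history[-1]
    return "APPROVED" if "return" in last_text else "Please add a return statement."


def group_chat(agents: Dict[str, Agent], opening: str, max_turns: int) -> List[Message]:
    """Round-robin conversation over a shared history; stops on APPROVED."""
    history: List[Message] = [("user", opening)]
    names = list(agents)
    for turn in range(max_turns):
        name = names[turn % len(names)]
        reply = agents[name](history)
        history.append((name, reply))
        if reply == "APPROVED":
            break
    return history


transcript = group_chat(
    {"coder": coder, "reviewer": reviewer}, "Write an add function.", max_turns=6
)
for speaker, text in transcript:
    print(f"{speaker}: {text}")
```

The `max_turns` cap matters in practice: without it, two agents that never reach the stop condition would loop indefinitely.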

CrewAI

URL: https://docs.crewai.com
What it covers: Documentation for CrewAI including agents, tasks, crews, tools, and processes (sequential vs. hierarchical). Well-organized with many cookbook-style examples.
Best for: Rapid prototyping and understanding the role-based crew metaphor. Also useful for helping non-technical stakeholders understand what is being built.

AutoGen Research Paper

URL: https://arxiv.org/abs/2308.08155
Title: AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Authors: Wu, Qingyun, et al. (Microsoft Research)
What it covers: The original AutoGen paper, which introduces the multi-agent conversation framework and benchmarks it on coding, math, and QA tasks. Section 3 (System Design) and Section 4 (Applications) are the most practical.
Best for: Understanding the theoretical motivation and empirical validation behind conversation-based multi-agent systems. Widely cited in multi-agent surveys.

Supplementary Reading

Practices for Governing Agentic AI Systems (Anthropic)

URL: https://www.anthropic.com/research/agentic-misalignment
What it covers: Anthropic’s guidance on safety and governance for agentic systems, including minimal footprint, prompt injection defense, and when agents should pause and verify with humans.
Best for: Production safety considerations — often overlooked in implementation-focused resources.

Building Effective Agents (Anthropic Blog)

URL: https://www.anthropic.com/engineering/building-effective-agents
What it covers: Anthropic’s engineering blog post on agent design principles: when to use workflows vs. agents, the value of simplicity, augmented LLMs, and five common agentic workflow patterns (prompt chaining, routing, parallelization, orchestrator-workers, evaluator-optimizer).
Best for: High-level design principles that should precede any implementation decision. The taxonomy of workflow patterns is directly relevant to this module.
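
As a rough illustration of one of the five patterns named above, the sketch below shows orchestrator-workers combined with parallelization. It is an assumption-laden toy, not code from the blog post: `fake_llm` stands in for a real model call, and `plan_subtasks` uses a fixed decomposition where a real orchestrator would ask a model to plan.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call (hypothetical).
    return f"[answer to: {prompt}]"


def plan_subtasks(task: str) -> List[str]:
    # Hypothetical fixed decomposition; a real orchestrator would plan dynamically.
    return [f"{task}: background", f"{task}: trade-offs"]


def orchestrate(task: str) -> str:
    """Split the task, run the workers in parallel, then join their results."""
    subtasks = plan_subtasks(task)
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        # Executor.map returns results in the order of the inputs.
        results = list(pool.map(fake_llm, subtasks))
    return "\n".join(results)


report = orchestrate("agent handoffs")
print(report)
```

The same skeleton degrades gracefully to prompt chaining: replace the thread pool with a loop that feeds each result into the next prompt.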

Generative AI Design Patterns (Chip Huyen)

URL: https://huyenchip.com/2023/05/02/rag.html
What it covers: A detailed breakdown of AI application patterns including multi-agent architectures, RAG, and tool use. Well-written with concrete trade-off analyses.
Best for: A broad survey of AI application patterns.

The Agent Protocol (AgentProtocol.ai)

URL: https://agentprotocol.ai
What it covers: A proposed standard API for agent communication — defines how agents expose themselves as services and how orchestrators call them. Useful even if you don’t adopt the standard, as it articulates what a well-defined inter-agent interface looks like.
Best for: Thinking about agent interoperability and what a production agent communication standard should specify.
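
To show what a well-defined inter-agent interface might look like in code, here is a small in-process sketch. It is not the Agent Protocol specification: the `AgentService` interface, its `create_task` and `execute_step` methods, and the `EchoAgent` implementation are hypothetical names chosen for illustration. The point is that an orchestrator depends only on the interface, not on any particular agent.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, List, Protocol


@dataclass
class Task:
    task_id: str
    request: str
    steps: List[str] = field(default_factory=list)


class AgentService(Protocol):
    """Hypothetical contract every agent exposes to an orchestrator."""

    def create_task(self, request: str) -> Task: ...

    def execute_step(self, task_id: str) -> str: ...


class EchoAgent:
    """Toy agent that satisfies AgentService by echoing its input."""

    def __init__(self) -> None:
        self._tasks: Dict[str, Task] = {}
        self._ids = itertools.count(1)

    def create_task(self, request: str) -> Task:
        task = Task(task_id=f"task-{next(self._ids)}", request=request)
        self._tasks[task.task_id] = task
        return task

    def execute_step(self, task_id: str) -> str:
        task = self._tasks[task_id]
        output = f"step {len(task.steps) + 1}: processed {task.request!r}"
        task.steps.append(output)
        return output


def call_agent(agent: AgentService, request: str) -> str:
    # The orchestrator only uses the two methods defined by the interface.
    task = agent.create_task(request)
    return agent.execute_step(task.task_id)


result = call_agent(EchoAgent(), "summarize logs")
print(result)
```

In a networked setting the same two operations would typically become HTTP endpoints, but the contract between orchestrator and agent stays the same.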

Key Papers at a Glance

Paper	Core Contribution
AutoGen (2023)	Multi-agent conversation framework with code execution
MetaGPT (2023)	Role-based multi-agent system for software development
HuggingGPT (2023)	LLM as orchestrator for hundreds of specialized AI models
CAMEL (2023)	Role-playing multi-agent framework for generating conversations
OpenAgents (2023)	Three specialized agents (data, plugin, web) with unified interface

Tools and Libraries

Tool	Purpose	URL
LangGraph	Graph-based stateful agent workflows	https://langchain-ai.github.io/langgraph
AutoGen	Conversation-based multi-agent system	https://microsoft.github.io/autogen
CrewAI	Role-based multi-agent crews	https://docs.crewai.com
Prefect	Workflow orchestration (can wrap agents)	https://docs.prefect.io
Temporal	Durable workflow execution for long-running agents	https://temporal.io/docs
Ray	Distributed Python execution for parallel agents at scale	https://docs.ray.io
Pydantic	Data validation for inter-agent handoff schemas	https://docs.pydantic.dev
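
To illustrate what a validated inter-agent handoff schema involves, the sketch below uses only the standard library so it runs without extra dependencies; Pydantic expresses the same idea declaratively. The `Handoff` fields and the `parse_handoff` helper are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Handoff:
    """Hypothetical message passed from one agent to the next."""

    sender: str
    recipient: str
    task: str
    confidence: float

    def __post_init__(self) -> None:
        # Reject malformed handoffs at the boundary, before the next agent runs.
        if not self.sender or not self.recipient:
            raise ValueError("sender and recipient must be non-empty")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


def parse_handoff(payload: Dict[str, Any]) -> Handoff:
    # Coerce raw values (for example from parsed JSON) into the typed schema.
    return Handoff(
        sender=str(payload["sender"]),
        recipient=str(payload["recipient"]),
        task=str(payload["task"]),
        confidence=float(payload["confidence"]),
    )


ok = parse_handoff(
    {"sender": "researcher", "recipient": "writer", "task": "draft intro", "confidence": "0.8"}
)

try:
    parse_handoff(
        {"sender": "researcher", "recipient": "writer", "task": "draft intro", "confidence": 1.7}
    )
    rejected = False
except ValueError:
    rejected = True

print(ok, rejected)
```

Validating at the handoff boundary means a malformed message fails loudly at the sender's output instead of silently corrupting the recipient's context.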

Debugging and Observability

Tool	Purpose	URL
LangSmith	Tracing and evaluation for LangChain/LangGraph	https://docs.smith.langchain.com
Arize Phoenix	Open-source LLM observability	https://docs.arize.com/phoenix
Helicone	LLM observability proxy (logs all API calls)	https://docs.helicone.ai
OpenTelemetry	Standard tracing instrumentation	https://opentelemetry.io/docs

Study Notes by Niladri & AI
