TradingAgents Tutorial: Multi-Agent LLM Trading Setup

TradingAgents hit #1 on GitHub trending today with 58,973 stars and gained 2,115 in the last 24 hours. This open-source framework simulates real-world trading firms using specialized LLM agents—fundamental analysts, sentiment experts, technical analysts, traders, and risk managers working together to make trading decisions. Version 0.2.4, released in April 2026, brings enterprise-ready features: support for 9 LLM providers including Azure OpenAI, Docker deployment, and persistent decision logs that let agents learn from past trades.

In backtests on AAPL, GOOGL, and AMZN, TradingAgents achieved 24.9% annual returns with a Sharpe Ratio of 5.60. For context, a Sharpe Ratio above 3 is considered excellent in quantitative finance. Above 5 is exceptional—at least in backtests. The framework outperformed traditional rule-based strategies and single-model LLM bots by at least 6.1% in cumulative returns.

What TradingAgents Actually Is

TradingAgents replicates a real trading firm’s structure with 7 specialized agent roles built on LangGraph, the framework for stateful multi-agent applications.

The Analyst Team has four specialists. The Fundamentals Analyst evaluates financial metrics and company performance. The Sentiment Analyst gauges market mood through social media analysis. The News Analyst monitors macroeconomic indicators and global events. The Technical Analyst applies MACD, RSI, and pattern detection.

The Researcher Team forces debate. A Bullish Researcher argues for positive positions while a Bearish Researcher argues for caution. They critically evaluate analyst insights, balancing potential gains against risks. This structure mirrors how real trading desks deliberately seek opposing viewpoints to avoid confirmation bias.

The Execution Layer controls the money. The Trader Agent synthesizes research reports to determine trade timing and position size. The Risk Management Team assesses portfolio volatility and liquidity factors. The Portfolio Manager has final approval authority—no trade executes without passing this gate.

LangGraph provides the foundation. It models workflows as graphs: nodes represent agents or tools, edges represent decisions or transitions. It offers durable execution that persists through failures, checkpoint resume to recover from interruptions, and persistent memory for both short-term and long-term storage. This matters for production systems where reliability trumps speed.
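
To make the graph idea concrete, here is a minimal plain-Python sketch of nodes, edges, and checkpoint resume. It is not the LangGraph API: the run_graph helper, the JSON checkpoint file, and the three stub agents are all assumptions made for illustration. The trader stub fails once to simulate an outage, and the second run picks up where the first one stopped.

```python
import json
import os
import tempfile
from typing import Callable, Dict, Optional

State = Dict[str, object]
NodeFn = Callable[[State], State]
EdgeFn = Callable[[State], Optional[str]]


def run_graph(nodes: Dict[str, NodeFn], edges: Dict[str, EdgeFn],
              start: str, state: State, checkpoint_path: str) -> State:
    """Run nodes along edges, saving a checkpoint after every completed node."""
    current: Optional[str] = start
    if os.path.exists(checkpoint_path):
        # Resume: continue at the node that was due to run next.
        with open(checkpoint_path) as f:
            saved = json.load(f)
        current, state = saved["next"], saved["state"]
    while current is not None:
        state = nodes[current](state)
        current = edges[current](state)
        with open(checkpoint_path, "w") as f:
            json.dump({"next": current, "state": state}, f)
    return state


calls = {"analyst": 0, "trader": 0}


def analyst(state: State) -> State:
    calls["analyst"] += 1
    return {**state, "report": "earnings beat expectations"}


def trader(state: State) -> State:
    calls["trader"] += 1
    if calls["trader"] == 1:
        raise RuntimeError("simulated API outage")  # fails on the first attempt only
    return {**state, "proposal": "BUY"}


def risk_manager(state: State) -> State:
    return {**state, "approved": state["proposal"] in {"BUY", "HOLD"}}


nodes = {"analyst": analyst, "trader": trader, "risk": risk_manager}
edges = {
    "analyst": lambda s: "trader",
    "trader": lambda s: "risk",
    "risk": lambda s: None,  # None ends the run
}

checkpoint = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
try:
    run_graph(nodes, edges, "analyst", {"ticker": "AAPL"}, checkpoint)
except RuntimeError:
    pass  # the first run dies inside the trader node

# The second run resumes at the trader; the analyst is not called again.
final_state = run_graph(nodes, edges, "analyst", {"ticker": "AAPL"}, checkpoint)
print(final_state, calls)
```

The point of the design is that expensive upstream work (here, the analyst call) is paid for once, even when a later node fails.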

Why this structure works: multi-agent debate reduces single-perspective bias, role specialization enables deeper analysis than generalist bots, and the risk management layer prevents overconfident decisions that plague single-model trading systems.

How to Set Up TradingAgents

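Installation offers two paths. For quick testing, use Docker. Run the command from the root of the cloned repository, since docker compose needs the project's Compose file: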

docker compose run --rm tradingagents

For development and customization, use Python:

git clone https://github.com/TauricResearch/TradingAgents.git
cd TradingAgents
conda create -n tradingagents python=3.13
conda activate tradingagents
pip install .

Supported LLM providers include OpenAI (GPT-5.x), Anthropic (Claude 4.x), Google (Gemini 3.x), xAI (Grok 4.x), DeepSeek, Qwen, GLM, OpenRouter, Ollama for local deployment, and Azure OpenAI for enterprise environments. Configure your provider:

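import os

# Pick ONE of the two configs below; a second assignment replaces the first.

# Azure OpenAI for enterprise compliance
config = {
    "llm_provider": "azure_openai",
    "model": "gpt-5.5-turbo",
    "temperature": 0.7,
    # Read the key from the environment instead of hardcoding it in source.
    "api_key": os.environ.get("AZURE_OPENAI_API_KEY", "YOUR_AZURE_KEY"),
    "endpoint": "https://your-instance.openai.azure.com"
}

# Or Anthropic Claude for strong reasoning
config = {
    "llm_provider": "anthropic",
    "model": "claude-4.5-sonnet",
    "temperature": 0.7
}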

Run a backtest:

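from tradingagents.graph.trading_graph import TradingAgentsGraph

# Build the agent graph with the provider config chosen above.
ta = TradingAgentsGraph(debug=True, config=config)

ticker = "AAPL"
date = "2026-04-15"  # analysis date, YYYY-MM-DD
# propagate runs the full agent pipeline for one ticker on one date.
state, decision = ta.propagate(ticker, date)
print(decision)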

The decision object contains the trade action (BUY, SELL, or HOLD), position size, risk assessment from the risk management team, and a summary of the agent debate showing how bullish and bearish perspectives shaped the final call.
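
A single call covers one ticker on one date, so a backtest over a period means looping over trading days. The sketch below shows one way to do that, assuming only the propagate(ticker, date) call shape from the snippet above. The helpers weekdays and run_over_dates are hypothetical names, and fake_propagate is a stand-in so the example runs without API keys; in real use you would pass ta.propagate instead.

```python
from datetime import date, timedelta
from typing import Any, Callable, Dict, Iterator, Tuple


def weekdays(start: date, end: date) -> Iterator[str]:
    """Yield ISO dates for Monday-Friday in [start, end]; market holidays are not excluded."""
    day = start
    while day <= end:
        if day.weekday() < 5:
            yield day.isoformat()
        day += timedelta(days=1)


def run_over_dates(propagate: Callable[[str, str], Tuple[Any, Any]],
                   ticker: str, start: date, end: date) -> Dict[str, Any]:
    """Call propagate(ticker, date) once per weekday and collect the decisions."""
    decisions: Dict[str, Any] = {}
    for trade_date in weekdays(start, end):
        _state, decision = propagate(ticker, trade_date)
        decisions[trade_date] = decision
    return decisions


def fake_propagate(ticker: str, trade_date: str) -> Tuple[dict, str]:
    """Hypothetical stand-in for ta.propagate so the sketch runs offline."""
    return {"ticker": ticker}, "HOLD"


results = run_over_dates(fake_propagate, "AAPL", date(2026, 4, 13), date(2026, 4, 19))
for trade_date, decision in results.items():
    print(trade_date, decision)
```

Every date is a full multi-agent run with its own API cost, so it is worth starting with a short window.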

Version 0.2.4 adds three production-critical features. Persistent Decision Log automatically saves all trading decisions to ~/.tradingagents/memory/trading_memory.md, tracking realized returns so agents learn from historical outcomes. Checkpoint Resume lets you recover from interruptions instead of restarting long backtests, saving computation and API costs. Structured Output enforces consistent response formats from agents, reducing parsing errors in production.
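
To see why structured output reduces parsing errors, consider validating an agent's reply against a fixed schema before acting on it. The following is a minimal sketch of that idea, not the framework's implementation: the TradeDecision fields, the JSON shape, and the 0-to-1 position size range are all assumptions chosen for illustration.

```python
import json
from dataclasses import dataclass

VALID_ACTIONS = {"BUY", "SELL", "HOLD"}


@dataclass(frozen=True)
class TradeDecision:
    """Hypothetical schema for a single agent decision."""
    action: str
    position_size: float  # fraction of the portfolio, 0.0 to 1.0 (assumed convention)
    rationale: str


def parse_decision(raw: str) -> TradeDecision:
    """Parse a JSON reply and reject anything outside the schema."""
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    action = str(data["action"]).upper()
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    size = float(data["position_size"])
    if not 0.0 <= size <= 1.0:
        raise ValueError(f"position_size out of range: {size}")
    return TradeDecision(action, size, str(data.get("rationale", "")))


reply = '{"action": "buy", "position_size": 0.25, "rationale": "strong earnings"}'
decision = parse_decision(reply)
print(decision)
```

A reply that fails validation can be retried or dropped, instead of slipping a malformed order into the rest of the pipeline.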

Multi-provider support matters more than it seems. Switching providers based on pricing can cut API costs by 30-50%. You can use different models for different agent roles: Claude for technical analysis reasoning, GPT for sentiment creativity, Gemini for handling extensive context. Build redundancy so provider outages don’t kill your system. For regulated environments, Azure OpenAI provides compliance. For proprietary strategies, Ollama keeps your data on-premises with no API rate limits.
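
The redundancy point can be sketched as an ordered fallback chain. This is an illustration of the pattern rather than anything TradingAgents ships: ProviderError, call_with_fallback, and the two stub providers are hypothetical, and real provider clients would replace the stubs.

```python
from typing import Callable, List, Tuple


class ProviderError(Exception):
    """Raised by a provider call that failed (outage, rate limit, timeout)."""


def call_with_fallback(prompt: str,
                       providers: List[Tuple[str, Callable[[str], str]]]) -> Tuple[str, str]:
    """Try each provider in order; return (provider_name, reply) from the first that works."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


def primary(prompt: str) -> str:
    """Stub that simulates an outage."""
    raise ProviderError("503 service unavailable")


def backup(prompt: str) -> str:
    """Stub that always answers."""
    return "HOLD"


used, answer = call_with_fallback(
    "Summarize AAPL risk.", [("primary", primary), ("backup", backup)]
)
print(used, answer)
```

Keep in mind that different models may answer the same prompt differently, so a fallback changes more than availability.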

Performance and Reality Check

The backtested numbers look strong. TradingAgents achieved at least 23.21% cumulative returns and 24.90% annual returns on AAPL, GOOGL, and AMZN. It outperformed the best baseline models by at least 6.1% in cumulative returns. The Sharpe Ratio of 5.60 or higher surpassed the next-best approach by at least 2.07 points.

Sharpe Ratio measures return in excess of the risk-free rate per unit of volatility, not just raw gains. It’s how professionals evaluate trading strategies because anyone can get high returns by taking reckless risk. A Sharpe above 1 is acceptable. Above 2 is very good. Above 3 is excellent. TradingAgents at 5.60+ is exceptional, at least in backtests.
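
For readers who want to compute the metric themselves, here is a small sketch of an annualized Sharpe Ratio from daily returns: mean excess return divided by the sample standard deviation, scaled by the square root of 252 trading days. The function name, the 252-day convention, and the sample return series are assumptions for illustration; the paper may use a different convention.

```python
import math
import statistics
from typing import Sequence


def annualized_sharpe(daily_returns: Sequence[float],
                      annual_risk_free: float = 0.0,
                      periods_per_year: int = 252) -> float:
    """Mean excess return over its sample standard deviation, annualized."""
    rf_per_period = annual_risk_free / periods_per_year
    excess = [r - rf_per_period for r in daily_returns]
    volatility = statistics.stdev(excess)
    if volatility == 0:
        raise ValueError("returns have zero volatility; Sharpe is undefined")
    return statistics.mean(excess) / volatility * math.sqrt(periods_per_year)


# Two made-up series with the same average daily return but very different risk.
steady = [0.002, 0.001, 0.002, 0.001]
volatile = [0.03, -0.027, 0.03, -0.027]

steady_sharpe = annualized_sharpe(steady)
volatile_sharpe = annualized_sharpe(volatile)
print(round(steady_sharpe, 2), round(volatile_sharpe, 2))
```

Both series earn the same on average, yet the steady one scores far higher, which is exactly the distinction the ratio is meant to capture.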

That caveat matters. From the framework’s academic paper: “TradingAgents is designed for research purposes and is not intended as financial, investment, or trading advice. Performance may vary based on chosen backbone language models, model temperature, trading periods, the quality of data, and other non-deterministic factors.”

Backtested results do not guarantee live market performance. LLM temperature settings significantly impact outcomes: the same inputs can yield different decisions on different runs. Data quality drives everything; garbage in, garbage out. Transaction costs and slippage weren’t fully modeled in the academic study. Black swan events unlike anything the LLMs were trained on can break any model.
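
The transaction cost point is easy to underestimate, so here is a small arithmetic sketch. It compounds a series of per-trade returns and deducts a fixed cost, in basis points of capital, on every trade. The cost model, the function name, and the numbers are all assumptions for illustration; none of them come from the paper.

```python
from typing import Sequence


def net_cumulative_return(trade_returns: Sequence[float],
                          cost_bps_per_trade: float) -> float:
    """Compound per-trade returns, deducting a proportional cost on each trade."""
    cost = cost_bps_per_trade / 10_000  # 1 basis point = 0.01%
    growth = 1.0
    for r in trade_returns:
        growth *= (1.0 + r) * (1.0 - cost)
    return growth - 1.0


# Hypothetical strategy: 50 trades that each gain 0.5% before costs.
trades = [0.005] * 50

gross = net_cumulative_return(trades, cost_bps_per_trade=0)
net = net_cumulative_return(trades, cost_bps_per_trade=10)
print(f"gross: {gross:.2%}  net of 10 bps per trade: {net:.2%}")
```

The more often a strategy trades, the more of its backtested edge a per-trade cost removes.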

Use TradingAgents for research and education, for rapid prototyping of trading strategies, or within institutional teams that have risk management oversight and compliance frameworks. Do not use it for unsupervised retail trading or expect plug-and-play profits. The framework provides tools. Results depend on how you use them.

Why This Is Trending Now

GitHub shows 58,973 total stars with 2,115 added today and 11,300+ forks, indicating developers are actively deploying and modifying it. The timing isn’t random.

In 2026, agent orchestration matured. LangGraph reached production-readiness with durable execution. Stanford’s AI Index reported AI agents jumped from 12% to 66% success rates on real computer tasks between 2025 and 2026. Fortune 500 companies started deploying agentic AI in finance, logistics, and manufacturing. The infrastructure exists now to run reliable multi-agent systems at scale.

LLM performance crossed the threshold. GPT-5.x, Claude 4.x, and Gemini 3.x demonstrate strong multi-step reasoning. Context windows exceeding 1 million tokens handle extensive market data without truncation. Multi-agent debate patterns have been proven effective across domains from coding to strategic planning.

Accessibility hit a peak. Open-source frameworks like TradingAgents lower entry barriers. Docker deployment removes environment setup friction. Multi-provider support means developers can switch between providers costing $0.10 to $5 per million tokens based on budget and performance needs.

Version 0.2.4 in April 2026 brought the polish. Structured-output agents improved production reliability. Checkpoint resume fixed a major developer pain point. The Docker image cut setup time from hours to minutes. Four new LLM providers expanded flexibility. These aren’t flashy features—they’re the difference between a research prototype and something teams actually deploy.

TradingAgents brings multi-provider support, deployment tooling, proven orchestration tools, academic validation, and active development together in one multi-agent trading framework. That combination didn’t exist six months ago.

ByteBot
