Chapter 65: Retrieval-Augmented Generation (RAG) for Trading
This chapter explores Retrieval-Augmented Generation (RAG), a powerful technique that combines large language models with external knowledge retrieval to enhance trading decisions. RAG enables traders to leverage real-time financial data, news, research reports, and historical market information to generate contextually relevant trading signals and analysis.
Contents
- Introduction to RAG for Trading
- RAG Architecture
- Trading Applications
- Practical Examples
- Rust Implementation
- Python Implementation
- Best Practices
- Resources
Introduction to RAG for Trading
What is RAG?
Retrieval-Augmented Generation (RAG) is a hybrid approach that combines the generative capabilities of Large Language Models (LLMs) with information retrieval systems. Instead of relying solely on the model’s parametric knowledge, RAG retrieves relevant documents from an external knowledge base and uses them as context for generation.
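The retrieve, augment, generate loop described above can be sketched in a few lines. This is a toy illustration, not the chapter's implementation: token overlap stands in for embedding similarity, `echo_llm` is a hypothetical placeholder for a real LLM call, and the ACME documents are invented.

```python
import re
from collections import Counter
from typing import Callable, List

# Toy knowledge base; the texts are invented for illustration.
DOCS = [
    "ACME reported quarterly deliveries above analyst estimates.",
    "ACME announced plans for a new factory.",
    "Weather in the region was mild this week.",
]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query: str, doc: str) -> int:
    """Count shared tokens; a stand-in for embedding similarity."""
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    return sum((q & d).values())


def retrieve(query: str, docs: List[str], top_k: int = 2) -> List[str]:
    """Step 1: return the top_k documents that share at least one token."""
    ranked = sorted(docs, key=lambda doc: overlap_score(query, doc), reverse=True)
    return [doc for doc in ranked[:top_k] if overlap_score(query, doc) > 0]


def augment(query: str, context: List[str]) -> str:
    """Step 2: put the numbered documents and the query into one prompt."""
    numbered = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(context, 1))
    return f"Context:\n{numbered}\n\nQuestion: {query}\nAnswer using only the context."


def rag_answer(query: str, docs: List[str], llm: Callable[[str], str]) -> str:
    """Step 3: hand the augmented prompt to the language model."""
    return llm(augment(query, retrieve(query, docs)))


def echo_llm(prompt: str) -> str:
    """Hypothetical placeholder for a real LLM call: returns its prompt."""
    return prompt


print(rag_answer("What did ACME report about deliveries?", DOCS, echo_llm))
```

Because the model only sees what `retrieve` returns, the quality of the final answer is bounded by the quality of the retrieval step.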
```
RAG WORKFLOW FOR TRADING

User Query: "What's the latest news about Tesla and how might it
             affect tomorrow's stock price?"

Step 1: RETRIEVAL
  Query Embedding ("Tesla news...") ──▶ Vector Database Search
                                          • Recent news articles
                                          • SEC filings
                                          • Analyst reports
                                          • Social media sentiment
        │
        ▼
Step 2: AUGMENTATION
  Retrieved Documents:
    • "Tesla Q3 deliveries beat estimates by 15%..."
    • "Elon Musk announces new factory in Texas..."
    • "Analyst upgrades TSLA to Buy with $300 target..."
  + Original Query
        │
        ▼
Step 3: GENERATION
  LLM generates response with retrieved context:
    "Based on recent developments, Tesla has several positive catalysts:
     1. Q3 deliveries exceeded expectations (+15%)
     2. New Texas factory announcement signals expansion
     3. Multiple analyst upgrades suggest bullish sentiment
     Signal: MODERATE BUY (confidence: 72%)"
```

Why RAG for Trading?
Traditional LLMs have several limitations for trading applications:
| Challenge | Traditional LLM | RAG Solution |
|---|---|---|
| Knowledge Cutoff | Trained on historical data, unaware of recent events | Retrieves real-time information |
| Hallucinations | May generate plausible but incorrect facts | Grounds responses in retrieved documents, which reduces but does not eliminate fabricated facts |
| Source Attribution | Cannot cite sources for claims | Provides explicit document references |
| Domain Specificity | General knowledge may miss financial nuances | Retrieves domain-specific documents |
| Updatability | Requires expensive retraining | Update the document database; no retraining needed |
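
Source attribution is only useful if the citations can be checked. The following is a minimal sketch, assuming the prompt numbers retrieved documents as `[1]`, `[2]`, ... and asks the model to cite them in that form; the function names are hypothetical.

```python
import re
from typing import List, Set


def cited_indices(answer: str) -> Set[int]:
    """Return the set of [n] citation markers found in the answer."""
    return {int(n) for n in re.findall(r"\[(\d+)\]", answer)}


def check_citations(answer: str, num_sources: int) -> List[str]:
    """List problems that should stop an answer from becoming a signal."""
    problems = []
    cited = cited_indices(answer)
    if not cited:
        problems.append("answer cites no sources")
    # Valid citations are 1..num_sources, matching the numbered context
    invalid = sorted(n for n in cited if not 1 <= n <= num_sources)
    if invalid:
        problems.append(f"citations point to missing sources: {invalid}")
    return problems


# Three documents were retrieved, but the answer cites a fourth
print(check_citations("Deliveries beat estimates [1], and margins rose [4].", num_sources=3))
# prints ['citations point to missing sources: [4]']
```

A check like this catches citations that point at nothing; it does not verify that a cited document actually supports the claim.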
RAG vs Traditional NLP
COMPARISON: RAG vs FINE-TUNED LLM vs TRADITIONAL NLP

| Aspect | Traditional NLP | Fine-tuned LLM | RAG |
|---|---|---|---|
| Knowledge Update | Retrain model (days/weeks) | Retrain model (hours/days) | Update DB only (seconds) |
| Real-time Info | Not possible | Not possible | ✅ Supported |
| Cost per Update | $$$$ | $$$ | $ |
| Explainability | Low | Medium | High (citations) |
| Accuracy on Recent Events | Moderate | Good | Very Good |
| Hallucination Risk | N/A | High risk | Low risk |
| Scalability | Fixed capacity | Fixed capacity | Scales with DB |
| Best Use Case | Static tasks | Domain adaptation | Dynamic knowledge |

RAG Architecture
Core Components
A RAG system for trading consists of four main components:
```
RAG ARCHITECTURE FOR TRADING

DOCUMENT INGESTION LAYER
  • News APIs (Bloomberg, Reuters)
  • SEC EDGAR (10-K, 10-Q, 8-K, etc.)
  • Research Reports (Analysts)
  • Social Media (Twitter)
        │
        ▼
DOCUMENT PROCESSING
  Chunking ── Cleaning ── Metadata Extraction ── Embedding Generation
  • Chunking (by topic, section)
  • Cleaning (normalize, dedupe)
  • Metadata Extraction (date, ticker, source)
  • Embedding Generation (OpenAI, local)
        │
        ▼
VECTOR STORE
  Document Embeddings (1536-dim vectors)

    ID: doc_001 | Ticker: TSLA | Date: 2024-01-15
    Vector: [0.021, -0.045, 0.089, ..., 0.012]
    Text: "Tesla reported Q4 deliveries of 484,507..."

    ID: doc_002 | Ticker: AAPL | Date: 2024-01-16
    Vector: [0.015, 0.032, -0.067, ..., 0.045]
    Text: "Apple Vision Pro pre-orders exceed..."

  Storage Options: ChromaDB | Pinecone | Weaviate | FAISS
        │
        ▼
RETRIEVAL & GENERATION
  Semantic Search                  ───▶  LLM Generation
  • Query embedding                      • Context integration
  • k-NN retrieval                       • Trading analysis
  • Re-ranking                           • Signal generation
  • Filtering (date, ticker,             • Risk assessment
    relevance)                           • Source citation
```

Document Processing Pipeline
Effective RAG requires careful document processing:
```python
# Document processing pipeline for financial documents
from dataclasses import dataclass
from typing import List


@dataclass
class DocumentChunk:
    """One chunk of a document (fields inferred from its use in process())."""
    text: str
    embedding: List[float]
    metadata: dict


class FinancialDocumentProcessor:
    """
    Process financial documents for RAG indexing.

    Handles:
    - News articles
    - SEC filings (10-K, 10-Q, 8-K)
    - Earnings transcripts
    - Research reports

    The helpers _clean_text, _extract_entities, _chunk_document and
    _generate_embeddings are not shown in this excerpt.
    """

    def __init__(self, chunk_size: int = 512, chunk_overlap: int = 50):
        self.chunk_size = chunk_size
        self.chunk_overlap = chunk_overlap

    def process(self, document: str, metadata: dict) -> List[DocumentChunk]:
        """
        Process document into chunks with metadata.

        Args:
            document: Raw document text
            metadata: Document metadata (source, date, tickers)

        Returns:
            List of processed document chunks
        """
        # Step 1: Clean and normalize
        cleaned = self._clean_text(document)

        # Step 2: Extract additional metadata
        # (build a new dict so the caller's metadata is not mutated)
        entities = self._extract_entities(cleaned)
        metadata = {**metadata, "entities": entities}

        # Step 3: Chunk document
        chunks = self._chunk_document(cleaned)

        # Step 4: Generate embeddings (one per chunk, in the same order)
        embeddings = self._generate_embeddings(chunks)

        return [
            DocumentChunk(
                text=chunk,
                embedding=emb,
                metadata={**metadata, "chunk_idx": i}
            )
            for i, (chunk, emb) in enumerate(zip(chunks, embeddings))
        ]
```

Vector Stores and Embeddings
Choosing the right embedding model and vector store is crucial:
| Embedding Model | Dimensions | Best For | Cost |
|---|---|---|---|
| OpenAI text-embedding-3-large | 3072 | High accuracy | API cost |
| OpenAI text-embedding-3-small | 1536 | Balance | API cost |
| Sentence-BERT | 768 (e.g., all-mpnet-base-v2; all-MiniLM-L6-v2 uses 384) | Privacy, offline | Free |
| FinBERT Embeddings | 768 | Financial domain | Free |
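
The dimension column translates directly into storage: a dense float32 index needs 4 bytes per dimension per chunk. The sketch below works through that arithmetic; it ignores metadata, index structures, and any compression a particular vector store applies, and the one-million-chunk corpus size is an arbitrary illustration.

```python
def index_size_bytes(num_vectors: int, dims: int, bytes_per_value: int = 4) -> int:
    """Raw size of a dense index (float32 by default), excluding overhead."""
    return num_vectors * dims * bytes_per_value


# Dimensions taken from the table above; corpus size is illustrative
for name, dims in [
    ("text-embedding-3-large", 3072),
    ("text-embedding-3-small", 1536),
    ("768-dim local model", 768),
]:
    gib = index_size_bytes(1_000_000, dims) / 2**30
    print(f"{name}: {gib:.2f} GiB per million chunks")
```

Halving the dimension halves both the memory footprint and the cost of each brute-force similarity scan, which is one reason smaller models are attractive for large document collections.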
| Vector Store | Scalability | Features | Deployment |
|---|---|---|---|
| ChromaDB | Small-Medium | Easy setup, metadata | Local/Cloud |
| FAISS | Large | High performance | Local |
| Pinecone | Very Large | Managed, fast | Cloud |
| Weaviate | Large | GraphQL, hybrid search | Self-hosted/Cloud |
| Qdrant | Large | Fast, Rust-based | Self-hosted/Cloud |
Trading Applications
Real-Time News Analysis
RAG enables sophisticated news analysis for trading:
```python
# Real-time news analysis with RAG
from typing import List


class NewsRAGAnalyzer:
    """
    Analyze real-time news for trading signals using RAG.

    Expects self.retriever and self.llm to be configured; TradingSignal
    is the signal type used throughout this chapter.
    """

    def analyze_news(self, query: str, tickers: List[str]) -> "TradingSignal":
        """
        Analyze news and generate trading signals.

        Example:
            >>> analyzer = NewsRAGAnalyzer()
            >>> signal = analyzer.analyze_news(
            ...     "What's the market sentiment on TSLA today?",
            ...     tickers=["TSLA"]
            ... )
            >>> print(signal)
            TradingSignal(
                ticker="TSLA",
                direction="LONG",
                confidence=0.72,
                reasoning="Based on 3 recent news articles...",
                sources=["Reuters", "Bloomberg", "WSJ"]
            )
        """
        # Retrieve relevant documents, restricted to the requested tickers
        docs = self.retriever.search(
            query=query,
            filters={"ticker": {"$in": tickers}},
            top_k=10
        )

        # Generate analysis with LLM
        context = self._format_context(docs)
        prompt = self._build_prompt(query, context)

        response = self.llm.generate(prompt)

        # Turn the free-text response into a structured signal with sources
        return self._parse_signal(response, docs)
```

SEC Filing Analysis
Automated analysis of regulatory filings:
```python
# Example: 10-K Filing Analysis
filing_analysis = rag_analyzer.analyze(
    query="What are the main risk factors mentioned in Tesla's latest 10-K?",
    document_types=["10-K"],
    tickers=["TSLA"]
)

# Output:
"""
Based on Tesla's 2023 10-K filing, the main risk factors include:

1. **Production Capacity Risks** (Item 1A, Page 15)
   - Dependency on Gigafactory output
   - Supply chain constraints for batteries

2. **Regulatory Risks** (Item 1A, Page 18)
   - EV tax credit eligibility changes
   - Autonomous driving regulations

3. **Competition Risks** (Item 1A, Page 21)
   - Increasing EV competition from legacy automakers
   - Chinese EV manufacturers entering US market

4. **Key Person Risk** (Item 1A, Page 24)
   - Heavy reliance on Elon Musk

Sources: [SEC 10-K Filing dated 2024-01-29, pages 15-24]
"""
```

Earnings Call Intelligence
Extract insights from earnings calls:
```
EARNINGS CALL RAG ANALYSIS

Query: "What guidance did Apple provide for next quarter?"

Retrieved Context:

  [1] Apple Q4 2024 Earnings Call Transcript (Oct 31, 2024)
      "Looking ahead to Q1, we expect revenue between $118-122 billion,
       representing 5-8% year-over-year growth. Services should continue
       its strong momentum with double-digit growth expected."

  [2] Apple CFO Prepared Remarks
      "Gross margin guidance for Q1 is between 45% and 46%, consistent
       with our historical Q1 seasonality patterns."

  [3] Analyst Q&A Session
      Q: "Can you comment on iPhone demand in China?"
      A: "We're seeing healthy demand across all geographies. China
          continues to be our fastest-growing market for Services."

Generated Analysis:

  Apple Q1 FY2025 Guidance Summary:

  • Revenue: $118-122B (5-8% YoY growth)
  • Gross Margin: 45-46%
  • Services: Double-digit growth expected
  • Geographic: Strong China demand, especially in Services

  Trading Implication: NEUTRAL to MILDLY BULLISH
  - Guidance in-line with consensus ($120B)
  - Services growth provides margin tailwind
  - China commentary addresses key investor concern

  Confidence: 78%
  Sources: Q4 2024 Earnings Call Transcript, CFO Remarks, Q&A Session
```

Market Research Synthesis
Combine multiple research sources:
```python
# Multi-source research synthesis
synthesis = rag_system.synthesize(
    query="What's the consensus view on semiconductor stocks for 2024?",
    sources=[
        "analyst_reports",
        "earnings_calls",
        "news_articles",
        "industry_reports"
    ],
    tickers=["NVDA", "AMD", "INTC", "TSM"]
)

# Returns structured analysis combining all sources
```

Practical Examples
01: Document Retrieval System

```python
"""
Example 01: Building a Financial Document Retrieval System

This example demonstrates how to build a document retrieval system
for financial documents using embeddings and vector search.
"""

import numpy as np
from typing import List, Dict, Optional
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Document:
    """Financial document with metadata."""
    id: str
    text: str
    ticker: Optional[str]
    source: str
    date: datetime
    doc_type: str  # "news", "filing", "earnings", "research"


@dataclass
class SearchResult:
    """Search result with relevance score."""
    document: Document
    score: float
    highlights: List[str]


class FinancialDocumentRetriever:
    """
    Retrieval system for financial documents.

    Uses semantic search with embeddings to find relevant documents
    based on natural language queries.
    """

    def __init__(self, embedding_model: str = "sentence-transformers/all-MiniLM-L6-v2"):
        self.embedding_model = embedding_model
        self.documents: List[Document] = []
        self.embeddings: Optional[np.ndarray] = None
        self._encoder = None

    def _get_encoder(self):
        """Lazy load the embedding model."""
        if self._encoder is None:
            try:
                from sentence_transformers import SentenceTransformer
                self._encoder = SentenceTransformer(self.embedding_model)
            except ImportError as err:
                raise ImportError(
                    "sentence-transformers required. "
                    "Install with: pip install sentence-transformers"
                ) from err
        return self._encoder

    def add_documents(self, documents: List[Document]) -> None:
        """
        Add documents to the retrieval index.

        Args:
            documents: List of documents to index
        """
        if not documents:
            return  # nothing to embed or stack

        encoder = self._get_encoder()

        # Generate embeddings for new documents
        texts = [doc.text for doc in documents]
        new_embeddings = encoder.encode(texts, convert_to_numpy=True)

        # Add to index
        self.documents.extend(documents)

        if self.embeddings is None:
            self.embeddings = new_embeddings
        else:
            self.embeddings = np.vstack([self.embeddings, new_embeddings])

    def search(
        self,
        query: str,
        top_k: int = 5,
        ticker: Optional[str] = None,
        doc_type: Optional[str] = None,
        min_date: Optional[datetime] = None
    ) -> List[SearchResult]:
        """
        Search for relevant documents.

        Args:
            query: Natural language search query
            top_k: Number of results to return
            ticker: Filter by ticker symbol
            doc_type: Filter by document type
            min_date: Filter by minimum date

        Returns:
            List of search results with scores
        """
        if self.embeddings is None or len(self.documents) == 0:
            return []

        encoder = self._get_encoder()

        # Encode query
        query_embedding = encoder.encode([query], convert_to_numpy=True)[0]

        # Compute cosine similarity (epsilon guards against zero-norm vectors)
        similarities = np.dot(self.embeddings, query_embedding) / (
            np.linalg.norm(self.embeddings, axis=1)
            * np.linalg.norm(query_embedding)
            + 1e-12
        )

        # Apply filters
        mask = np.ones(len(self.documents), dtype=bool)

        for i, doc in enumerate(self.documents):
            if ticker and doc.ticker != ticker:
                mask[i] = False
            if doc_type and doc.doc_type != doc_type:
                mask[i] = False
            if min_date and doc.date < min_date:
                mask[i] = False

        # Set filtered documents to low score
        similarities[~mask] = -1

        # Get top-k results (highest similarity first)
        top_indices = np.argsort(similarities)[-top_k:][::-1]

        results = []
        for idx in top_indices:
            # Skips filtered documents and any with non-positive similarity
            if similarities[idx] > 0:
                results.append(SearchResult(
                    document=self.documents[idx],
                    score=float(similarities[idx]),
                    highlights=self._extract_highlights(
                        self.documents[idx].text,
                        query
                    )
                ))

        return results

    def _extract_highlights(self, text: str, query: str) -> List[str]:
        """Extract relevant text snippets."""
        # Simple keyword-based highlighting
        sentences = text.split('. ')
        query_words = set(query.lower().split())

        scored_sentences = []
        for sent in sentences:
            score = sum(1 for word in query_words if word in sent.lower())
            if score > 0:
                scored_sentences.append((score, sent))

        scored_sentences.sort(reverse=True)
        return [sent for _, sent in scored_sentences[:3]]
```

02: Trading Signal Generation

```python
"""
Example 02: RAG-based Trading Signal Generation

This example shows how to generate trading signals by combining
document retrieval with LLM-based analysis.
"""

import re
from datetime import datetime
from enum import Enum
from typing import List, Optional
from dataclasses import dataclass

# FinancialDocumentRetriever and SearchResult are defined in Example 01


class SignalDirection(Enum):
    LONG = "LONG"
    SHORT = "SHORT"
    NEUTRAL = "NEUTRAL"


@dataclass
class TradingSignal:
    """Generated trading signal with reasoning."""
    ticker: str
    direction: SignalDirection
    confidence: float  # 0-1
    reasoning: str
    sources: List[str]
    timestamp: datetime


class RAGTradingSignalGenerator:
    """
    Generate trading signals using RAG.

    Combines document retrieval with LLM analysis to produce
    actionable trading signals with explanations.
    """

    def __init__(
        self,
        retriever: FinancialDocumentRetriever,
        llm_client: Optional[object] = None
    ):
        self.retriever = retriever
        self.llm_client = llm_client

    def generate_signal(
        self,
        ticker: str,
        query: Optional[str] = None
    ) -> TradingSignal:
        """
        Generate a trading signal for a ticker.

        Args:
            ticker: Stock ticker symbol
            query: Optional custom query (default: general sentiment)

        Returns:
            Trading signal with reasoning and sources
        """
        # Default query if not provided
        if query is None:
            query = f"What is the current market sentiment and outlook for {ticker}?"

        # Retrieve relevant documents
        results = self.retriever.search(
            query=query,
            ticker=ticker,
            top_k=5
        )

        if not results:
            return TradingSignal(
                ticker=ticker,
                direction=SignalDirection.NEUTRAL,
                confidence=0.0,
                reasoning="No relevant documents found for analysis.",
                sources=[],
                timestamp=datetime.now()
            )

        # Build context from retrieved documents
        context = self._build_context(results)

        # Generate analysis (using LLM or rule-based)
        if self.llm_client:
            analysis = self._llm_analysis(ticker, query, context)
        else:
            analysis = self._rule_based_analysis(ticker, results)

        return analysis

    def _build_context(self, results: List[SearchResult]) -> str:
        """Build context string from search results."""
        context_parts = []

        for i, result in enumerate(results, 1):
            context_parts.append(
                f"[{i}] Source: {result.document.source} "
                f"({result.document.date.strftime('%Y-%m-%d')})\n"
                f"{result.document.text[:500]}..."
            )

        return "\n\n".join(context_parts)

    def _rule_based_analysis(
        self,
        ticker: str,
        results: List[SearchResult]
    ) -> TradingSignal:
        """
        Simple rule-based sentiment analysis.

        Used when LLM is not available. Keywords are matched as
        substrings, so "fall" also matches "fallen".
        """
        positive_keywords = [
            "beat", "exceeded", "growth", "upgrade", "bullish",
            "strong", "positive", "increase", "surge", "rally"
        ]
        negative_keywords = [
            "miss", "below", "decline", "downgrade", "bearish",
            "weak", "negative", "decrease", "drop", "fall"
        ]

        positive_count = 0
        negative_count = 0

        for result in results:
            text_lower = result.document.text.lower()
            positive_count += sum(
                1 for word in positive_keywords if word in text_lower
            )
            negative_count += sum(
                1 for word in negative_keywords if word in text_lower
            )

        total = positive_count + negative_count

        if total == 0 or positive_count == negative_count:
            # No keywords, or an exact tie: no directional evidence
            direction = SignalDirection.NEUTRAL
            confidence = 0.3
        elif positive_count > negative_count:
            direction = SignalDirection.LONG
            confidence = min(0.9, 0.5 + (positive_count - negative_count) / total * 0.4)
        else:
            direction = SignalDirection.SHORT
            confidence = min(0.9, 0.5 + (negative_count - positive_count) / total * 0.4)

        # Sorted for a deterministic source order
        sources = sorted(set(r.document.source for r in results))

        return TradingSignal(
            ticker=ticker,
            direction=direction,
            confidence=confidence,
            reasoning=(
                f"Based on {len(results)} documents: "
                f"{positive_count} positive signals, {negative_count} negative signals."
            ),
            sources=sources,
            timestamp=datetime.now()
        )

    def _llm_analysis(
        self,
        ticker: str,
        query: str,
        context: str
    ) -> TradingSignal:
        """Generate analysis using LLM."""
        prompt = f"""Analyze the following financial documents and generate a trading signal.

Ticker: {ticker}
Query: {query}

Retrieved Documents:
{context}

Based on the above documents, provide:
1. Trading direction (LONG, SHORT, or NEUTRAL)
2. Confidence level (0-100%)
3. Brief reasoning (2-3 sentences)

Format your response as:
DIRECTION: [direction]
CONFIDENCE: [percentage]
REASONING: [your analysis]
"""

        # Assumes llm_client.generate(prompt) returns the response text
        response = self.llm_client.generate(prompt)
        return self._parse_llm_response(ticker, response, context)

    def _parse_llm_response(
        self,
        ticker: str,
        response: str,
        context: str
    ) -> TradingSignal:
        """
        Parse the DIRECTION / CONFIDENCE / REASONING format requested above.

        Minimal sketch of a parser: fields that cannot be parsed fall
        back to NEUTRAL and a confidence of 0.0.
        """
        direction_match = re.search(
            r"DIRECTION:\s*(LONG|SHORT|NEUTRAL)", response, re.IGNORECASE
        )
        confidence_match = re.search(r"CONFIDENCE:\s*(\d+(?:\.\d+)?)", response)
        reasoning_match = re.search(r"REASONING:\s*(.+)", response, re.DOTALL)

        direction = (
            SignalDirection(direction_match.group(1).upper())
            if direction_match else SignalDirection.NEUTRAL
        )
        # The prompt asks for a percentage; TradingSignal stores 0-1
        confidence = (
            min(1.0, float(confidence_match.group(1)) / 100.0)
            if confidence_match else 0.0
        )
        reasoning = (
            reasoning_match.group(1).strip() if reasoning_match else response.strip()
        )
        # _build_context writes each source as "[i] Source: <name> (<date>)"
        sources = sorted(set(re.findall(r"Source: (.+?) \(", context)))

        return TradingSignal(
            ticker=ticker,
            direction=direction,
            confidence=confidence,
            reasoning=reasoning,
            sources=sources,
            timestamp=datetime.now()
        )
```

03: Portfolio Analysis with RAG

```python
"""
Example 03: Portfolio-Level RAG Analysis

Analyze entire portfolios using RAG for holistic insights.
"""

from dataclasses import dataclass
from typing import Dict, List

# FinancialDocumentRetriever comes from Example 01;
# RAGTradingSignalGenerator, TradingSignal and SignalDirection from Example 02


@dataclass
class PortfolioPosition:
    """Single portfolio position."""
    ticker: str
    shares: float
    entry_price: float
    current_price: float


@dataclass
class PortfolioAnalysis:
    """Complete portfolio analysis."""
    total_value: float
    risk_assessment: str
    sector_exposure: Dict[str, float]
    key_risks: List[str]
    opportunities: List[str]
    recommended_actions: List[str]
    sources_used: int


class PortfolioRAGAnalyzer:
    """
    Analyze portfolio using RAG for comprehensive insights.
    """

    def __init__(
        self,
        retriever: FinancialDocumentRetriever,
        signal_generator: RAGTradingSignalGenerator
    ):
        self.retriever = retriever
        self.signal_generator = signal_generator

    def analyze_portfolio(
        self,
        positions: List[PortfolioPosition]
    ) -> PortfolioAnalysis:
        """
        Perform comprehensive portfolio analysis.

        Args:
            positions: List of portfolio positions

        Returns:
            Complete portfolio analysis with recommendations
        """
        # Calculate basic metrics
        total_value = sum(
            pos.shares * pos.current_price
            for pos in positions
        )

        # Analyze each position
        position_signals = {}
        all_risks = []
        all_opportunities = []
        sources_count = 0

        for position in positions:
            # Get signal for each position
            signal = self.signal_generator.generate_signal(position.ticker)
            position_signals[position.ticker] = signal
            sources_count += len(signal.sources)

            # Retrieve risk-specific documents
            risk_results = self.retriever.search(
                query=f"risks and challenges for {position.ticker}",
                ticker=position.ticker,
                top_k=3
            )

            for result in risk_results:
                all_risks.append({
                    "ticker": position.ticker,
                    "risk": result.highlights[0] if result.highlights else result.document.text[:100]
                })

            # Retrieve opportunity documents
            opp_results = self.retriever.search(
                query=f"growth opportunities and catalysts for {position.ticker}",
                ticker=position.ticker,
                top_k=3
            )

            for result in opp_results:
                all_opportunities.append({
                    "ticker": position.ticker,
                    "opportunity": result.highlights[0] if result.highlights else result.document.text[:100]
                })

        # Generate recommendations
        recommendations = self._generate_recommendations(
            positions, position_signals
        )

        # Assess overall risk
        risk_assessment = self._assess_risk(position_signals)

        return PortfolioAnalysis(
            total_value=total_value,
            risk_assessment=risk_assessment,
            sector_exposure=self._calculate_sector_exposure(positions),
            key_risks=[r["risk"] for r in all_risks[:5]],
            opportunities=[o["opportunity"] for o in all_opportunities[:5]],
            recommended_actions=recommendations,
            sources_used=sources_count
        )

    def _generate_recommendations(
        self,
        positions: List[PortfolioPosition],
        signals: Dict[str, TradingSignal]
    ) -> List[str]:
        """Generate actionable recommendations."""
        recommendations = []

        for position in positions:
            signal = signals.get(position.ticker)
            if not signal:
                continue

            # Unrealized profit or loss as a fraction of the entry price
            pnl_pct = (position.current_price - position.entry_price) / position.entry_price

            if signal.direction == SignalDirection.SHORT and signal.confidence > 0.7:
                recommendations.append(
                    f"Consider reducing {position.ticker} position "
                    f"(Signal: {signal.direction.value}, Confidence: {signal.confidence:.0%})"
                )
            elif signal.direction == SignalDirection.LONG and signal.confidence > 0.7:
                if pnl_pct < 0:
                    recommendations.append(
                        f"Consider averaging down on {position.ticker} "
                        f"(Signal: {signal.direction.value}, Confidence: {signal.confidence:.0%})"
                    )
                else:
                    recommendations.append(
                        f"Hold {position.ticker}, bullish outlook "
                        f"(Signal: {signal.direction.value}, Confidence: {signal.confidence:.0%})"
                    )

        return recommendations

    def _assess_risk(self, signals: Dict[str, TradingSignal]) -> str:
        """Assess overall portfolio risk."""
        bearish_count = sum(
            1 for s in signals.values()
            if s.direction == SignalDirection.SHORT
        )
        bullish_count = sum(
            1 for s in signals.values()
            if s.direction == SignalDirection.LONG
        )

        if bearish_count > bullish_count:
            return "HIGH - Multiple positions showing bearish signals"
        elif bearish_count == bullish_count:
            return "MODERATE - Mixed signals across positions"
        else:
            return "LOW - Majority of positions showing bullish signals"

    def _calculate_sector_exposure(
        self,
        positions: List[PortfolioPosition]
    ) -> Dict[str, float]:
        """Calculate sector exposure (simplified)."""
        # Placeholder values: in production, would use actual sector mappings
        return {"Technology": 0.4, "Healthcare": 0.3, "Finance": 0.3}
```

04: Backtesting RAG Signals

```python
"""
Example 04: Backtesting RAG-Generated Trading Signals

Backtest trading strategies based on RAG-generated signals.
"""

from dataclasses import dataclass
from datetime import datetime
from typing import List

import numpy as np
import pandas as pd

# RAGTradingSignalGenerator and SignalDirection are defined in Example 02


@dataclass
class BacktestResult:
    """Results from backtesting."""
    total_return: float
    sharpe_ratio: float
    max_drawdown: float
    win_rate: float
    total_trades: int
    avg_trade_return: float


class RAGBacktester:
    """
    Backtest RAG-based trading strategies.
    """

    def __init__(
        self,
        signal_generator: RAGTradingSignalGenerator,
        initial_capital: float = 100000.0
    ):
        self.signal_generator = signal_generator
        self.initial_capital = initial_capital

    def backtest(
        self,
        ticker: str,
        price_data: pd.DataFrame,
        signal_dates: List[datetime]
    ) -> BacktestResult:
        """
        Backtest a RAG-based strategy.

        Args:
            ticker: Stock ticker
            price_data: OHLCV DataFrame with DatetimeIndex
            signal_dates: Dates to generate signals

        Returns:
            Backtest results with metrics
        """
        capital = self.initial_capital
        position = 0  # Number of shares
        trades = []
        equity_curve = [capital]

        for date in signal_dates:
            if date not in price_data.index:
                continue

            price = price_data.loc[date, 'close']

            # Generate signal for this date.
            # NOTE: generate_signal takes no as-of date, so retrieval sees
            # every indexed document, including ones published after `date`.
            # Index only documents dated on or before `date` to avoid
            # look-ahead bias.
            signal = self.signal_generator.generate_signal(ticker)

            # Execute trades based on signal
            if signal.direction == SignalDirection.LONG and position == 0:
                # Buy
                shares_to_buy = int(capital * 0.95 / price)  # 95% of capital
                if shares_to_buy > 0:
                    position = shares_to_buy
                    capital -= shares_to_buy * price
                    trades.append({
                        "date": date,
                        "action": "BUY",
                        "shares": shares_to_buy,
                        "price": price,
                        "confidence": signal.confidence
                    })

            elif signal.direction == SignalDirection.SHORT and position > 0:
                # Sell
                capital += position * price
                trades.append({
                    "date": date,
                    "action": "SELL",
                    "shares": position,
                    "price": price,
                    "confidence": signal.confidence
                })
                position = 0

            # Update equity
            equity = capital + position * price
            equity_curve.append(equity)

        # Close any remaining position at the last available close and
        # record it, so the final round trip counts toward the win rate
        if position > 0 and len(price_data) > 0:
            final_price = price_data.iloc[-1]['close']
            capital += position * final_price
            trades.append({
                "date": price_data.index[-1],
                "action": "SELL",
                "shares": position,
                "price": final_price,
                "confidence": None
            })
            position = 0
            equity_curve.append(capital)

        # Calculate metrics
        total_return = (capital - self.initial_capital) / self.initial_capital

        equity_series = pd.Series(equity_curve)
        returns = equity_series.pct_change().dropna()

        # sqrt(252) annualization assumes one equity point per trading day
        sharpe_ratio = (
            returns.mean() / returns.std() * np.sqrt(252)
            if returns.std() > 0 else 0
        )

        rolling_max = equity_series.cummax()
        drawdown = (equity_series - rolling_max) / rolling_max
        max_drawdown = drawdown.min()

        # Calculate win rate (buys and sells alternate, so zip pairs them)
        buy_prices = [t["price"] for t in trades if t["action"] == "BUY"]
        sell_prices = [t["price"] for t in trades if t["action"] == "SELL"]

        wins = sum(
            1 for b, s in zip(buy_prices, sell_prices)
            if s > b
        )
        total_completed_trades = min(len(buy_prices), len(sell_prices))
        win_rate = wins / total_completed_trades if total_completed_trades > 0 else 0

        trade_returns = [
            (s - b) / b
            for b, s in zip(buy_prices, sell_prices)
        ]
        avg_trade_return = np.mean(trade_returns) if trade_returns else 0

        return BacktestResult(
            total_return=total_return,
            sharpe_ratio=sharpe_ratio,
            max_drawdown=max_drawdown,
            win_rate=win_rate,
            total_trades=len(trades),  # counts individual buy and sell fills
            avg_trade_return=avg_trade_return
        )
```

Rust Implementation
See the rust_rag_trading/ directory for the complete Rust implementation featuring:
- Async/await support with Tokio for high-performance I/O
- Vector similarity search using efficient SIMD operations
- Document processing with chunking and metadata extraction
- Bybit API integration for cryptocurrency data
- Yahoo Finance data loading for stock market data
```rust
// Example usage of Rust RAG implementation
use rag_trading::{
    DocumentRetriever, RAGSignalGenerator, Document,
    BybitDataLoader, YahooFinanceLoader
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize retriever
    let mut retriever = DocumentRetriever::new()?;

    // Add documents
    retriever.add_documents(vec![
        Document::new(
            "Tesla reported record Q4 deliveries...",
            Some("TSLA"),
            "news",
            chrono::Utc::now()
        ),
    ]).await?;

    // Generate signal
    let generator = RAGSignalGenerator::new(retriever);
    let signal = generator.generate_signal("TSLA").await?;

    println!("Signal: {:?}", signal);

    // Load market data for backtesting
    let yahoo = YahooFinanceLoader::new();
    let tsla_data = yahoo.load("TSLA", "1y").await?;

    let bybit = BybitDataLoader::new(false);
    let btc_data = bybit.load("BTCUSDT", 30).await?;

    Ok(())
}
```

Python Implementation
See the python/ directory for the complete Python implementation including:
- retriever.py: Document retrieval with semantic search
- signals.py: Trading signal generation
- backtest.py: Backtesting framework
- data_loader.py: Yahoo Finance and Bybit data loading
- examples/: Demo scripts
```python
# Example usage
from datetime import datetime

# Assumes Document is exported by the package alongside the retriever
from rag_trading import (
    Document,
    FinancialDocumentRetriever,
    RAGTradingSignalGenerator,
    RAGBacktester,
    DataLoader
)

# Initialize components
retriever = FinancialDocumentRetriever()
generator = RAGTradingSignalGenerator(retriever)

# Add documents
retriever.add_documents([
    Document(
        id="doc_001",
        text="Tesla reported record Q4 deliveries of 484,507 vehicles...",
        ticker="TSLA",
        source="Reuters",
        date=datetime.now(),
        doc_type="news"
    )
])

# Generate signal
signal = generator.generate_signal("TSLA")
print(f"Signal: {signal.direction.value}, Confidence: {signal.confidence:.0%}")

# Load market data
loader = DataLoader()
tsla_data = loader.load("TSLA", source="yahoo", period="1y")
btc_data = loader.load("BTCUSDT", source="bybit", days=30)

# Backtest
# Assumption: generate one signal per trading day in the loaded data
signal_dates = list(tsla_data.ohlcv.index)
backtester = RAGBacktester(generator)
results = backtester.backtest("TSLA", tsla_data.ohlcv, signal_dates)
print(f"Total Return: {results.total_return:.2%}")
```

Best Practices
1. Document Quality
```
DOCUMENT QUALITY CHECKLIST:
✓ Remove boilerplate (disclaimers, legal text)
✓ Normalize dates to consistent format
✓ Extract and validate ticker symbols
✓ Remove duplicate or near-duplicate documents
✓ Verify source reliability
✓ Tag with document type and date
```

2. Chunking Strategy
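
Chunk size and overlap interact in a simple way: each chunk starts `chunk_size - overlap` units after the previous one, so the overlap is repeated text that keeps sentences spanning a boundary retrievable from either side. The sketch below is one possible implementation and makes an assumption the chapter does not state: sizes are counted in whitespace-separated words, whereas a production pipeline would more likely count model tokens. `chunk_words` is a hypothetical helper name.

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 512, overlap: int = 50) -> List[str]:
    """Split text into chunks of at most `chunk_size` words, where each
    chunk repeats the last `overlap` words of the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # distance between consecutive chunk starts
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


# Ten words, chunks of four with one word of overlap
sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words(sample, chunk_size=4, overlap=1):
    print(chunk)
# prints:
# w0 w1 w2 w3
# w3 w4 w5 w6
# w6 w7 w8 w9
```

Structure-aware strategies (by paragraph, filing section, or speaker turn) replace the fixed word window with natural boundaries, then fall back to a size limit like this one for units that are too long.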
```python
# Recommended chunking strategies for financial documents
CHUNKING_STRATEGIES = {
    "news": {
        "chunk_size": 512,
        "overlap": 50,
        "strategy": "paragraph"  # Split by paragraphs
    },
    "10-K": {
        "chunk_size": 1024,
        "overlap": 100,
        "strategy": "section"  # Split by SEC sections
    },
    "earnings_call": {
        "chunk_size": 768,
        "overlap": 75,
        "strategy": "speaker"  # Split by speaker turns
    }
}
```

3. Retrieval Optimization
- Use hybrid search (semantic + keyword) for better results
- Apply metadata filtering (date, ticker) before semantic search
- Implement re-ranking for top results
- Cache frequent queries
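
One common way to combine semantic and keyword results is reciprocal rank fusion (RRF), which needs only the two ranked lists and no score calibration. The chapter does not prescribe a fusion method, so treat this as one option; the document IDs below are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1; k = 60 is the value commonly used for RRF.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


semantic = ["doc_a", "doc_b", "doc_c"]  # hypothetical embedding-search order
keyword = ["doc_c", "doc_a", "doc_d"]   # hypothetical keyword-search order
print(reciprocal_rank_fusion([semantic, keyword]))
# prints ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents that appear in both lists rise to the top, which is useful for ticker symbols and exact figures that embeddings alone can miss.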
4. Signal Generation
- Always provide source attribution
- Include confidence scores
- Log all signal generations for analysis
- Implement position sizing based on confidence
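
Confidence-based position sizing can be as simple as a linear ramp. The sketch below is illustrative only: the 0.6 confidence floor and the 10% cap are hypothetical parameters, not recommendations from this chapter, and `position_fraction` is an invented name.

```python
def position_fraction(
    confidence: float,
    min_confidence: float = 0.6,
    max_fraction: float = 0.10
) -> float:
    """Fraction of capital to allocate to a signal.

    Returns 0 below min_confidence, then scales linearly up to
    max_fraction at a confidence of 1.0.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if not 0.0 <= min_confidence < 1.0:
        raise ValueError("min_confidence must be in [0, 1)")
    if confidence < min_confidence:
        return 0.0
    scale = (confidence - min_confidence) / (1.0 - min_confidence)
    return max_fraction * scale


for conf in (0.50, 0.72, 1.00):
    print(f"confidence {conf:.2f} -> allocate {position_fraction(conf):.1%} of capital")
```

Whatever the mapping, model-reported confidence is only meaningful for sizing once it has been checked against realized outcomes in the signal log.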
5. Backtesting
- Use out-of-sample data for validation
- Avoid look-ahead bias: at each signal date, retrieve only documents published before that date
- Include transaction costs
- Test across different market regimes
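
Two of these points are easy to get wrong in code, so here is a minimal sketch of each: a point-in-time filter that hides documents published at or after the signal time, and a per-trade return net of costs. `TimedDoc`, `visible_as_of`, and `net_return` are hypothetical names, and the 10 basis points per side is an arbitrary placeholder.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class TimedDoc:
    """Document with the time it became publicly available."""
    text: str
    published_at: datetime


def visible_as_of(docs: List[TimedDoc], as_of: datetime) -> List[TimedDoc]:
    """Keep only documents published strictly before the signal time."""
    return [d for d in docs if d.published_at < as_of]


def net_return(entry: float, exit_price: float, cost_bps: float = 10.0) -> float:
    """Round-trip return after paying cost_bps (basis points) on each side."""
    cost = cost_bps / 10_000
    return (exit_price * (1 - cost)) / (entry * (1 + cost)) - 1


docs = [
    TimedDoc("Pre-announcement preview", datetime(2024, 1, 10)),
    TimedDoc("Results published", datetime(2024, 1, 20)),
]
# A signal generated on Jan 15 must not see the Jan 20 document
print([d.text for d in visible_as_of(docs, datetime(2024, 1, 15))])
# prints ['Pre-announcement preview']

# A flat trade loses money once costs are included
print(net_return(100.0, 100.0) < 0)
# prints True
```

Use the time a document became available, not the period it describes: a filing about Q4 results is only visible from its filing date.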
Resources
Papers
- Retrieval-Augmented Generation for Large Language Models: A Survey
  - arXiv: 2312.10997
  - Comprehensive overview of RAG techniques
- REALM: Retrieval-Augmented Language Model Pre-Training
  - arXiv: 2002.08909
  - Foundation paper for retrieval-augmented LMs
- FinGPT: Open-Source Financial Large Language Models
  - arXiv: 2306.06031
  - Open-source financial LLM with RAG capabilities
Tools & Libraries
| Tool | Purpose | Link |
|---|---|---|
| LangChain | RAG framework | langchain.com |
| LlamaIndex | Document indexing | llamaindex.ai |
| ChromaDB | Vector store | trychroma.com |
| Sentence-Transformers | Embeddings | sbert.net |
| yfinance | Stock data | pypi.org/project/yfinance |
Data Sources
- SEC EDGAR: Free SEC filings (sec.gov/edgar)
- Yahoo Finance: Stock data via yfinance
- Bybit API: Cryptocurrency data
- Alpha Vantage: News and market data
- Polygon.io: Real-time market data
Summary
RAG for trading combines the power of LLMs with real-time information retrieval to create intelligent trading systems that:
- Stay Current: Access real-time news and filings
- Ground Responses: Base analysis on actual documents
- Provide Transparency: Cite sources for all claims
- Scale Efficiently: Update knowledge without retraining
Key takeaways:
- Choose embedding models appropriate for the financial domain
- Implement proper document processing pipelines
- Use hybrid retrieval for best results
- Always backtest strategies before deployment
- Monitor and log all signals for continuous improvement
The code examples in this chapter provide a foundation for building production-grade RAG systems for trading applications.