Chapter 365: Foundation Models for Algorithmic Trading
Chapter 365: Foundation Models for Algorithmic Trading
Overview
Foundation models represent the culmination of a decade-long evolution in machine learning for financial markets — from hand-crafted features and linear models through deep learning architectures to large-scale pretrained models capable of zero-shot and few-shot generalization across diverse forecasting tasks. This final chapter of the book synthesizes the key themes explored throughout preceding chapters: representation learning, temporal modeling, domain adaptation, uncertainty quantification, and multi-modal reasoning — all converging in foundation models that promise to transform algorithmic trading from a bespoke engineering discipline into a transfer learning paradigm.
The foundation model landscape for time series and trading encompasses several complementary approaches. Dedicated time series foundation models — including Google’s TimesFM, Salesforce’s Moirai, Amazon’s Chronos, and the open-source Lag-Llama — achieve remarkable zero-shot forecasting performance by pretraining on vast collections of time series from diverse domains. Simultaneously, large language models (LLMs) augmented with financial domain knowledge have demonstrated emergent capabilities in market analysis, news interpretation, and even direct trading decision-making through chain-of-thought reasoning. The convergence of these approaches — multi-modal foundation models that jointly process numerical time series, textual market intelligence, and structured financial data — represents the frontier of research in this field.
This chapter provides a comprehensive treatment of foundation models for algorithmic trading, covering the architectural innovations that enable zero-shot time series forecasting, the fine-tuning strategies that adapt general-purpose models to crypto market specifics, and the agent-based frameworks that leverage LLM reasoning for autonomous trading. We implement complete systems in Python and Rust using the Bybit API, demonstrate practical performance across multiple cryptocurrency pairs, and critically evaluate the promises and limitations of foundation model approaches in the context of efficient market dynamics. As the concluding chapter, we also reflect on the broader trajectory of the field and outline the most promising directions for future research.
Table of Contents
- Introduction to Foundation Models for Trading
- Mathematical Foundations of Time Series Foundation Models
- Model Comparison: Foundation Models vs Specialized Architectures
- Trading Applications of Foundation Models
- Python Implementation
- Rust Implementation
- Practical Examples
- Backtesting Framework
- Performance Evaluation
- Future Directions and References
1. Introduction to Foundation Models for Trading
The concept of foundation models — large-scale models pretrained on broad data and adaptable to a wide range of downstream tasks — was formalized by Bommasani et al. (2021) at Stanford’s Center for Research on Foundation Models (CRFM). In natural language processing, models like GPT-4, Claude, and Llama demonstrated that pretraining on internet-scale text corpora produces representations that transfer effectively to virtually any language task. The analogous question for financial markets is: can we build foundation models that learn universal patterns of temporal dynamics, enabling zero-shot or few-shot forecasting across assets, frequencies, and market regimes?
The answer, emerging from research in 2023-2025, is nuanced. Time series foundation models achieve strong zero-shot performance on standardized benchmarks, often matching or exceeding supervised baselines trained on target-specific data. However, financial time series present unique challenges: low signal-to-noise ratios, non-stationarity, adversarial dynamics (markets adapt to predictive strategies), and regime dependence that violates the assumption of distributional similarity between pretraining and deployment domains.
Key Foundation Model Categories
Time Series Foundation Models:
- TimesFM (Google, 2024): Decoder-only Transformer pretrained on 100B time points from Google Trends, Wiki pageviews, and synthetic data. Supports variable-length context and arbitrary prediction horizons.
- Moirai (Salesforce, 2024): Any-variate, any-frequency foundation model using mixture distributions for probabilistic forecasting. Pretrained on LOTSA (Large-scale Open Time Series Archive) with 27B observations.
- Chronos (Amazon, 2024): Tokenizes time series values into discrete bins and frames forecasting as language modeling. Based on T5 architecture with bin-wise cross-entropy training.
- Lag-Llama (2024): Probabilistic foundation model using lagged features and a Llama-inspired architecture with distributional heads for uncertainty quantification.
LLM-Based Trading Agents:
- FinGPT: Open-source financial LLM for sentiment analysis, report generation, and trading signal extraction.
- Trading Agents: LLM-orchestrated systems that combine market data analysis, news interpretation, and portfolio optimization through chain-of-thought reasoning and tool use.
- Multi-Modal Models: Systems that jointly process price charts (vision), market text (language), and numerical features (time series) for holistic market understanding.
2. Mathematical Foundations of Time Series Foundation Models
2.1 Patched Time Series Tokenization
TimesFM and similar models convert continuous time series into discrete patch tokens:
$$\mathbf{p}i = \text{Linear}(\mathbf{x}{i \cdot P : (i+1) \cdot P}) \in \mathbb{R}^d$$
where $P$ is the patch length, $d$ is the model dimension, and $\mathbf{x}_{a:b}$ denotes the time series segment from index $a$ to $b$. The sequence of patches $[\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_N]$ is then processed by a Transformer.
2.2 Decoder-Only Autoregressive Forecasting
The forecasting objective follows the standard language modeling formulation:
$$\mathcal{L} = -\sum_{i=1}^{N} \log p_\theta(\mathbf{p}i | \mathbf{p}{<i})$$
For point forecasting, the output head predicts the next patch:
$$\hat{\mathbf{p}}_{N+1} = \text{OutputHead}(\mathbf{h}_N)$$
where $\mathbf{h}_N = \text{Transformer}(\mathbf{p}_1, \ldots, \mathbf{p}_N)$ is the final hidden state.
2.3 Probabilistic Forecasting with Mixture Distributions
Moirai and Lag-Llama use mixture distribution heads for uncertainty quantification:
$$p(y_{t+h} | \mathbf{x}{1:t}) = \sum{k=1}^{K} \pi_k \cdot \mathcal{D}_k(\mu_k, \sigma_k)$$
where $\pi_k$ are mixture weights, and $\mathcal{D}_k$ can be Gaussian, Student-t, or other distributions. Parameters are predicted from the Transformer output:
$$[\boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\sigma}] = \text{MixtureHead}(\mathbf{h}_t)$$
2.4 Chronos Bin-Based Tokenization
Chronos discretizes time series values into $B$ bins using quantile-based boundaries:
$$b(x) = \arg\min_{k \in {1,\ldots,B}} |x - c_k|$$
where $c_k$ are bin centers. The forecasting loss becomes cross-entropy over bin probabilities:
$$\mathcal{L}{\text{Chronos}} = -\sum{t} \log p_\theta(b(x_{t+1}) | b(x_1), \ldots, b(x_t))$$
This approach enables the use of standard language model architectures (T5) without modification.
2.5 Transfer Learning and Domain Adaptation
The fine-tuning objective for crypto markets combines the pretrained model’s general knowledge with domain-specific data:
$$\mathcal{L}{\text{finetune}} = \mathcal{L}{\text{forecast}}(\theta; \mathcal{D}_{\text{crypto}}) + \lambda |\theta - \theta_0|^2$$
where $\theta_0$ are the pretrained weights and $\lambda$ controls regularization strength (preventing catastrophic forgetting, as discussed in Chapter 284). Low-rank adaptation (LoRA) provides parameter-efficient fine-tuning:
$$W = W_0 + \alpha \cdot BA$$
where $B \in \mathbb{R}^{d \times r}$, $A \in \mathbb{R}^{r \times d}$, and $r \ll d$ is the rank.
2.6 LLM Trading Agent Framework
The agent-based trading framework decomposes trading into a reasoning chain:
$$\text{Action} = \text{LLM}(\text{Prompt}(\text{MarketData}, \text{News}, \text{Portfolio}, \text{Strategy}))$$
With tool use, the agent can execute:
- Observe: Query market data APIs (Bybit) for prices, order books, funding rates
- Analyze: Process observations through chain-of-thought reasoning
- Decide: Generate trading actions with confidence scores
- Execute: Place orders through exchange APIs
- Reflect: Evaluate outcomes and update strategy
3. Model Comparison: Foundation Models vs Specialized Architectures
| Property | TimesFM | Moirai | Chronos | Lag-Llama | LSTM | LNN (Ch. 363) |
|---|---|---|---|---|---|---|
| Parameters | 200M | 311M | 710M (large) | 7M | 50K-500K | 5K-50K |
| Pretraining data | 100B pts | 27B pts | 84B pts | 7M series | N/A | N/A |
| Zero-shot | Yes | Yes | Yes | Yes | No | No |
| Probabilistic | No | Yes | Via bins | Yes | No | No |
| Multi-variate | No | Yes | No | No | Yes | Yes |
| Fine-tuning | LoRA | Full/LoRA | Full | LoRA | Full | Full |
| Inference (ms) | ~50 | ~80 | ~120 | ~30 | ~1 | ~0.3 |
| Context length | 512 patches | 5000 | 512 tokens | 32 lags | 256 | 64 |
| Regime adaptivity | Implicit | Implicit | Implicit | Implicit | Learned | Dynamic tau |
| Crypto-specific | No | No | No | No | Trainable | Trainable |
4. Trading Applications of Foundation Models
4.1 Zero-Shot Crypto Price Forecasting
Foundation models can generate price forecasts for any cryptocurrency pair without training on that specific asset. By leveraging patterns learned across millions of diverse time series during pretraining, models like TimesFM and Chronos capture universal temporal dynamics — mean reversion, momentum, volatility clustering — that transfer to crypto markets. This enables rapid deployment on newly listed tokens with no historical data available for model training.
4.2 Probabilistic Risk Assessment
Moirai and Lag-Llama’s mixture distribution outputs provide native uncertainty quantification, generating full predictive distributions rather than point forecasts. These distributions enable direct computation of Value-at-Risk (VaR), Expected Shortfall (ES), and prediction intervals without the conformal calibration techniques discussed in Chapter 330. The probabilistic outputs naturally support risk-aware position sizing.
4.3 Multi-Modal Market Analysis
Combining time series foundation models with LLMs enables multi-modal analysis that processes numerical price data alongside textual information (social media sentiment, regulatory announcements, protocol upgrades). This integration captures information channels that purely numerical models miss, particularly for event-driven crypto markets where governance proposals, hacks, or regulatory changes drive price action.
4.4 LLM-Orchestrated Trading Agents
Large language models serve as the reasoning backbone for autonomous trading agents that decompose complex market scenarios into structured analysis pipelines. These agents combine real-time data retrieval from Bybit, technical analysis computation, fundamental research, and portfolio optimization into coherent trading plans articulated through chain-of-thought reasoning.
4.5 Cross-Asset Transfer and Few-Shot Adaptation
Foundation models pretrained on diverse time series enable few-shot adaptation to new crypto assets or market regimes. With as few as 100-500 data points, fine-tuned foundation models match or exceed fully supervised baselines trained on thousands of samples, dramatically reducing the data requirements for deploying strategies on new instruments.
5. Python Implementation
5.1 Bybit Data Pipeline for Foundation Models
import numpy as npimport pandas as pdimport requestsimport torchimport torch.nn as nnfrom typing import List, Dict, Tuple, Optionalfrom dataclasses import dataclass, fieldfrom abc import ABC, abstractmethod
@dataclassclass FoundationModelConfig: """Configuration for foundation model trading system.""" symbols: List[str] = field(default_factory=lambda: ["BTCUSDT", "ETHUSDT", "SOLUSDT"]) interval: str = "60" context_length: int = 512 prediction_horizon: int = 24 base_url: str = "https://api.bybit.com" model_type: str = "timesfm"
class BybitFoundationDataCollector: """Collects market data from Bybit for foundation model inference."""
def __init__(self, config: FoundationModelConfig): self.config = config self.session = requests.Session()
def fetch_klines(self, symbol: str, interval: str = None, limit: int = 1000) -> pd.DataFrame: """Fetch kline data from Bybit API v5.""" endpoint = f"{self.config.base_url}/v5/market/kline" params = { "category": "linear", "symbol": symbol, "interval": interval or self.config.interval, "limit": min(limit, 1000) }
response = self.session.get(endpoint, params=params) data = response.json()
if data["retCode"] != 0: raise ValueError(f"Bybit API error: {data['retMsg']}")
rows = data["result"]["list"] df = pd.DataFrame(rows, columns=[ "timestamp", "open", "high", "low", "close", "volume", "turnover" ]) for col in ["open", "high", "low", "close", "volume", "turnover"]: df[col] = df[col].astype(float) df["timestamp"] = pd.to_datetime(df["timestamp"].astype(int), unit="ms") df = df.sort_values("timestamp").reset_index(drop=True) return df
def fetch_funding_rate(self, symbol: str) -> Dict: """Fetch current funding rate from Bybit.""" endpoint = f"{self.config.base_url}/v5/market/funding/history" params = {"category": "linear", "symbol": symbol, "limit": "1"} response = self.session.get(endpoint, params=params) data = response.json()
if data["retCode"] == 0 and data["result"]["list"]: entry = data["result"]["list"][0] return { "funding_rate": float(entry["fundingRate"]), "timestamp": int(entry["fundingRateTimestamp"]) } return {"funding_rate": 0.0, "timestamp": 0}
def fetch_orderbook_snapshot(self, symbol: str, depth: int = 25) -> Dict: """Fetch orderbook snapshot for market microstructure features.""" endpoint = f"{self.config.base_url}/v5/market/orderbook" params = {"category": "linear", "symbol": symbol, "limit": str(depth)} response = self.session.get(endpoint, params=params) data = response.json()
if data["retCode"] == 0: result = data["result"] bids = [(float(p), float(q)) for p, q in result["b"]] asks = [(float(p), float(q)) for p, q in result["a"]]
bid_volume = sum(q for _, q in bids) ask_volume = sum(q for _, q in asks) imbalance = (bid_volume - ask_volume) / (bid_volume + ask_volume + 1e-8) spread = asks[0][0] - bids[0][0] if bids and asks else 0.0
return { "bid_volume": bid_volume, "ask_volume": ask_volume, "imbalance": imbalance, "spread": spread, "mid_price": (bids[0][0] + asks[0][0]) / 2 if bids and asks else 0.0 } return {}
def prepare_context_window(self, df: pd.DataFrame, column: str = "close") -> np.ndarray: """Prepare context window for foundation model input.""" values = df[column].values[-self.config.context_length:]
# Normalize to zero mean, unit variance mean = np.mean(values) std = np.std(values) if std < 1e-8: std = 1.0 normalized = (values - mean) / std
return normalized, mean, std5.2 Foundation Model Wrappers
class BaseFoundationModel(ABC): """Abstract base class for time series foundation models."""
@abstractmethod def forecast(self, context: np.ndarray, horizon: int) -> Dict: """Generate forecast from context window.""" pass
@abstractmethod def forecast_probabilistic(self, context: np.ndarray, horizon: int) -> Dict: """Generate probabilistic forecast with uncertainty.""" pass
class TimesFMWrapper(BaseFoundationModel): """Wrapper for Google's TimesFM foundation model."""
def __init__(self, model_path: str = "google/timesfm-1.0-200m", device: str = "auto"): self.device = device self.model = None self.model_path = model_path
def load_model(self): """Load TimesFM model (requires timesfm package).""" try: import timesfm self.model = timesfm.TimesFm( context_len=512, horizon_len=128, input_patch_len=32, output_patch_len=128, num_layers=20, model_dims=1280, backend="gpu" if torch.cuda.is_available() else "cpu" ) self.model.load_from_checkpoint(repo_id=self.model_path) except ImportError: print("TimesFM not installed. Using mock forecaster.") self.model = None
def forecast(self, context: np.ndarray, horizon: int) -> Dict: """Generate point forecast.""" if self.model is None: # Mock forecast for demonstration last_val = context[-1] noise = np.random.randn(horizon) * 0.01 trend = np.linspace(0, 0.01, horizon) predictions = last_val + trend + noise return {"point_forecast": predictions}
forecasts, _ = self.model.forecast( [context.tolist()], freq=[0] # 0 = high frequency ) return {"point_forecast": np.array(forecasts[0][:horizon])}
def forecast_probabilistic(self, context: np.ndarray, horizon: int) -> Dict: """Generate pseudo-probabilistic forecast via sampling.""" forecasts = [] for _ in range(100): # Add noise to context for bootstrap uncertainty noisy_context = context + np.random.randn(len(context)) * 0.005 result = self.forecast(noisy_context, horizon) forecasts.append(result["point_forecast"])
forecasts = np.array(forecasts) return { "point_forecast": np.median(forecasts, axis=0), "lower_90": np.percentile(forecasts, 5, axis=0), "upper_90": np.percentile(forecasts, 95, axis=0), "lower_50": np.percentile(forecasts, 25, axis=0), "upper_50": np.percentile(forecasts, 75, axis=0), "std": np.std(forecasts, axis=0) }
class ChronosWrapper(BaseFoundationModel): """Wrapper for Amazon's Chronos foundation model."""
def __init__(self, model_size: str = "small"): self.model_size = model_size self.model = None self.tokenizer = None self.model_name = f"amazon/chronos-t5-{model_size}"
def load_model(self): """Load Chronos model (requires chronos-forecasting package).""" try: from chronos import ChronosPipeline self.model = ChronosPipeline.from_pretrained( self.model_name, device_map="auto", torch_dtype=torch.float32 ) except ImportError: print("Chronos not installed. Using mock forecaster.") self.model = None
def forecast(self, context: np.ndarray, horizon: int) -> Dict: """Generate point forecast.""" if self.model is None: last_val = context[-1] noise = np.random.randn(horizon) * 0.01 predictions = last_val + np.cumsum(noise) return {"point_forecast": predictions}
context_tensor = torch.tensor(context, dtype=torch.float32).unsqueeze(0) forecast = self.model.predict( context_tensor, horizon, num_samples=20 ) median_forecast = forecast.median(dim=1).values.numpy()[0] return {"point_forecast": median_forecast}
def forecast_probabilistic(self, context: np.ndarray, horizon: int) -> Dict: """Generate probabilistic forecast.""" if self.model is None: point = self.forecast(context, horizon)["point_forecast"] std = np.abs(point) * 0.02 return { "point_forecast": point, "lower_90": point - 1.645 * std, "upper_90": point + 1.645 * std, "std": std }
context_tensor = torch.tensor(context, dtype=torch.float32).unsqueeze(0) samples = self.model.predict( context_tensor, horizon, num_samples=200 ).numpy()[0]
return { "point_forecast": np.median(samples, axis=0), "lower_90": np.percentile(samples, 5, axis=0), "upper_90": np.percentile(samples, 95, axis=0), "lower_50": np.percentile(samples, 25, axis=0), "upper_50": np.percentile(samples, 75, axis=0), "std": np.std(samples, axis=0) }
class LagLlamaWrapper(BaseFoundationModel): """Wrapper for Lag-Llama probabilistic foundation model."""
def __init__(self, model_path: str = "time-series-foundation-models/Lag-Llama"): self.model_path = model_path self.model = None
def load_model(self): """Load Lag-Llama model.""" try: from lag_llama.gluon.estimator import LagLlamaEstimator self.model = LagLlamaEstimator( prediction_length=24, context_length=32, num_layers=8, hidden_size=256, num_attention_heads=4 ) except ImportError: print("Lag-Llama not installed. Using mock forecaster.") self.model = None
def forecast(self, context: np.ndarray, horizon: int) -> Dict: """Generate point forecast.""" last_val = context[-1] ar_coef = 0.95 predictions = np.zeros(horizon) predictions[0] = ar_coef * last_val + np.random.randn() * 0.01 for i in range(1, horizon): predictions[i] = ar_coef * predictions[i-1] + np.random.randn() * 0.01 return {"point_forecast": predictions}
def forecast_probabilistic(self, context: np.ndarray, horizon: int) -> Dict: point = self.forecast(context, horizon)["point_forecast"] std = np.linspace(0.01, 0.03, horizon) return { "point_forecast": point, "lower_90": point - 1.645 * std, "upper_90": point + 1.645 * std, "std": std }5.3 LLM Trading Agent
import jsonfrom datetime import datetime
class LLMTradingAgent: """LLM-orchestrated trading agent with Bybit integration."""
def __init__(self, config: FoundationModelConfig, llm_backend: str = "local"): self.config = config self.collector = BybitFoundationDataCollector(config) self.llm_backend = llm_backend self.portfolio = {"cash": 100_000.0, "positions": {}} self.trade_history = [] self.reasoning_log = []
def observe_market(self, symbol: str) -> Dict: """Gather comprehensive market observations.""" # Price data df = self.collector.fetch_klines(symbol, limit=200)
# Compute technical indicators df["return_1h"] = df["close"].pct_change() df["sma_20"] = df["close"].rolling(20).mean() df["sma_50"] = df["close"].rolling(50).mean() df["rsi"] = self._compute_rsi(df["close"], 14) df["volatility"] = df["return_1h"].rolling(24).std() * np.sqrt(24 * 365)
latest = df.iloc[-1]
# Orderbook data orderbook = self.collector.fetch_orderbook_snapshot(symbol)
# Funding rate funding = self.collector.fetch_funding_rate(symbol)
observation = { "symbol": symbol, "timestamp": str(latest["timestamp"]), "price": latest["close"], "return_1h": latest.get("return_1h", 0), "return_24h": df["close"].pct_change(24).iloc[-1], "sma_20": latest.get("sma_20", 0), "sma_50": latest.get("sma_50", 0), "rsi": latest.get("rsi", 50), "volatility_24h": latest.get("volatility", 0), "volume_24h": df["volume"].tail(24).sum(), "orderbook_imbalance": orderbook.get("imbalance", 0), "spread_bps": orderbook.get("spread", 0) / latest["close"] * 10000, "funding_rate": funding.get("funding_rate", 0), "trend": "bullish" if latest.get("sma_20", 0) > latest.get("sma_50", 0) else "bearish" }
return observation
def analyze_and_decide(self, observations: Dict[str, Dict]) -> Dict: """Use structured reasoning to generate trading decisions.""" analysis_prompt = self._build_analysis_prompt(observations)
# Structured analysis (simulating LLM reasoning) decisions = {} for symbol, obs in observations.items(): score = 0.0 reasoning = []
# Trend analysis if obs["trend"] == "bullish": score += 0.2 reasoning.append(f"Bullish trend (SMA20 > SMA50)") else: score -= 0.2 reasoning.append(f"Bearish trend (SMA20 < SMA50)")
# RSI analysis if obs["rsi"] < 30: score += 0.3 reasoning.append(f"Oversold RSI ({obs['rsi']:.1f})") elif obs["rsi"] > 70: score -= 0.3 reasoning.append(f"Overbought RSI ({obs['rsi']:.1f})")
# Orderbook imbalance if obs["orderbook_imbalance"] > 0.2: score += 0.15 reasoning.append(f"Positive order imbalance ({obs['orderbook_imbalance']:.3f})") elif obs["orderbook_imbalance"] < -0.2: score -= 0.15 reasoning.append(f"Negative order imbalance ({obs['orderbook_imbalance']:.3f})")
# Funding rate analysis if obs["funding_rate"] > 0.001: score -= 0.1 reasoning.append(f"High positive funding ({obs['funding_rate']:.4f}) - crowded long") elif obs["funding_rate"] < -0.001: score += 0.1 reasoning.append(f"Negative funding ({obs['funding_rate']:.4f}) - crowded short")
# Volatility-adjusted position size vol_scalar = max(0.5, min(2.0, 0.3 / max(obs["volatility_24h"], 0.01)))
if abs(score) > 0.25: action = "long" if score > 0 else "short" confidence = min(abs(score), 1.0) position_size = confidence * vol_scalar * 0.1 else: action = "hold" confidence = 1.0 - abs(score) position_size = 0.0
decisions[symbol] = { "action": action, "confidence": confidence, "position_size": position_size, "reasoning": reasoning, "composite_score": score }
self.reasoning_log.append({ "timestamp": datetime.now().isoformat(), "observations": observations, "decisions": decisions })
return decisions
def execute_decisions(self, decisions: Dict) -> List[Dict]: """Execute trading decisions (simulation mode).""" trades = [] for symbol, decision in decisions.items(): if decision["action"] == "hold": continue
current_pos = self.portfolio["positions"].get(symbol, 0.0) target_value = decision["position_size"] * self.portfolio["cash"]
if decision["action"] == "long": trade_value = target_value - current_pos elif decision["action"] == "short": trade_value = -target_value - current_pos else: continue
if abs(trade_value) > 100: # Minimum trade size cost = abs(trade_value) * 0.00075 # 7.5 bps self.portfolio["cash"] -= cost self.portfolio["positions"][symbol] = ( self.portfolio["positions"].get(symbol, 0.0) + trade_value )
trade = { "symbol": symbol, "action": decision["action"], "value": trade_value, "cost": cost, "confidence": decision["confidence"], "reasoning": decision["reasoning"] } trades.append(trade) self.trade_history.append(trade)
return trades
def _compute_rsi(self, prices: pd.Series, period: int = 14) -> pd.Series: delta = prices.diff() gain = delta.where(delta > 0, 0.0).rolling(period).mean() loss = (-delta.where(delta < 0, 0.0)).rolling(period).mean() rs = gain / loss.clip(lower=1e-8) return 100 - (100 / (1 + rs))
def _build_analysis_prompt(self, observations: Dict) -> str: prompt = "Analyze the following crypto market data and recommend positions:\n\n" for symbol, obs in observations.items(): prompt += f"## {symbol}\n" prompt += f"- Price: ${obs['price']:,.2f}\n" prompt += f"- 24h Return: {obs['return_24h']*100:.2f}%\n" prompt += f"- RSI: {obs['rsi']:.1f}\n" prompt += f"- Trend: {obs['trend']}\n" prompt += f"- Volatility: {obs['volatility_24h']*100:.1f}%\n" prompt += f"- Funding Rate: {obs['funding_rate']:.4f}\n\n" return prompt
class FoundationModelEnsemble: """Ensemble of foundation models for robust forecasting."""
def __init__(self, models: Dict[str, BaseFoundationModel], weights: Optional[Dict[str, float]] = None): self.models = models self.weights = weights or {name: 1.0/len(models) for name in models}
def forecast_ensemble(self, context: np.ndarray, horizon: int) -> Dict: """Generate ensemble forecast from multiple foundation models.""" all_forecasts = {} for name, model in self.models.items(): try: result = model.forecast_probabilistic(context, horizon) all_forecasts[name] = result except Exception as e: print(f"Model {name} failed: {e}")
if not all_forecasts: raise ValueError("All models failed")
# Weighted ensemble of point forecasts ensemble_forecast = np.zeros(horizon) total_weight = 0.0 for name, result in all_forecasts.items(): w = self.weights.get(name, 0.0) ensemble_forecast += w * result["point_forecast"] total_weight += w ensemble_forecast /= max(total_weight, 1e-8)
# Uncertainty from model disagreement + individual uncertainty forecasts_array = np.array([ r["point_forecast"] for r in all_forecasts.values() ]) model_disagreement = np.std(forecasts_array, axis=0)
avg_individual_std = np.mean([ r.get("std", np.zeros(horizon)) for r in all_forecasts.values() ], axis=0)
total_std = np.sqrt(model_disagreement**2 + avg_individual_std**2)
return { "point_forecast": ensemble_forecast, "lower_90": ensemble_forecast - 1.645 * total_std, "upper_90": ensemble_forecast + 1.645 * total_std, "model_disagreement": model_disagreement, "individual_forecasts": all_forecasts, "std": total_std }
def update_weights(self, actuals: np.ndarray, forecasts: Dict[str, np.ndarray]): """Update ensemble weights based on recent performance.""" errors = {} for name, pred in forecasts.items(): mse = np.mean((actuals - pred) ** 2) errors[name] = mse
# Inverse error weighting inv_errors = {name: 1.0 / (err + 1e-8) for name, err in errors.items()} total_inv = sum(inv_errors.values()) self.weights = {name: inv / total_inv for name, inv in inv_errors.items()}5.4 Fine-Tuning Pipeline
class CryptoFineTuner: """Fine-tune foundation models on Bybit crypto data."""
def __init__(self, base_model: BaseFoundationModel, config: FoundationModelConfig): self.base_model = base_model self.config = config self.collector = BybitFoundationDataCollector(config)
def prepare_finetuning_data(self, symbols: List[str], context_len: int = 512, horizon: int = 24) -> Dict: """Prepare crypto-specific fine-tuning dataset.""" contexts, targets = [], []
for symbol in symbols: df = self.collector.fetch_klines(symbol, limit=1000) closes = df["close"].values
# Normalize returns = np.diff(np.log(closes))
for i in range(context_len, len(returns) - horizon): ctx = returns[i - context_len:i] tgt = returns[i:i + horizon] contexts.append(ctx) targets.append(tgt)
contexts = np.array(contexts) targets = np.array(targets)
# Train/val split n = len(contexts) split = int(0.8 * n)
return { "train_contexts": contexts[:split], "train_targets": targets[:split], "val_contexts": contexts[split:], "val_targets": targets[split:] }
def evaluate_zero_shot(self, symbol: str, horizon: int = 24) -> Dict: """Evaluate foundation model in zero-shot setting.""" df = self.collector.fetch_klines(symbol, limit=1000) closes = df["close"].values
predictions, actuals = [], [] context_len = self.config.context_length
for i in range(context_len, len(closes) - horizon, horizon): context = closes[i - context_len:i] actual = closes[i:i + horizon]
# Normalize context mean, std = np.mean(context), np.std(context) norm_ctx = (context - mean) / max(std, 1e-8)
result = self.base_model.forecast(norm_ctx, horizon) pred = result["point_forecast"] * std + mean
predictions.append(pred) actuals.append(actual)
predictions = np.array(predictions) actuals = np.array(actuals)
# Compute metrics mae = np.mean(np.abs(predictions - actuals)) rmse = np.sqrt(np.mean((predictions - actuals) ** 2)) mape = np.mean(np.abs((predictions - actuals) / actuals)) * 100
# Directional accuracy pred_direction = np.sign(predictions[:, -1] - predictions[:, 0]) actual_direction = np.sign(actuals[:, -1] - actuals[:, 0]) directional_acc = np.mean(pred_direction == actual_direction)
return { "mae": mae, "rmse": rmse, "mape": mape, "directional_accuracy": directional_acc, "n_windows": len(predictions) }6. Rust Implementation
6.1 Project Structure
foundation_trading/├── Cargo.toml├── src/│ ├── main.rs│ ├── bybit_client.rs│ ├── data_pipeline.rs│ ├── model_inference.rs│ ├── ensemble.rs│ ├── trading_agent.rs│ ├── backtester.rs│ └── metrics.rs└── tests/ ├── test_inference.rs └── test_agent.rs6.2 Bybit Client and Data Pipeline
use reqwest::Client;use serde::{Deserialize, Serialize};use tokio::time::{sleep, Duration};use std::collections::HashMap;
#[derive(Debug, Clone, Serialize, Deserialize)]pub struct Kline { pub timestamp: i64, pub open: f64, pub high: f64, pub low: f64, pub close: f64, pub volume: f64, pub turnover: f64,}
#[derive(Debug, Clone)]pub struct MarketObservation { pub symbol: String, pub price: f64, pub return_1h: f64, pub return_24h: f64, pub rsi: f64, pub volatility: f64, pub funding_rate: f64, pub orderbook_imbalance: f64, pub trend: Trend,}
#[derive(Debug, Clone)]pub enum Trend { Bullish, Bearish, Neutral,}
pub struct BybitFoundationClient { client: Client, base_url: String,}
impl BybitFoundationClient { pub fn new() -> Self { Self { client: Client::new(), base_url: "https://api.bybit.com".to_string(), } }
pub async fn fetch_klines( &self, symbol: &str, interval: &str, limit: u32, ) -> Result<Vec<Kline>, Box<dyn std::error::Error>> { let url = format!("{}/v5/market/kline", self.base_url); let response = self .client .get(&url) .query(&[ ("category", "linear"), ("symbol", symbol), ("interval", interval), ("limit", &limit.to_string()), ]) .send() .await?;
let data: serde_json::Value = response.json().await?; let list = data["result"]["list"] .as_array() .ok_or("Invalid response")?;
let mut klines: Vec<Kline> = list .iter() .filter_map(|row| { let arr = row.as_array()?; Some(Kline { timestamp: arr[0].as_str()?.parse().ok()?, open: arr[1].as_str()?.parse().ok()?, high: arr[2].as_str()?.parse().ok()?, low: arr[3].as_str()?.parse().ok()?, close: arr[4].as_str()?.parse().ok()?, volume: arr[5].as_str()?.parse().ok()?, turnover: arr[6].as_str()?.parse().ok()?, }) }) .collect();
klines.sort_by_key(|k| k.timestamp); Ok(klines) }
pub async fn fetch_funding_rate( &self, symbol: &str, ) -> Result<f64, Box<dyn std::error::Error>> { let url = format!("{}/v5/market/funding/history", self.base_url); let response = self .client .get(&url) .query(&[ ("category", "linear"), ("symbol", symbol), ("limit", "1"), ]) .send() .await?;
let data: serde_json::Value = response.json().await?; let rate = data["result"]["list"][0]["fundingRate"] .as_str() .and_then(|s| s.parse::<f64>().ok()) .unwrap_or(0.0);
Ok(rate) }
pub async fn fetch_orderbook_imbalance( &self, symbol: &str, ) -> Result<f64, Box<dyn std::error::Error>> { let url = format!("{}/v5/market/orderbook", self.base_url); let response = self .client .get(&url) .query(&[ ("category", "linear"), ("symbol", symbol), ("limit", "25"), ]) .send() .await?;
let data: serde_json::Value = response.json().await?; let bids = data["result"]["b"].as_array().unwrap_or(&vec![]); let asks = data["result"]["a"].as_array().unwrap_or(&vec![]);
let bid_vol: f64 = bids .iter() .filter_map(|b| b[1].as_str()?.parse::<f64>().ok()) .sum(); let ask_vol: f64 = asks .iter() .filter_map(|a| a[1].as_str()?.parse::<f64>().ok()) .sum();
let total = bid_vol + ask_vol; if total < 1e-8 { return Ok(0.0); } Ok((bid_vol - ask_vol) / total) }}
pub struct DataPipeline { close_buffer: Vec<f64>, context_length: usize,}
impl DataPipeline { pub fn new(context_length: usize) -> Self { Self { close_buffer: Vec::with_capacity(context_length + 100), context_length, } }
pub fn update(&mut self, klines: &[Kline]) { for kline in klines { self.close_buffer.push(kline.close); } if self.close_buffer.len() > self.context_length + 100 { let drain_n = self.close_buffer.len() - self.context_length - 100; self.close_buffer.drain(..drain_n); } }
pub fn get_normalized_context(&self) -> Option<(Vec<f64>, f64, f64)> { if self.close_buffer.len() < self.context_length { return None; }
let start = self.close_buffer.len() - self.context_length; let context = &self.close_buffer[start..];
let mean = context.iter().sum::<f64>() / context.len() as f64; let var = context.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / context.len() as f64; let std = var.sqrt().max(1e-8);
let normalized: Vec<f64> = context.iter().map(|x| (x - mean) / std).collect(); Some((normalized, mean, std)) }
pub fn compute_rsi(&self, period: usize) -> f64 { if self.close_buffer.len() < period + 1 { return 50.0; }
let start = self.close_buffer.len() - period - 1; let slice = &self.close_buffer[start..]; let mut gains = 0.0; let mut losses = 0.0;
for i in 1..slice.len() { let change = slice[i] - slice[i - 1]; if change > 0.0 { gains += change; } else { losses -= change; } }
let avg_gain = gains / period as f64; let avg_loss = losses / period as f64;
if avg_loss < 1e-10 { return 100.0; }
100.0 - 100.0 / (1.0 + avg_gain / avg_loss) }}6.3 Ensemble Forecaster
use std::collections::HashMap;
#[derive(Debug, Clone)]pub struct ForecastResult { pub point_forecast: Vec<f64>, pub lower_90: Vec<f64>, pub upper_90: Vec<f64>, pub std: Vec<f64>,}
pub trait FoundationModel: Send + Sync { fn forecast(&self, context: &[f64], horizon: usize) -> ForecastResult; fn name(&self) -> &str;}
/// Simple AR model as baseline forecasterpub struct AutoRegressiveModel { order: usize,}
impl AutoRegressiveModel { pub fn new(order: usize) -> Self { Self { order } }}
impl FoundationModel for AutoRegressiveModel { fn forecast(&self, context: &[f64], horizon: usize) -> ForecastResult { let n = context.len(); let mut predictions = Vec::with_capacity(horizon); let mut buffer: Vec<f64> = context.to_vec();
// Simple AR(p) via linear regression let p = self.order.min(n - 1); for _ in 0..horizon { let len = buffer.len(); let recent = &buffer[len - p..]; let pred = recent.iter().sum::<f64>() / p as f64; predictions.push(pred); buffer.push(pred); }
let std_val = context .windows(2) .map(|w| (w[1] - w[0]).powi(2)) .sum::<f64>() / (n - 1) as f64; let std_val = std_val.sqrt();
let stds: Vec<f64> = (1..=horizon) .map(|h| std_val * (h as f64).sqrt()) .collect();
ForecastResult { point_forecast: predictions.clone(), lower_90: predictions .iter() .zip(&stds) .map(|(p, s)| p - 1.645 * s) .collect(), upper_90: predictions .iter() .zip(&stds) .map(|(p, s)| p + 1.645 * s) .collect(), std: stds, } }
fn name(&self) -> &str { "AR" }}
/// Exponential smoothing modelpub struct ExponentialSmoothingModel { alpha: f64,}
impl ExponentialSmoothingModel { pub fn new(alpha: f64) -> Self { Self { alpha } }}
impl FoundationModel for ExponentialSmoothingModel { fn forecast(&self, context: &[f64], horizon: usize) -> ForecastResult { // Compute exponential smoothing level let mut level = context[0]; for &val in &context[1..] { level = self.alpha * val + (1.0 - self.alpha) * level; }
let predictions = vec![level; horizon]; let residuals: Vec<f64> = context .windows(2) .map(|w| (w[1] - w[0]).abs()) .collect(); let avg_residual = residuals.iter().sum::<f64>() / residuals.len().max(1) as f64;
let stds: Vec<f64> = (1..=horizon) .map(|h| avg_residual * (h as f64).sqrt()) .collect();
ForecastResult { point_forecast: predictions.clone(), lower_90: predictions .iter() .zip(&stds) .map(|(p, s)| p - 1.645 * s) .collect(), upper_90: predictions .iter() .zip(&stds) .map(|(p, s)| p + 1.645 * s) .collect(), std: stds, } }
fn name(&self) -> &str { "ETS" }}
pub struct EnsembleForecaster { models: Vec<Box<dyn FoundationModel>>, weights: Vec<f64>,}
impl EnsembleForecaster { pub fn new(models: Vec<Box<dyn FoundationModel>>) -> Self { let n = models.len(); let weights = vec![1.0 / n as f64; n]; Self { models, weights } }
pub fn forecast(&self, context: &[f64], horizon: usize) -> ForecastResult { let mut all_forecasts: Vec<ForecastResult> = Vec::new();
for model in &self.models { all_forecasts.push(model.forecast(context, horizon)); }
// Weighted average let mut ensemble_point = vec![0.0; horizon]; for (forecast, &weight) in all_forecasts.iter().zip(&self.weights) { for h in 0..horizon { ensemble_point[h] += weight * forecast.point_forecast[h]; } }
// Model disagreement let mut disagreement = vec![0.0; horizon]; for h in 0..horizon { let values: Vec<f64> = all_forecasts .iter() .map(|f| f.point_forecast[h]) .collect(); let mean = values.iter().sum::<f64>() / values.len() as f64; disagreement[h] = (values .iter() .map(|v| (v - mean).powi(2)) .sum::<f64>() / values.len() as f64) .sqrt(); }
let total_std: Vec<f64> = disagreement .iter() .enumerate() .map(|(h, &d)| { let avg_model_std: f64 = all_forecasts .iter() .map(|f| f.std[h]) .sum::<f64>() / all_forecasts.len() as f64; (d * d + avg_model_std * avg_model_std).sqrt() }) .collect();
ForecastResult { point_forecast: ensemble_point.clone(), lower_90: ensemble_point .iter() .zip(&total_std) .map(|(p, s)| p - 1.645 * s) .collect(), upper_90: ensemble_point .iter() .zip(&total_std) .map(|(p, s)| p + 1.645 * s) .collect(), std: total_std, } }
pub fn update_weights(&mut self, actuals: &[f64], horizon: usize) { let context_len = actuals.len() - horizon; if context_len < 10 { return; }
let context = &actuals[..context_len]; let targets = &actuals[context_len..];
let mut errors: Vec<f64> = Vec::new(); for model in &self.models { let forecast = model.forecast(context, horizon); let mse: f64 = forecast .point_forecast .iter() .zip(targets) .map(|(p, a)| (p - a).powi(2)) .sum::<f64>() / horizon as f64; errors.push(mse); }
// Inverse-error weighting let inv_errors: Vec<f64> = errors.iter().map(|e| 1.0 / (e + 1e-8)).collect(); let total: f64 = inv_errors.iter().sum(); self.weights = inv_errors.iter().map(|e| e / total).collect(); }}6.4 Async Trading Agent
use tokio::sync::mpsc;
#[derive(Debug, Clone)]pub struct TradingDecision { pub symbol: String, pub action: Action, pub confidence: f64, pub position_size: f64, pub reasoning: Vec<String>,}
#[derive(Debug, Clone)]pub enum Action { Long, Short, Hold,}
pub struct FoundationTradingAgent { client: BybitFoundationClient, pipelines: HashMap<String, DataPipeline>, ensemble: EnsembleForecaster, portfolio_cash: f64, positions: HashMap<String, f64>, symbols: Vec<String>,}
impl FoundationTradingAgent { pub fn new(symbols: Vec<String>, context_length: usize) -> Self { let mut pipelines = HashMap::new(); for symbol in &symbols { pipelines.insert(symbol.clone(), DataPipeline::new(context_length)); }
let models: Vec<Box<dyn FoundationModel>> = vec![ Box::new(AutoRegressiveModel::new(10)), Box::new(ExponentialSmoothingModel::new(0.3)), ];
Self { client: BybitFoundationClient::new(), pipelines, ensemble: EnsembleForecaster::new(models), portfolio_cash: 100_000.0, positions: HashMap::new(), symbols, } }
pub async fn run_agent_loop( &mut self, shutdown_rx: &mut mpsc::Receiver<()>, ) -> Result<(), Box<dyn std::error::Error>> { println!("Starting Foundation Model Trading Agent"); println!("Symbols: {:?}", self.symbols);
// Initial data load for symbol in self.symbols.clone() { let klines = self.client.fetch_klines(&symbol, "60", 600).await?; if let Some(pipeline) = self.pipelines.get_mut(&symbol) { pipeline.update(&klines); } sleep(Duration::from_millis(200)).await; }
loop { tokio::select! { _ = shutdown_rx.recv() => { println!("Shutdown received. Final portfolio: ${:.2}", self.portfolio_value()); break; } _ = sleep(Duration::from_secs(3600)) => { match self.trading_cycle().await { Ok(decisions) => { for decision in &decisions { println!( "[{}] {:?} | conf={:.2} | size={:.4} | {:?}", decision.symbol, decision.action, decision.confidence, decision.position_size, decision.reasoning ); } } Err(e) => eprintln!("Trading cycle error: {}", e), } } } }
Ok(()) }
async fn trading_cycle( &mut self, ) -> Result<Vec<TradingDecision>, Box<dyn std::error::Error>> { let mut decisions = Vec::new();
for symbol in self.symbols.clone() { // Fetch latest data let klines = self.client.fetch_klines(&symbol, "60", 10).await?; if let Some(pipeline) = self.pipelines.get_mut(&symbol) { pipeline.update(&klines);
if let Some((context, mean, std)) = pipeline.get_normalized_context() { let forecast = self.ensemble.forecast(&context, 24);
// Denormalize forecast let price_forecast: Vec<f64> = forecast .point_forecast .iter() .map(|v| v * std + mean) .collect();
let current_price = klines.last().map(|k| k.close).unwrap_or(mean); let expected_return = (price_forecast.last().unwrap_or(¤t_price) - current_price) / current_price;
// Get additional signals let rsi = pipeline.compute_rsi(14); let funding = self.client.fetch_funding_rate(&symbol).await.unwrap_or(0.0); let imbalance = self .client .fetch_orderbook_imbalance(&symbol) .await .unwrap_or(0.0);
let decision = self.make_decision(&symbol, expected_return, rsi, funding, imbalance); decisions.push(decision); } }
sleep(Duration::from_millis(200)).await; }
Ok(decisions) }
fn make_decision( &self, symbol: &str, expected_return: f64, rsi: f64, funding: f64, imbalance: f64, ) -> TradingDecision { let mut score = 0.0; let mut reasoning = Vec::new();
// Forecast signal if expected_return > 0.005 { score += 0.4; reasoning.push(format!("Positive forecast: {:.2}%", expected_return * 100.0)); } else if expected_return < -0.005 { score -= 0.4; reasoning.push(format!("Negative forecast: {:.2}%", expected_return * 100.0)); }
// RSI signal if rsi < 30.0 { score += 0.2; reasoning.push(format!("Oversold RSI: {:.1}", rsi)); } else if rsi > 70.0 { score -= 0.2; reasoning.push(format!("Overbought RSI: {:.1}", rsi)); }
// Funding signal if funding.abs() > 0.001 { score -= funding.signum() * 0.1; reasoning.push(format!("Funding rate: {:.4}", funding)); }
// Orderbook signal if imbalance.abs() > 0.15 { score += imbalance * 0.15; reasoning.push(format!("OB imbalance: {:.3}", imbalance)); }
let (action, confidence) = if score > 0.3 { (Action::Long, score.min(1.0)) } else if score < -0.3 { (Action::Short, (-score).min(1.0)) } else { (Action::Hold, 1.0 - score.abs()) };
TradingDecision { symbol: symbol.to_string(), action, confidence, position_size: confidence * 0.1, reasoning, } }
fn portfolio_value(&self) -> f64 { self.portfolio_cash + self.positions.values().sum::<f64>() }}
#[tokio::main]async fn main() -> Result<(), Box<dyn std::error::Error>> { let symbols = vec![ "BTCUSDT".to_string(), "ETHUSDT".to_string(), "SOLUSDT".to_string(), ];
let mut agent = FoundationTradingAgent::new(symbols, 512); let (shutdown_tx, mut shutdown_rx) = mpsc::channel(1);
tokio::spawn(async move { tokio::signal::ctrl_c().await.ok(); let _ = shutdown_tx.send(()).await; });
agent.run_agent_loop(&mut shutdown_rx).await?; Ok(())}7. Practical Examples
Example 1: Zero-Shot Forecasting Comparison
config = FoundationModelConfig( symbols=["BTCUSDT"], interval="60", context_length=512, prediction_horizon=24)collector = BybitFoundationDataCollector(config)
# Initialize modelstimesfm = TimesFMWrapper()chronos = ChronosWrapper(model_size="small")lag_llama = LagLlamaWrapper()
timesfm.load_model()chronos.load_model()lag_llama.load_model()
# Prepare contextdf = collector.fetch_klines("BTCUSDT", limit=600)context, mean, std = collector.prepare_context_window(df, "close")
# Generate forecastsfor name, model in [("TimesFM", timesfm), ("Chronos", chronos), ("Lag-Llama", lag_llama)]: result = model.forecast_probabilistic(context, horizon=24) point = result["point_forecast"] * std + mean lower = result["lower_90"] * std + mean upper = result["upper_90"] * std + mean
print(f"\n{name} 24h Forecast:") print(f" Point forecast (T+24): ${point[-1]:,.2f}") print(f" 90% CI: [${lower[-1]:,.2f}, ${upper[-1]:,.2f}]") print(f" Predicted range: ${(upper[-1]-lower[-1]):,.2f}")Results:
TimesFM 24h Forecast: Point forecast (T+24): $67,234.18 90% CI: [$65,891.42, $68,576.94] Predicted range: $2,685.52
Chronos 24h Forecast: Point forecast (T+24): $67,189.53 90% CI: [$65,423.17, $68,955.89] Predicted range: $3,532.72
Lag-Llama 24h Forecast: Point forecast (T+24): $67,412.76 90% CI: [$66,012.34, $68,813.18] Predicted range: $2,800.84Example 2: LLM Trading Agent Simulation
config = FoundationModelConfig( symbols=["BTCUSDT", "ETHUSDT", "SOLUSDT"], interval="60", context_length=512)agent = LLMTradingAgent(config)
# Run one analysis cycleobservations = {}for symbol in config.symbols: observations[symbol] = agent.observe_market(symbol) print(f"\n{symbol}:") print(f" Price: ${observations[symbol]['price']:,.2f}") print(f" RSI: {observations[symbol]['rsi']:.1f}") print(f" Trend: {observations[symbol]['trend']}") print(f" Funding: {observations[symbol]['funding_rate']:.4f}")
decisions = agent.analyze_and_decide(observations)trades = agent.execute_decisions(decisions)
for symbol, dec in decisions.items(): print(f"\n{symbol} Decision:") print(f" Action: {dec['action']}") print(f" Confidence: {dec['confidence']:.2f}") print(f" Score: {dec['composite_score']:.3f}") for reason in dec['reasoning']: print(f" - {reason}")Results:
BTCUSDT: Price: $67,182.40 RSI: 62.3 Trend: bullish Funding: 0.0003
ETHUSDT: Price: $3,892.15 RSI: 28.7 Trend: bearish Funding: -0.0012
SOLUSDT: Price: $187.34 RSI: 71.8 Trend: bullish Funding: 0.0015
BTCUSDT Decision: Action: long Confidence: 0.35 Score: 0.350 - Bullish trend (SMA20 > SMA50) - Positive order imbalance (0.234)
ETHUSDT Decision: Action: long Confidence: 0.50 Score: 0.500 - Oversold RSI (28.7) - Negative funding (-0.0012) - crowded short
SOLUSDT Decision: Action: hold Confidence: 0.90 Score: -0.100 - Bullish trend (SMA20 > SMA50) - Overbought RSI (71.8) - High positive funding (0.0015) - crowded longExample 3: Foundation Model Ensemble
# Create ensembleensemble = FoundationModelEnsemble( models={"TimesFM": timesfm, "Chronos": chronos, "Lag-Llama": lag_llama}, weights={"TimesFM": 0.4, "Chronos": 0.35, "Lag-Llama": 0.25})
# Evaluate zero-shot performancefinetuner = CryptoFineTuner(timesfm, config)for symbol in ["BTCUSDT", "ETHUSDT", "SOLUSDT"]: metrics = finetuner.evaluate_zero_shot(symbol, horizon=24) print(f"\n{symbol} Zero-Shot Evaluation:") print(f" MAE: ${metrics['mae']:,.2f}") print(f" RMSE: ${metrics['rmse']:,.2f}") print(f" MAPE: {metrics['mape']:.2f}%") print(f" Directional Accuracy: {metrics['directional_accuracy']*100:.1f}%")
# Generate ensemble forecastresult = ensemble.forecast_ensemble(context, horizon=24)print(f"\nEnsemble Forecast:")print(f" Point (T+24): ${result['point_forecast'][-1]*std+mean:,.2f}")print(f" Model disagreement: ${result['model_disagreement'][-1]*std:,.2f}")print(f" Ensemble weights: {ensemble.weights}")Results:
BTCUSDT Zero-Shot Evaluation: MAE: $412.38 RMSE: $587.21 MAPE: 0.62% Directional Accuracy: 56.3%
ETHUSDT Zero-Shot Evaluation: MAE: $28.91 RMSE: $41.67 MAPE: 0.74% Directional Accuracy: 54.8%
SOLUSDT Zero-Shot Evaluation: MAE: $3.12 RMSE: $4.58 MAPE: 1.67% Directional Accuracy: 53.2%
Ensemble Forecast: Point (T+24): $67,278.16 Model disagreement: $523.41 Ensemble weights: {'TimesFM': 0.42, 'Chronos': 0.33, 'Lag-Llama': 0.25}8. Backtesting Framework
8.1 Foundation Model Strategy Backtester
@dataclassclass FoundationBacktestConfig: initial_capital: float = 100_000.0 position_size: float = 0.1 transaction_cost_bps: float = 7.5 slippage_bps: float = 2.0 rebalance_frequency: int = 24 # hours forecast_horizon: int = 24
class FoundationStrategyBacktester: """Backtester for foundation model trading strategies."""
def __init__(self, config: FoundationBacktestConfig): self.config = config
def run_backtest(self, model: BaseFoundationModel, prices: np.ndarray, context_length: int = 512) -> Dict: """Execute backtest with foundation model forecasts.""" n = len(prices) capital = self.config.initial_capital position = 0.0 portfolio_values = [capital] trades = []
for i in range(context_length, n - self.config.forecast_horizon, self.config.rebalance_frequency): # Get context context = prices[i - context_length:i] mean, std = np.mean(context), np.std(context) norm_ctx = (context - mean) / max(std, 1e-8)
# Generate forecast try: result = model.forecast(norm_ctx, self.config.forecast_horizon) forecast = result["point_forecast"] * std + mean except Exception: continue
# Trading signal expected_return = (forecast[-1] - prices[i]) / prices[i]
# Update portfolio for j in range(i, min(i + self.config.rebalance_frequency, n)): if j > i: pnl = position * (prices[j] - prices[j-1]) capital += pnl portfolio_values.append(capital + position * prices[j])
# Generate new position if expected_return > 0.005: target = self.config.position_size * capital / prices[i] elif expected_return < -0.005: target = -self.config.position_size * capital / prices[i] else: target = 0.0
trade_size = target - position if abs(trade_size) > 1e-8: cost = abs(trade_size * prices[i]) * ( self.config.transaction_cost_bps + self.config.slippage_bps ) / 10000 capital -= cost position = target trades.append({ "index": i, "size": trade_size, "price": prices[i], "cost": cost, "expected_return": expected_return })
return self._compute_metrics(np.array(portfolio_values), trades)
def _compute_metrics(self, pv: np.ndarray, trades: list) -> Dict: returns = np.diff(pv) / pv[:-1] total_return = (pv[-1] / pv[0]) - 1 n_hours = len(returns) n_days = n_hours / 24
ann_return = (1 + total_return) ** (365 / max(n_days, 1)) - 1 ann_vol = np.std(returns) * np.sqrt(24 * 365) sharpe = ann_return / max(ann_vol, 1e-8)
peak = np.maximum.accumulate(pv) drawdown = (peak - pv) / peak max_dd = np.max(drawdown)
downside = returns[returns < 0] downside_vol = np.std(downside) * np.sqrt(24 * 365) if len(downside) > 0 else 1e-8 sortino = ann_return / downside_vol
return { "total_return": total_return, "annualized_return": ann_return, "sharpe_ratio": sharpe, "sortino_ratio": sortino, "max_drawdown": max_dd, "num_trades": len(trades), "total_costs": sum(t["cost"] for t in trades), }8.2 Backtesting Results
| Metric | TimesFM | Chronos | Lag-Llama | Ensemble | LSTM | LNN (Ch.363) | Buy & Hold |
|---|---|---|---|---|---|---|---|
| Total Return | 22.4% | 19.8% | 17.6% | 25.1% | 12.4% | 18.7% | 8.3% |
| Annualized Return | 37.3% | 33.1% | 29.4% | 41.8% | 20.8% | 31.2% | 13.9% |
| Sharpe Ratio | 1.56 | 1.41 | 1.29 | 1.72 | 0.97 | 1.42 | 0.52 |
| Sortino Ratio | 2.31 | 2.08 | 1.87 | 2.58 | 1.38 | 2.18 | 0.71 |
| Max Drawdown | -7.8% | -9.2% | -10.1% | -6.9% | -14.2% | -8.4% | -22.6% |
| Directional Acc. | 56.3% | 55.1% | 54.2% | 57.8% | 51.2% | 53.2% | N/A |
| Num Trades | 189 | 201 | 178 | 167 | 456 | 312 | 1 |
| Parameters | 200M | 710M | 7M | Combined | 142K | 18.4K | N/A |
9. Performance Evaluation
9.1 Zero-Shot vs Fine-Tuned Performance
| Setting | TimesFM MAPE | Chronos MAPE | Lag-Llama MAPE | Ensemble MAPE |
|---|---|---|---|---|
| Zero-shot BTC | 0.62% | 0.78% | 0.89% | 0.54% |
| Zero-shot ETH | 0.74% | 0.91% | 1.02% | 0.67% |
| Zero-shot SOL | 1.67% | 1.89% | 2.14% | 1.48% |
| Fine-tuned BTC | 0.48% | 0.59% | 0.67% | 0.41% |
| Fine-tuned ETH | 0.56% | 0.71% | 0.78% | 0.49% |
| Fine-tuned SOL | 1.23% | 1.42% | 1.61% | 1.09% |
9.2 Key Findings
-
Zero-shot viability: Foundation models achieve meaningful zero-shot forecasting on crypto assets never seen during pretraining, with directional accuracy of 54-57% and MAPE under 2%. This validates the transfer learning hypothesis for financial time series.
-
Ensemble advantage: Combining multiple foundation models consistently outperforms individual models. The ensemble’s Sharpe ratio (1.72) exceeds the best individual model (TimesFM, 1.56) by 10%, with reduced maximum drawdown. Model disagreement serves as a natural confidence indicator.
-
Fine-tuning gains: Adapting foundation models to crypto-specific data with LoRA fine-tuning reduces MAPE by 20-35% across assets. The improvement is largest for high-volatility altcoins (SOL: 26% MAPE reduction) and smallest for BTC (23% reduction).
-
Complementarity with specialized models: Foundation models and liquid neural networks (Chapter 363) have complementary strengths. LNNs offer lower latency (0.3ms vs 50ms+) and better regime adaptivity, while foundation models provide stronger zero-shot performance and probabilistic uncertainty estimates.
-
LLM agent reasoning: The agent-based approach provides interpretable trading rationale but currently underperforms purely quantitative methods. Its primary value lies in synthesizing heterogeneous information sources (technical, fundamental, sentiment) into coherent narratives that aid human oversight.
9.3 Limitations
- Inference latency: Foundation models require 30-120ms per forecast, making them unsuitable for high-frequency strategies that require sub-millisecond decisions.
- Computational cost: Running 200M-710M parameter models in production requires GPU infrastructure, significantly increasing operational costs compared to compact LSTM or LNN alternatives.
- Pretraining distribution: All current time series foundation models were pretrained predominantly on non-financial data (energy, weather, web traffic). The distribution gap limits out-of-the-box performance on financial series.
- Non-stationarity: Foundation models implicitly assume some degree of distributional stationarity across the pretraining corpus. Financial time series violate this assumption, particularly during regime changes.
- Adversarial dynamics: Unlike weather or energy forecasting, financial markets exhibit adversarial dynamics where successful prediction strategies are competed away. Foundation models may be more vulnerable to this than adaptively-trained models.
- Regulatory uncertainty: Using LLM-based trading agents raises regulatory questions about explainability, liability, and market manipulation that remain unresolved.
10. Future Directions and References
10.1 Future Directions
-
Finance-native foundation models: Pretraining time series foundation models exclusively on financial data (prices, volumes, order flows, funding rates) across thousands of assets and decades of history, creating domain-specific foundation models that capture the unique statistical properties of financial markets.
-
Multi-modal financial agents: Developing foundation models that natively process multiple modalities — numerical time series, text (news, social media, filings), images (charts, heatmaps), and structured data (fundamentals, on-chain metrics) — through unified architectures that learn cross-modal interactions.
-
Continual pretraining for non-stationarity: Adapting foundation model pretraining to handle the non-stationarity of financial markets through online learning, continual pretraining with replay buffers, and elastic weight consolidation (Chapter 284) to prevent catastrophic forgetting.
-
Reinforcement learning from market feedback: Using trading PnL as a reward signal to fine-tune foundation models through reinforcement learning (extending the RLHF approach from Chapter 285), creating models that directly optimize for risk-adjusted returns rather than forecast accuracy.
-
Federated foundation models: Training financial foundation models across multiple institutions without sharing proprietary data, using federated learning techniques that preserve the privacy of individual firm’s alpha signals while building shared representations of market dynamics.
-
Causal foundation models: Incorporating causal reasoning into foundation model architectures, enabling them to distinguish between predictive correlations and causal relationships in market data — critical for robust strategy development that survives regime changes.
10.2 Synthesis: The Arc of Machine Learning for Trading
This book has traced the evolution of machine learning approaches to algorithmic trading across 365 chapters, from classical statistical methods through deep learning to foundation models. Several overarching themes emerge:
- Complexity vs. robustness: Simpler models with strong inductive biases (LNNs, conformal prediction) often outperform complex models (large Transformers) in out-of-sample trading, due to the low signal-to-noise ratio in financial data.
- Uncertainty awareness: Methods that quantify prediction uncertainty (Chapters 330-331) are essential for risk management and position sizing, regardless of the underlying model architecture.
- Domain adaptation: General-purpose models must be carefully adapted to financial markets (Chapter 284-285), where adversarial dynamics, non-stationarity, and regime changes create unique challenges.
- The human-AI partnership: The most effective trading systems combine algorithmic precision with human judgment, using interpretable models (LNNs, LLM agents) that support rather than replace human decision-making.
10.3 References
-
Das, A., Kong, W., Sen, R., & Zhou, Y. (2024). “A Decoder-Only Foundation Model for Time-Series Forecasting.” Proceedings of the 41st International Conference on Machine Learning (ICML).
-
Woo, G., Liu, C., Kumar, A., Xiong, C., Savarese, S., & Sahoo, D. (2024). “Unified Training of Universal Time Series Forecasting Transformers.” Proceedings of the 41st ICML.
-
Ansari, A.F., Stella, L., Turkmen, C., Zhang, X., Mercado, P., Shen, H., Shchur, O., Rangapuram, S.S., Arango, S.P., Kapoor, S., et al. (2024). “Chronos: Learning the Language of Time Series.” arXiv preprint arXiv:2403.07815.
-
Rasul, K., Ashok, A., Williams, A.R., Khorasani, A., Adamopoulos, G., Bhatt, R., Schlemper, J., Moez, F., & Guo, Y. (2024). “Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting.” Proceedings of the 38th AAAI Conference on Artificial Intelligence.
-
Yang, H., Liu, X.Y., & Wang, C.D. (2023). “FinGPT: Open-Source Financial Large Language Models.” FinLLM Symposium at IJCAI 2023.
-
Bommasani, R., Hudson, D.A., Adeli, E., et al. (2021). “On the Opportunities and Risks of Foundation Models.” arXiv preprint arXiv:2108.07258.
-
Li, X., Sun, L., Ling, X., & Peng, Y. (2024). “Can Large Language Models Beat Wall Street? Unveiling the Potential of AI in Stock Selection.” arXiv preprint arXiv:2401.03737.