Chapter 254: Text Summarization Finance
Introduction
Text summarization in finance is the task of automatically condensing lengthy financial documents into concise, information-dense summaries that capture the most critical facts and insights. Financial markets generate an enormous volume of textual data every day: periodic filings (annual 10-K and quarterly 10-Q reports), analyst notes, central bank minutes, news articles, and other regulatory disclosures. A single 10-K filing can exceed 100 pages, making it impractical for portfolio managers and traders to read every document in full. Automatic summarization bridges this gap by extracting or generating the key takeaways in seconds.
For algorithmic traders, text summarization serves as a critical upstream component in NLP-driven trading pipelines. Rather than feeding raw, verbose documents into sentiment classifiers or signal generators, summarization first distills the text to its essential content. This reduces noise, lowers computational cost, and improves the signal-to-noise ratio of downstream models. A well-crafted summary of an earnings call can reveal whether management is optimistic or cautious, whether revenue beat expectations, and whether guidance was raised or lowered, all in a few sentences.
This chapter presents a complete framework for financial text summarization. We cover both extractive and abstractive approaches, the mathematical foundations behind sentence scoring and transformer-based generation, and a working Rust implementation that connects to the Bybit cryptocurrency exchange to demonstrate how summarization-derived sentiment can drive trading signals.
Key Concepts
Extractive Summarization
Extractive summarization selects the most important sentences from the original document and concatenates them to form a summary. No new text is generated; the summary is a subset of the original sentences. This approach is attractive because it preserves factual accuracy and avoids hallucination.
Given a document $D = \{s_1, s_2, \ldots, s_n\}$ consisting of $n$ sentences, extractive summarization assigns a relevance score $r(s_i)$ to each sentence and selects the top-$k$ sentences:
$$S^* = \underset{S \subseteq D, |S| = k}{\arg\max} \sum_{s_i \in S} r(s_i)$$
The scoring function $r(s_i)$ can incorporate multiple features: position in the document, TF-IDF weight, overlap with the title, presence of named entities, and sentence length. A weighted combination produces the final score:
$$r(s_i) = \sum_{j=1}^{m} w_j \cdot f_j(s_i)$$
where $f_j$ are feature functions and $w_j$ are learned or hand-tuned weights.
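The weighted-combination and top-$k$ selection steps above can be sketched as follows. The feature functions and weights here are illustrative stand-ins, not the chapter's actual implementation; position and length scores follow the definitions given later in the feature-engineering section.

```rust
// Sketch: r(s_i) = Σ_j w_j · f_j(s_i), then select the top-k sentences.

fn position_score(index: usize, total: usize) -> f64 {
    // Earlier sentences score higher: f_pos(s_i) = 1 - i/n.
    1.0 - index as f64 / total as f64
}

fn length_score(words: usize, max_len: usize) -> f64 {
    // Reward length up to a cap: f_len(s) = min(|s|, L_max) / L_max.
    words.min(max_len) as f64 / max_len as f64
}

/// Weighted combination of features with illustrative hand-tuned weights w_j.
fn relevance(index: usize, total: usize, words: usize) -> f64 {
    let weights = [0.6, 0.4]; // w_1: position, w_2: length (hypothetical values)
    weights[0] * position_score(index, total) + weights[1] * length_score(words, 20)
}

/// Indices of the top-k sentences by score, returned in document order.
fn top_k_sentences(scores: &[f64], k: usize) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..scores.len()).collect();
    idx.sort_by(|&a, &b| scores[b].partial_cmp(&scores[a]).unwrap());
    idx.truncate(k);
    idx.sort(); // restore original document order for readability
    idx
}

fn main() {
    let word_counts = [25, 8, 18, 5, 22]; // words per sentence
    let scores: Vec<f64> = word_counts
        .iter()
        .enumerate()
        .map(|(i, &w)| relevance(i, word_counts.len(), w))
        .collect();
    println!("selected sentence indices: {:?}", top_k_sentences(&scores, 2));
}
```

Re-sorting the selected indices matters in practice: a summary reads far better when sentences appear in their original order.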
Abstractive Summarization
Abstractive summarization generates new text that captures the meaning of the original document, potentially using words and phrases not present in the source. This mirrors how a human analyst would write a summary: reading the document, understanding the content, and paraphrasing the key points.
Modern abstractive summarization relies on encoder-decoder architectures with attention. Given an input sequence $\mathbf{x} = (x_1, \ldots, x_n)$, the encoder produces hidden states $\mathbf{h} = (h_1, \ldots, h_n)$. The decoder generates the summary token by token, attending to the encoder states:
$$\alpha_{t,i} = \frac{\exp(e_{t,i})}{\sum_{j=1}^{n} \exp(e_{t,j})}$$
$$\mathbf{c}_t = \sum_{i=1}^{n} \alpha_{t,i} \mathbf{h}_i$$
where $e_{t,i} = \mathbf{v}^T \tanh(\mathbf{W}_h \mathbf{h}_i + \mathbf{W}_s \mathbf{s}_t + \mathbf{b})$ is the attention energy, $\mathbf{s}_t$ is the decoder state, and $\mathbf{c}_t$ is the context vector used to predict the next output token.
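For concreteness, the softmax over attention energies and the resulting context vector can be computed as below. This is a deliberately minimal sketch with scalar (1-D) encoder states; real models use vector-valued hidden states and the learned energy function above.

```rust
// α_{t,i} = softmax(e_{t,i}) and c_t = Σ_i α_{t,i} h_i, with scalar h_i for brevity.

/// Softmax over attention energies for one decoder step.
fn attention_weights(energies: &[f64]) -> Vec<f64> {
    // Subtract the max energy before exponentiating for numerical stability.
    let max = energies.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = energies.iter().map(|e| (e - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Context vector: weighted sum of (here, scalar) encoder hidden states.
fn context(weights: &[f64], hidden: &[f64]) -> f64 {
    weights.iter().zip(hidden).map(|(a, h)| a * h).sum()
}

fn main() {
    let energies = [2.0, 1.0, 0.1]; // e_{t,i} from the energy function
    let alpha = attention_weights(&energies);
    let c_t = context(&alpha, &[0.5, -0.2, 0.9]);
    println!("alpha = {:?}, c_t = {:.4}", alpha, c_t);
}
```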
Financial Document Structure
Financial documents have a distinctive structure that can be exploited for better summarization:
- 10-K/10-Q filings: Contain standardized sections (Risk Factors, MD&A, Financial Statements). The Management's Discussion and Analysis (MD&A) section is typically the most information-dense for trading signals.
- Earnings call transcripts: Feature a prepared remarks section followed by Q&A. The Q&A section often contains forward-looking statements that move markets.
- Analyst reports: Include a recommendation, price target, and detailed analysis. The executive summary and recommendation changes are the most actionable sections.
- Central bank minutes: Use carefully calibrated language where subtle word changes signal policy shifts.
Understanding this structure allows summarization systems to weight sections appropriately rather than treating the document as a flat sequence of sentences.
ROUGE Metrics
Recall-Oriented Understudy for Gisting Evaluation (ROUGE) is the standard family of metrics for evaluating summarization quality. ROUGE measures the overlap between a generated summary and one or more reference summaries.
ROUGE-1 measures unigram overlap:
$$\text{ROUGE-1} = \frac{\sum_{s \in \text{Ref}} \sum_{w \in s} \min(\text{count}_{\text{gen}}(w), \text{count}_{\text{ref}}(w))}{\sum_{s \in \text{Ref}} \sum_{w \in s} \text{count}_{\text{ref}}(w)}$$
ROUGE-2 measures bigram overlap, capturing fluency and word ordering:
$$\text{ROUGE-2} = \frac{\sum_{s \in \text{Ref}} \sum_{b \in s} \min(\text{count}_{\text{gen}}(b), \text{count}_{\text{ref}}(b))}{\sum_{s \in \text{Ref}} \sum_{b \in s} \text{count}_{\text{ref}}(b)}$$
ROUGE-L measures the longest common subsequence (LCS):
$$\text{ROUGE-L} = \frac{(1 + \beta^2) R_{lcs} P_{lcs}}{R_{lcs} + \beta^2 P_{lcs}}$$
where $R_{lcs} = \text{LCS}(X, Y) / |Y|$ and $P_{lcs} = \text{LCS}(X, Y) / |X|$.
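A minimal ROUGE-1 computation against a single reference, using the clipped counts from the formula above, might look like this. Tokenization here is naive whitespace splitting; a production evaluator would also normalize case, punctuation, and possibly stem words.

```rust
use std::collections::HashMap;

// ROUGE-1 recall against one reference: Σ_w min(count_gen(w), count_ref(w)) / Σ_w count_ref(w).

fn counts(text: &str) -> HashMap<&str, usize> {
    let mut m = HashMap::new();
    for w in text.split_whitespace() {
        *m.entry(w).or_insert(0) += 1;
    }
    m
}

fn rouge_1(generated: &str, reference: &str) -> f64 {
    let gen = counts(generated);
    let refc = counts(reference);
    let total: usize = refc.values().sum();
    if total == 0 {
        return 0.0;
    }
    // Clipped overlap: each reference unigram is matched at most count_gen(w) times.
    let overlap: usize = refc
        .iter()
        .map(|(w, &c)| c.min(*gen.get(w).unwrap_or(&0)))
        .sum();
    overlap as f64 / total as f64
}

fn main() {
    let score = rouge_1(
        "revenue grew 10 percent",
        "revenue grew 10 percent this quarter",
    );
    println!("ROUGE-1 = {:.3}", score);
}
```

ROUGE-2 follows the same pattern over adjacent token pairs instead of single tokens.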
ML Approaches
TF-IDF Based Extraction
Term Frequency-Inverse Document Frequency weights words by their importance within a document relative to a corpus. For a word $w$ in document $d$ from corpus $C$:
$$\text{TF}(w, d) = \frac{f_{w,d}}{\sum_{w' \in d} f_{w',d}}$$
$$\text{IDF}(w, C) = \log \frac{|C|}{1 + |\{d \in C : w \in d\}|}$$
$$\text{TF-IDF}(w, d, C) = \text{TF}(w, d) \cdot \text{IDF}(w, C)$$
For sentence scoring, the TF-IDF score of a sentence is the average TF-IDF of its constituent words:
$$\text{score}_{\text{tfidf}}(s) = \frac{1}{|s|} \sum_{w \in s} \text{TF-IDF}(w, d, C)$$
Sentences with high TF-IDF scores contain words that are important to the document but rare across the corpus, making them strong candidates for inclusion in the summary.
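The three formulas above translate directly into code. The following sketch uses whitespace tokenization and the smoothed IDF (the $+1$ in the denominator) as defined in the text; the chapter's TfIdfVectorizer precomputes a vocabulary instead of rescanning the corpus per word.

```rust
// TF, smoothed IDF, TF-IDF, and the average-TF-IDF sentence score.

fn tf(word: &str, doc: &str) -> f64 {
    let tokens: Vec<&str> = doc.split_whitespace().collect();
    let f = tokens.iter().filter(|&&w| w == word).count() as f64;
    f / tokens.len() as f64
}

fn idf(word: &str, corpus: &[&str]) -> f64 {
    // Document frequency: number of corpus documents containing the word.
    let df = corpus
        .iter()
        .filter(|d| d.split_whitespace().any(|w| w == word))
        .count() as f64;
    (corpus.len() as f64 / (1.0 + df)).ln()
}

fn tf_idf(word: &str, doc: &str, corpus: &[&str]) -> f64 {
    tf(word, doc) * idf(word, corpus)
}

/// Sentence score: average TF-IDF of its words, as in score_tfidf(s).
fn sentence_score(sentence: &str, doc: &str, corpus: &[&str]) -> f64 {
    let words: Vec<&str> = sentence.split_whitespace().collect();
    words.iter().map(|w| tf_idf(w, doc, corpus)).sum::<f64>() / words.len() as f64
}

fn main() {
    let corpus = ["revenue rose sharply", "costs rose", "guidance was raised"];
    let s = sentence_score("revenue rose", corpus[0], &corpus);
    println!("sentence score = {:.4}", s);
}
```

Note that with this smoothing a word appearing in $|C| - 1$ or more documents gets a non-positive IDF, which is exactly the desired behavior: ubiquitous words contribute nothing to sentence relevance.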
TextRank for Sentence Selection
TextRank adapts the PageRank algorithm to sentence extraction. Sentences are nodes in a graph, and edge weights represent similarity between sentences:
$$\text{sim}(s_i, s_j) = \frac{|\{w : w \in s_i \cap s_j\}|}{\log|s_i| + \log|s_j|}$$
The TextRank score of each sentence is computed iteratively:
$$\text{TR}(s_i) = (1 - d) + d \sum_{s_j \in \text{adj}(s_i)} \frac{\text{sim}(s_i, s_j)}{\sum_{s_k \in \text{adj}(s_j)} \text{sim}(s_j, s_k)} \text{TR}(s_j)$$
where $d = 0.85$ is the damping factor. After convergence, the highest-scoring sentences are selected for the summary.
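The iteration above is a damped power iteration over the sentence-similarity graph and can be sketched as follows. Building the similarity matrix from raw sentences (via the overlap formula) is omitted for brevity; the matrix values below are illustrative.

```rust
// TextRank power iteration: TR(s_i) = (1-d) + d Σ_j sim(s_j, s_i)/Σ_k sim(s_j, s_k) · TR(s_j).

fn textrank(sim: &[Vec<f64>], damping: f64, iterations: usize) -> Vec<f64> {
    let n = sim.len();
    let mut scores = vec![1.0; n];
    // Total out-weight of each node, used to normalize its contributions.
    let out: Vec<f64> = sim.iter().map(|row| row.iter().sum()).collect();
    for _ in 0..iterations {
        let mut next = vec![0.0; n];
        for i in 0..n {
            let mut acc = 0.0;
            for j in 0..n {
                if j != i && out[j] > 0.0 {
                    acc += sim[j][i] / out[j] * scores[j];
                }
            }
            next[i] = (1.0 - damping) + damping * acc;
        }
        scores = next;
    }
    scores
}

fn main() {
    // Symmetric similarity matrix for three sentences (illustrative values).
    let sim = vec![
        vec![0.0, 0.6, 0.2],
        vec![0.6, 0.0, 0.4],
        vec![0.2, 0.4, 0.0],
    ];
    let scores = textrank(&sim, 0.85, 30);
    println!("TextRank scores: {:?}", scores);
}
```

A fixed iteration count stands in for a proper convergence check (stop when the score change falls below a tolerance), which is how the original algorithm is specified.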
Transformer-Based Abstractive Summarization
Pre-trained transformer models such as BART, T5, and Pegasus have set new benchmarks for abstractive summarization. These models use the encoder-decoder architecture with multi-head self-attention:
$$\text{Attention}(Q, K, V) = \text{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right) V$$
$$\text{MultiHead}(Q, K, V) = \text{Concat}(\text{head}_1, \ldots, \text{head}_h) W^O$$
where $\text{head}_i = \text{Attention}(QW_i^Q, KW_i^K, VW_i^V)$. Fine-tuning these models on financial text (e.g., using the FNS 2021 dataset or SEC filings) produces summaries that capture domain-specific terminology and nuances.
Feature Engineering
Sentence-Level Features
Effective extractive summarization relies on well-designed sentence features:
- Position score: Sentences early in a document or paragraph tend to be more important. For sentence at position $i$ in a document of $n$ sentences: $f_{pos}(s_i) = 1 - i/n$.
- Length score: Very short sentences are penalized, and the benefit of additional length is capped so that long sentences do not dominate: $f_{len}(s) = \min(|s|, L_{max}) / L_{max}$ where $L_{max}$ is a threshold.
- TF-IDF score: Average TF-IDF of words in the sentence, as defined above.
- Named entity density: Sentences with more named entities (companies, people, locations) tend to be more informative: $f_{NE}(s) = |\text{NE}(s)| / |s|$.
Document-Level Features
Beyond individual sentences, document-level context matters:
- Section importance: In a 10-K, the MD&A section has higher weight than boilerplate risk disclosures. Section weights can be learned from data or set by domain experts.
- Document type: Earnings calls require different treatment than regulatory filings. The summarization model can adapt its feature weights based on document type.
- Temporal references: Sentences containing forward-looking language (“we expect”, “guidance for next quarter”) are often more market-moving than backward-looking statements.
Financial-Specific Features
Finance-domain summarization benefits from specialized features:
- Financial entity recognition: Identifying mentions of revenue, EPS, EBITDA, and other key metrics. Sentences containing these entities are high-priority candidates.
- Sentiment in sentences: Individual sentence sentiment provides a gradient of positivity/negativity. Strongly positive or negative sentences are more likely to be summary-worthy.
- Numerical density: In financial documents, sentences with more numbers (dollar amounts, percentages, ratios) tend to carry concrete, actionable information: $f_{num}(s) = |\text{numbers}(s)| / |s|$.
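Two of these finance-specific features are simple enough to sketch directly. The keyword lexicon below is a tiny illustrative stand-in; a real system would use a curated financial vocabulary, and the "contains a digit" heuristic for numbers is an assumption that happens to cover forms like "$4.2B" and "12%".

```rust
// Keyword density over a small illustrative lexicon, and f_num(s) = |numbers(s)| / |s|.

const FIN_KEYWORDS: [&str; 6] = ["revenue", "eps", "ebitda", "guidance", "profit", "loss"];

fn keyword_density(sentence: &str) -> f64 {
    let tokens: Vec<String> = sentence
        .split_whitespace()
        // Strip leading/trailing punctuation and lowercase before matching.
        .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_lowercase())
        .collect();
    let hits = tokens
        .iter()
        .filter(|t| FIN_KEYWORDS.contains(&t.as_str()))
        .count();
    hits as f64 / tokens.len() as f64
}

fn numerical_density(sentence: &str) -> f64 {
    let tokens: Vec<&str> = sentence.split_whitespace().collect();
    // A token counts as numeric if it contains any digit ("$4.2B", "12%", "2024").
    let nums = tokens
        .iter()
        .filter(|t| t.chars().any(|c| c.is_ascii_digit()))
        .count();
    nums as f64 / tokens.len() as f64
}

fn main() {
    let s = "Revenue rose 12% to $4.2B and EPS beat guidance";
    println!("keyword density:   {:.3}", keyword_density(s));
    println!("numerical density: {:.3}", numerical_density(s));
}
```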
Applications
Earnings Report Summarization
Periodic filings (annual 10-Ks and quarterly 10-Qs) are the most common target for financial summarization. A 10-K can span 80-150 pages, but the market-moving information is concentrated in a few key sections. An automatic summarizer can:
- Extract the most important sentences from the MD&A section.
- Highlight changes in revenue, earnings, and guidance.
- Identify risk factors that have been added or removed since the previous filing.
- Generate a 5-10 sentence summary that a portfolio manager can read in under a minute.
News Summarization for Trading
Financial news feeds produce hundreds of articles per hour. For algorithmic trading, real-time summarization enables:
- Speed: Summarize breaking news in milliseconds, faster than human reading.
- Scale: Process thousands of articles simultaneously across multiple asset classes.
- Signal extraction: Distill each article to its sentiment and key facts, then aggregate across sources to produce a trading signal.
The summary sentiment is computed and mapped to a trading action: strongly positive summaries trigger buy signals, strongly negative summaries trigger sell signals, and neutral summaries suggest holding.
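The threshold-based mapping just described is a small piece of logic worth making explicit. The threshold values below are illustrative, not tuned:

```rust
// Map a summary sentiment score in [-1, 1] to a trading action via thresholds.

#[derive(Debug, PartialEq)]
enum Signal {
    Buy,
    Sell,
    Hold,
}

fn signal_from_sentiment(score: f64, buy_threshold: f64, sell_threshold: f64) -> Signal {
    if score >= buy_threshold {
        Signal::Buy
    } else if score <= sell_threshold {
        Signal::Sell
    } else {
        Signal::Hold // the neutral band between thresholds
    }
}

fn main() {
    // Illustrative symmetric thresholds at ±0.3.
    println!("{:?}", signal_from_sentiment(0.5, 0.3, -0.3));
    println!("{:?}", signal_from_sentiment(-0.6, 0.3, -0.3));
    println!("{:?}", signal_from_sentiment(0.1, 0.3, -0.3));
}
```

Keeping a neutral band between the two thresholds is deliberate: it prevents the strategy from churning trades on weak, noisy sentiment readings.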
Analyst Report Condensation
Sell-side analyst reports often span 20-50 pages with detailed industry analysis. Portfolio managers need the key takeaways: recommendation changes, price target updates, and catalysts. Summarization systems can extract these elements automatically, enabling fund managers to track hundreds of analysts’ views without reading every report.
Rust Implementation
Our Rust implementation provides a complete text summarization toolkit with the following components:
TfIdfVectorizer
The TfIdfVectorizer struct computes term frequency-inverse document frequency scores for words across a corpus of documents. It builds a vocabulary from the corpus, computes term frequencies per document, and calculates IDF values across the entire corpus. The vectorizer exposes methods for scoring individual sentences based on their average TF-IDF weight.
SentenceScorer
The SentenceScorer struct combines multiple features to rank sentences for extractive summarization. It computes position scores (earlier sentences rank higher), TF-IDF scores, length scores (penalizing too-short and too-long sentences), financial keyword density (detecting mentions of revenue, profit, growth, loss), and numerical density (sentences with more numbers are prioritized in financial contexts).
TextSummarizer
The TextSummarizer struct performs extractive summarization by splitting a document into sentences, scoring each sentence using SentenceScorer, and selecting the top-$k$ sentences ordered by their original position in the document.
SentimentScorer
The SentimentScorer provides simple lexicon-based sentiment analysis using lists of positive and negative financial words. It computes a sentiment score in the range $[-1, 1]$ for each sentence or summary.
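A minimal lexicon-based scorer in this spirit is shown below. The word lists are tiny illustrative stand-ins for a real financial lexicon (e.g. Loughran-McDonald), and the score is the standard normalized difference $(\text{pos} - \text{neg}) / (\text{pos} + \text{neg})$, which lies in $[-1, 1]$ by construction:

```rust
// Lexicon-based sentiment: (positive - negative) / (positive + negative) in [-1, 1].

const POSITIVE: [&str; 4] = ["growth", "beat", "strong", "raised"];
const NEGATIVE: [&str; 4] = ["decline", "miss", "weak", "lowered"];

fn sentiment(text: &str) -> f64 {
    let (mut pos, mut neg) = (0usize, 0usize);
    for w in text.split_whitespace().map(|w| w.to_lowercase()) {
        if POSITIVE.contains(&w.as_str()) {
            pos += 1;
        } else if NEGATIVE.contains(&w.as_str()) {
            neg += 1;
        }
    }
    if pos + neg == 0 {
        0.0 // no sentiment-bearing words: neutral
    } else {
        (pos as f64 - neg as f64) / (pos + neg) as f64
    }
}

fn main() {
    println!("{:.2}", sentiment("strong growth and guidance raised"));
    println!("{:.2}", sentiment("weak quarter with a miss"));
}
```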
SummaryTrader
The SummaryTrader struct generates trading signals from summary sentiment. It maps sentiment scores to buy, sell, or hold actions using configurable thresholds.
BybitClient
The BybitClient struct provides async HTTP access to the Bybit V5 API. It fetches kline (candlestick) data from the /v5/market/kline endpoint. The client handles response parsing, error handling, and rate limiting considerations.
Bybit API Integration
The implementation connects to Bybit’s V5 REST API to obtain real-time market data:
- Kline endpoint (/v5/market/kline): Provides OHLCV candlestick data at configurable intervals. Used for combining price data with summarization-derived sentiment signals.
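As a dependency-free sketch, the kline request URL can be assembled from its query parameters as below. The endpoint and parameter names (category, symbol, interval, limit) follow the public Bybit V5 API; the actual async HTTP call and JSON response parsing handled by BybitClient are omitted here.

```rust
// Build a Bybit V5 kline request URL from its query parameters.

fn kline_url(category: &str, symbol: &str, interval: &str, limit: u32) -> String {
    format!(
        "https://api.bybit.com/v5/market/kline?category={}&symbol={}&interval={}&limit={}",
        category, symbol, interval, limit
    )
}

fn main() {
    // Example: 1-minute spot klines for BTCUSDT, up to 200 candles.
    let url = kline_url("spot", "BTCUSDT", "1", 200);
    println!("{}", url);
}
```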
The Bybit API is well-suited for this application because it provides:
- Fine-grained intervals (1-minute klines for high-frequency analysis)
- Consistent, low-latency responses suitable for real-time trading systems
- Free access without API key requirements for public market data
References
- El-Haj, M., Rayson, P., Walker, M., Young, S., & Sherring, V. (2020). Financial report summarization with pre-trained models. arXiv preprint arXiv:2011.06956.
- Mihalcea, R., & Tarau, P. (2004). TextRank: Bringing order into texts. Proceedings of EMNLP, 404-411.
- Lewis, M., Liu, Y., Goyal, N., Ghazvininejad, M., Mohamed, A., Levy, O., … & Zettlemoyer, L. (2020). BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. ACL 2020, 7871-7880.
- Lin, C. Y. (2004). ROUGE: A package for automatic evaluation of summaries. Text Summarization Branches Out, 74-81.
- Moradi, M., Dorffner, G., & Samwald, M. (2020). Deep contextualized embeddings for quantifying the informative content in biomedical text summarization. Computer Methods and Programs in Biomedicine, 184, 105117.