Sentiment as Signal: Forecasting with Alternative Data and Generative AI

Quantitative trading based on market sentiment is a less developed area compared to traditional approaches. With the explosion of social media, advances in computing resources, and AI technology, sentiment-based trading is making progress. In this post, I will explore some aspects of sentiment trading.

Using ChatGPT to Extract Market Sentiment for Commodity Trading

A Large Language Model (LLM) is an advanced AI system trained on vast amounts of text data to understand, generate, and analyze human language. In finance, LLMs are used for tasks like analyzing earnings reports, generating market sentiment analysis, automating financial research, and enhancing algorithmic trading strategies.

Reference [1] examines the effectiveness of ChatGPT in predicting commodity returns. Specifically, it extracts commodity news information and forecasts commodity futures returns. The study gathers over 2.5 million articles related to the commodity market from nine international newspapers across three countries, covering a diverse set of 18 commodities.

Findings

-A novel Commodity News Ratio Index (CNRI) was developed using ChatGPT, derived from the analysis of more than 2.5 million news articles from nine international newspapers across 18 commodities.

-The CNRI effectively forecasts commodity futures excess returns over 1- to 12-month periods, demonstrating significant predictive power in both in-sample and out-of-sample regression analyses.

-ChatGPT was used to classify sentiment in commodity-related news as either positive or negative, based on headlines, abstracts, or full article content.

-The CNRI shows stronger forecasting accuracy during specific macroeconomic conditions—particularly economic expansions, contango market phases, and periods of declining inflation.

-This ChatGPT-based approach outperforms traditional text analysis methods, including BERT and Bag-of-Words, in predicting future returns in commodity markets.

-The study controlled for various business variables and economic indicators, confirming the independent predictive significance of the CNRI.

-Results indicate that the CNRI also holds macroeconomic insight, offering valuable signals on broader economic performance beyond commodity markets.

-Findings affirm the utility of ChatGPT in financial forecasting, showcasing the broader potential of LLMs in understanding and extracting actionable intelligence from complex financial text data.

-This research highlights the growing role of AI in finance, illustrating how LLMs can enhance decision-making for investors, analysts, and risk managers alike.

In short, ChatGPT proves useful in forecasting commodity market dynamics and provides valuable insights for investors and risk managers.
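To make the index construction more concrete, here is a minimal sketch of how a news ratio index could be built from classified articles. The paper's exact CNRI formula is not reproduced in this post, so the net-share formula below, (positive - negative) / (positive + negative), is an assumption for illustration only. The article records, the labels, and the function name `news_ratio_index` are all hypothetical; in the study the labels come from ChatGPT.

```python
from collections import defaultdict

# Hypothetical classified articles: (month, commodity, label).
# In the study the labels come from ChatGPT; here they are hard-coded.
articles = [
    ("2024-01", "crude_oil", "positive"),
    ("2024-01", "crude_oil", "negative"),
    ("2024-01", "crude_oil", "positive"),
    ("2024-02", "crude_oil", "negative"),
    ("2024-02", "crude_oil", "negative"),
    ("2024-02", "crude_oil", "positive"),
    ("2024-02", "crude_oil", "negative"),
]


def news_ratio_index(rows):
    """Return {(month, commodity): (pos - neg) / (pos + neg)}.

    The net-share formula is an assumption for illustration, not the
    paper's definition of the CNRI.
    """
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for month, commodity, label in rows:
        counts[(month, commodity)][label] += 1
    index = {}
    for key, c in counts.items():
        total = c["positive"] + c["negative"]
        index[key] = (c["positive"] - c["negative"]) / total
    return index


index = news_ratio_index(articles)
for key in sorted(index):
    print(key, round(index[key], 3))
```

A series like this, one value per commodity per month, is the kind of predictor that would then enter a forecasting regression for futures excess returns.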

Reference

[1] Shen Gao, Shijie Wang, Yuanzhi Wang, Qunzi Zhang, ChatGPT and Commodity Return, Journal of Futures Markets, 2025; 1–15

Using the Number of Confirmed Covid Cases as a Sentiment Indicator

COVID-19, the novel coronavirus, became a source of anxiety for markets and individuals around the world after its outbreak in December 2019. Many traders looked for ways to use information on the spread of the virus to predict market movements.

In Reference [2], the authors built an intraday algorithmic trading system for the Eurostoxx 50 futures market. The system opens a short position if the number of new confirmed Covid-19 cases increased on the previous day (suggesting that fear of the epidemic is rising) and a long position if the number decreased. In both cases, the position is closed by the afternoon. The trading system achieved an annual return of 423% and a Sharpe ratio of 4.74.

Findings

-Daily confirmed COVID-19 cases were used as a sentiment proxy, reflecting public fear and uncertainty in financial markets during the pandemic.

-Researchers built an intraday trading system for Eurostoxx 50 futures, responding to increases or decreases in new Covid-19 cases reported the previous day.

-The system opened short positions after rising case counts and long positions after declines, closing trades by the afternoon to reduce overnight exposure.

-This simple rule-based strategy delivered an annual return of 423% and a Sharpe ratio of 4.74, suggesting strong performance under extreme market stress.

-The study demonstrated that pandemic-related health data could serve as a reliable short-term predictor of market direction, especially during crisis periods.

-Results reinforce the idea that emotional triggers—like health fears—can impact trading behavior just as much as traditional economic indicators or financial models.

-During high-uncertainty environments, metrics that reflect collective anxiety, such as COVID-19 cases, can outperform classic sentiment tools like the VIX index.

-The strategy showed how non-financial data can be directly translated into market actions, offering practical tools for risk-aware investors and quant traders.

-Overall, the research contributes to behavioral finance by quantifying the influence of fear on asset prices in moments of extreme public concern.

The article presented new evidence that emotions have an impact on financial markets, especially in situations of extreme uncertainty. In these situations, investors may utilize a variety of investment techniques based on metrics reflecting the progression of fear.
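The trading rule is simple enough to sketch in a few lines. The example below is only an illustration of the logic described above, not the authors' implementation: the case counts and intraday returns are invented, the function names `fear_signal` and `backtest` are hypothetical, and staying flat when the case count is unchanged is my own assumption, since the post does not say how ties are handled.

```python
# Hypothetical daily data: new confirmed cases and the futures'
# open-to-close return for the same session (all numbers invented).
new_cases = [100, 120, 110, 150, 150, 90]
intraday_returns = [0.001, -0.004, 0.003, -0.002, 0.001, 0.005]


def fear_signal(cases_two_days_ago, cases_yesterday):
    """-1 (short) if cases rose, +1 (long) if they fell, 0 if unchanged."""
    if cases_yesterday > cases_two_days_ago:
        return -1
    if cases_yesterday < cases_two_days_ago:
        return 1
    return 0  # assumption: no trade when the count is unchanged


def backtest(cases, returns):
    """Trade session t using the change in cases known up to day t-1."""
    pnl = []
    for t in range(2, len(cases)):
        signal = fear_signal(cases[t - 2], cases[t - 1])
        pnl.append(signal * returns[t])
    return pnl


daily_pnl = backtest(new_cases, intraday_returns)
print(daily_pnl)
```

The signal for each session uses only case counts reported before that session opens, which avoids look-ahead bias. Transaction costs and slippage are ignored here, and they would matter a great deal for a strategy that trades every day.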

Reference

[2] Gómez Martínez, R., Prado Román, C., & Cachón Rodríguez, G. (2021). Algorithmic trading based on the fear of Covid-19 in Europe, Harvard Deusto Business Research 10(2), 295-304.

Closing Thoughts

Together, these studies highlight the growing role of alternative data and AI-driven sentiment analysis in financial forecasting. From pandemic case counts to millions of news articles, both fear and information flow can shape markets in measurable ways. Whether through rule-based trading or LLM-powered indices, the findings underscore how emotion, uncertainty, and unstructured data are becoming key inputs in modern investment strategies.

Behavioral Biases and Retail Options Trading

Behavioral finance is important, but it’s not often discussed in quantitative trading. In this post, I explore some aspects of behavioral finance.

Why Do Investors Lose Money?

Behavioral finance is the study of how psychological, social, and cultural factors affect financial decisions and, through those decisions, market outcomes.

Behavioral finance research has shown that people do not always make rational decisions when it comes to money. Factors such as emotion, social pressure, and cognitive biases can all lead to suboptimal decisions. Reference [1] lists mistakes made by investors:

Findings

-Investors often fail to diversify adequately, exposing themselves to unnecessary idiosyncratic risk, which results in lower overall returns that could be avoided with simple diversification strategies.

-Many investors underperform the mutual funds they invest in due to poor timing decisions, such as buying high and selling low, which diminishes the benefits of professional fund management.

-The disposition effect leads investors to sell winning investments too early while holding onto losing ones for too long, negatively impacting portfolio performance over time.

-Investors who pay insufficient attention to markets or their portfolios tend to earn lower returns compared to more engaged and informed peers.

-Investment behavior is often reactive; individuals increase market exposure following strong returns and reduce it after losses, leading to suboptimal timing and missed opportunities.

-Home bias is prevalent among investors, who prefer local stocks despite lacking superior information about them, resulting in poor diversification and reduced portfolio efficiency.

-Overconfidence causes investors to trade excessively, and data shows that investors who trade frequently typically earn worse returns than those who trade less often.

-Herd behavior is common, with investors often buying or selling the same stocks simultaneously, which amplifies market inefficiencies and can harm returns.

-Many investors chase past performance, moving their money into funds that have recently performed well, often too late to benefit from continued outperformance.

-Despite the availability of lower-cost options, investors frequently allocate funds to expensive products, ignoring predictable performance characteristics and reducing overall investment efficiency.

In summary, the article is a good primer on behavioral finance. It discusses, in particular, the investment mistakes that cause investors to lose money.
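The first mistake on the list, inadequate diversification, can be illustrated with a standard textbook formula. For an equal-weighted portfolio of n stocks that all have the same volatility and the same pairwise correlation, the portfolio variance is vol^2 / n + (1 - 1/n) * correlation * vol^2. The sketch below is not taken from the article; the 30% volatility and 0.25 correlation are illustrative assumptions, and `portfolio_volatility` is a hypothetical helper.

```python
import math


def portfolio_volatility(n_stocks, stock_vol=0.30, correlation=0.25):
    """Volatility of an equal-weighted portfolio of identical stocks.

    variance = vol^2 / n + (1 - 1/n) * correlation * vol^2
    The 30% volatility and 0.25 correlation are illustrative assumptions.
    """
    idiosyncratic = stock_vol**2 / n_stocks
    systematic = (1 - 1 / n_stocks) * correlation * stock_vol**2
    return math.sqrt(idiosyncratic + systematic)


for n in (1, 5, 20, 100):
    print(n, round(portfolio_volatility(n), 4))
```

The first term shrinks toward zero as n grows, which is the idiosyncratic risk that diversification removes. The second term does not, so volatility levels off at sqrt(correlation) * vol rather than falling to zero.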

Reference

[1] Firth, Chris, An Introduction to Investment Mistakes (2015). SSRN 2609989

Retail Options Traders’ Behavior

Retail investors are individual, non-professional investors who buy and sell securities, such as stocks, options, and mutual funds, for their own accounts rather than for an organization or institution. Unlike institutional investors, who manage large sums of money on behalf of clients or large entities, retail investors typically trade in smaller quantities and often use online brokerage accounts to facilitate their transactions.

A considerable amount of research has been devoted to studying retail investors’ behavior. A recent paper by the CBOE [2] utilizes the exchange’s data and refutes some academic research findings.

Findings

-Retail investor participation in the options market increased notably from 18% to 31% between the fourth quarter of 2019 and the fourth quarter of 2023.

-Complex orders made up 58% to 76% of retail open positions, challenging the belief that retail traders primarily hold simple long positions.

-Academic studies often miss complex retail trading activity due to reliance on limited datasets or assumptions that overlook retail investor sophistication.

-Retail traders show a wider range of strategies than previously thought, including multi-leg options trades and hedging techniques, indicating greater versatility.

-The study found that the assumption that retail investors lack sophistication is outdated, as many use advanced tools and approaches for managing risk.

-The market maker order imbalance in SPX options narrowed from -14% in December 2016 to -12% in May 2023, even with increased use of 0DTE options.

-This decline in imbalance suggests the growth of 0DTE SPX options has not disrupted market maker order flow, contrary to popular belief.

-When SPX options are excluded, retail trading still represented 32% to 40% of all non-SPX options traded on the C1 exchange by notional value.

-The use of CBOE’s internal data offers a more accurate and complete view of retail investor behavior compared to earlier studies relying on proxies.

-Overall, the findings indicate retail investors are more active, strategic, and integral to the options market than traditional views have assumed.

This research by the CBOE, using more complete data, sheds light on the behavior of retail options traders. It provides more insight into the changing dynamics of the options markets.
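The statistics quoted above (participation share, share of complex orders, order imbalance) are all simple ratios over trade records. The sketch below shows one way they could be computed. It is not the CBOE's methodology: the trade records are invented, the helper names are hypothetical, volumes are counted in contracts rather than notional value or open positions, and the imbalance definition (buys - sells) / (buys + sells) is an assumption.

```python
# Hypothetical trade records: (participant, is_complex_order, side, contracts).
trades = [
    ("retail", True, "buy", 10),
    ("retail", True, "sell", 20),
    ("retail", False, "buy", 10),
    ("institution", False, "buy", 40),
    ("market_maker", False, "sell", 50),
    ("market_maker", False, "buy", 30),
]


def participant_share(rows, participant):
    """Fraction of all contracts traded by one participant type."""
    total = sum(r[3] for r in rows)
    return sum(r[3] for r in rows if r[0] == participant) / total


def complex_share(rows, participant):
    """Fraction of a participant's contracts that came from complex orders."""
    own = [r for r in rows if r[0] == participant]
    return sum(r[3] for r in own if r[1]) / sum(r[3] for r in own)


def order_imbalance(rows, participant):
    """(buys - sells) / (buys + sells); an assumed definition."""
    buys = sum(r[3] for r in rows if r[0] == participant and r[2] == "buy")
    sells = sum(r[3] for r in rows if r[0] == participant and r[2] == "sell")
    return (buys - sells) / (buys + sells)


print(participant_share(trades, "retail"))
print(complex_share(trades, "retail"))
print(order_imbalance(trades, "market_maker"))
```

The point of the CBOE paper is that these ratios are only as good as the participant labels behind them. Academic studies that infer retail activity from proxies may misclassify complex orders, which exchange data can identify directly.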

Reference

[2] Selina Han, Unveiling the Sophistication: Understanding Retail Investors’ Trading Behavior in the U.S. Options Market, May 2024, CBOE

Closing Thoughts

In summary, the first article serves as a solid introduction to behavioral finance, focusing on the common mistakes that lead to investor losses. The second article presents CBOE research that, using more complete data, offers a clearer view of retail options trading behavior and the evolving structure of the options market.

The Rise of 0DTE Options: Cause for Concern or Business as Usual?

Zero DTE (Days to Expiration) options are contracts that expire on the same day they are traded. They have been gaining popularity since 2022, when SPX options became available with expirations on every trading day. In this post, I discuss their impact on the market and how options traders use them.

Impact of Zero DTE Options on the Market

Zero DTE (0DTE) options, also known as “same-day expiration” options, are financial derivatives with expiration dates on the same day they are traded. These options offer traders the opportunity to profit from short-term price movements in the underlying asset.

The increase in the trading volume of 0DTE options has sparked various concerns among market participants and prompted intense discussions in the media. The main concern revolves around the potential destabilization of the underlying market due to large open positions in 0DTE and other short-term options.

Reference [1] examines whether 0DTE options significantly impact the market.

Findings

-The study finds no evidence that higher 0DTE options open interest gamma increases underlying index volatility; rather, it is linked to reduced intraday volatility levels.

-Volatility effects associated with 0DTE options do not extend into overnight sessions or impact lagged intraday volatility, suggesting limited temporal propagation.

-Positive shocks in 0DTE trading volume are followed by increased underlying market trading activity, although these effects are short-lived and not economically meaningful.

-Recent structural changes in the market have made underlying returns more sensitive to 0DTE trading volume shocks, yet the overall effect size remains minimal.

-The increased trading volume in 0DTE options does not lead to destabilizing market behaviors, challenging prevailing concerns in financial media and among market participants.

-On average, the difference in market response to 0DTE trading volume between earlier and more recent periods is only 0.1 standard deviations of absolute returns, indicating economic insignificance.

-Aggregate gamma in 0DTE options is inversely correlated with realized intraday volatility, suggesting a stabilizing rather than destabilizing influence on short-term market movements.

-The volatility risk premium associated with 0DTE options is notably high, reflecting the elevated compensation demanded by sellers for bearing short-term option risk.

-Contrary to expectations, 0DTE options do not amplify market moves through delta-hedging activity, despite their high gamma and rapid decay characteristics.

-The findings support the view that 0DTE options trading can coexist with stable market conditions, especially when managed within a robust market infrastructure.

In short, the paper concludes that 0DTE options do not destabilize the market. The increase in volume has an insignificant influence.
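The concern about 0DTE options stems from their high gamma. As a rough illustration, the sketch below computes the Black-Scholes gamma of an at-the-money option with one month and with one day to expiration. This is a textbook formula, not the model used in the paper, and the index level of 5000 and the 15% volatility are illustrative assumptions.

```python
import math


def bs_gamma(spot, strike, vol, years_to_expiry, rate=0.0):
    """Black-Scholes gamma of a European option (same for calls and puts)."""
    sqrt_t = math.sqrt(years_to_expiry)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * vol**2) * years_to_expiry) / (vol * sqrt_t)
    pdf = math.exp(-0.5 * d1**2) / math.sqrt(2 * math.pi)
    return pdf / (spot * vol * sqrt_t)


spot, vol = 5000.0, 0.15  # illustrative index level and volatility
gamma_one_month = bs_gamma(spot, spot, vol, 21 / 252)
gamma_one_day = bs_gamma(spot, spot, vol, 1 / 252)

print(gamma_one_month, gamma_one_day, gamma_one_day / gamma_one_month)
```

At-the-money gamma scales roughly with one over the square root of time to expiry, so it rises sharply as expiration approaches. High gamma per contract, however, says nothing about the net position that dealers must hedge, which is what the paper's findings turn on.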

Reference

[1] Dim, Chukwuma, Eraker, Bjorn, and Vilkov, Grigory, 0DTEs: Trading, Gamma Risk and Volatility Propagation (2024).

Risk, Timing, and Strategy: Key Differences in 0DTE Options Trading Styles

The previous paper discussed the impact of 0DTE options on the market, drawing from both practitioner insights and academic literature. Both sources point to the conclusion that 0DTE options have little or no impact on the market; they do not increase market volatility, contrary to what many investors have argued.

The CBOE recently updated its report [2] with new data, which reconfirmed that 0DTE options have little or no impact.

Findings

-Updated CBOE data confirms that zero-DTE options have minimal impact on market volatility, countering common concerns that high trading volumes lead to destabilization.

-The market risk from zero-DTE options depends on the balance of buying and selling, rather than the notional trading volume, which is typically well-distributed.

-SPX zero-DTE options show balanced flow between puts and calls, keeping the put/call ratio near one, unlike non-0DTE options, which are more skewed toward hedging.

-Due to this balanced flow, net gamma exposure from zero-DTE options remains minimal, reducing concerns about market maker-driven volatility.

-Both institutional and retail investors use zero-DTE options for tactical bets and systematic yield strategies, highlighting their broad appeal and diverse applications.

-Institutional investors prefer vertical spreads and tend to initiate trades earlier in the day, maintaining positions longer, suggesting greater risk capacity or hedging alternatives.

-Retail traders are more active during the market open and close, often engaging in complex strategies like iron condors and butterflies, reflecting hands-on risk management.

-The intraday behavior of retail traders—frequent opening and closing of positions—implies lower risk tolerance and a more active trading style.

-Despite using similar option structures, institutional and retail investors differ significantly in execution timing and approach to managing market exposure.

-The findings underscore that strategy similarity does not equate to identical trading behavior, with risk management practices and timing being key differentiators between investor types.

The report highlights that while the strategies are broadly similar, the approach to timing and risk management differs meaningfully between institutional and retail investors.
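The argument that risk depends on the balance of buying and selling, not on volume, can be made concrete with a small calculation. The sketch below compares two invented sets of customer trades with identical gross volume. It assumes, for illustration only, that market makers take the other side of every customer trade and that each contract has the gamma shown; neither the numbers nor the helper names come from the CBOE report.

```python
# Hypothetical customer trades in 0DTE options:
# (signed_contracts, gamma_per_contract); positive = customer buys.
balanced_flow = [(1000, 0.004), (-950, 0.004), (500, 0.002), (-520, 0.002)]
one_sided_flow = [(1000, 0.004), (950, 0.004), (500, 0.002), (520, 0.002)]


def gross_volume(flow):
    """Total contracts traded, ignoring direction."""
    return sum(abs(q) for q, _ in flow)


def market_maker_net_gamma(flow, multiplier=100):
    """Net gamma held by market makers, assuming they take the other
    side of all customer flow (hence the sign flip)."""
    return -sum(q * g * multiplier for q, g in flow)


for name, flow in (("balanced", balanced_flow), ("one-sided", one_sided_flow)):
    print(name, gross_volume(flow), round(market_maker_net_gamma(flow), 2))
```

Both flows trade the same number of contracts, but the balanced flow leaves market makers with only a small fraction of the net gamma that the one-sided flow does. This is why a put/call ratio near one and two-way flow matter more than headline volume.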

Reference

[2] 0DTEs Decoded: Positioning, Trends, and Market Impact, CBOE, May 2025

Closing Thoughts

The research and insights from both academia and CBOE confirm that zero DTE options do not destabilize markets, despite growing volumes and media attention. The key factor is the balanced dynamic of SPX 0DTE flows between buyers and sellers, which minimizes net gamma exposure and reduces market impact. While retail and institutional investors differ in timing and strategy preferences, their overall usage remains systematic and diversified.

How Machine Learning Enhances Market Volatility Forecasting Accuracy

Machine learning has many applications in finance, such as asset pricing, risk management, portfolio optimization, and fraud detection. In this post, I discuss the use of machine learning in forecasting volatility.

Using Machine Learning to Predict Market Volatility

The unpredictability of the markets is a well-known fact. Despite this, many traders and portfolio managers continue to try to predict market volatility and manage their risks accordingly. Usually, econometric models such as GARCH are used to forecast market volatility.
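For readers unfamiliar with GARCH, here is a minimal sketch of the GARCH(1,1) variance recursion, in which tomorrow's variance is a weighted combination of a constant, today's squared return, and today's variance. The parameter values and returns below are illustrative assumptions, not estimates from data, and `garch_variance_path` is a hypothetical helper. In practice the parameters are estimated by maximum likelihood.

```python
def garch_variance_path(returns, omega, alpha, beta):
    """One-step-ahead conditional variances from the GARCH(1,1) recursion.

    sigma2[t+1] = omega + alpha * returns[t]**2 + beta * sigma2[t]
    The recursion starts at the unconditional variance
    omega / (1 - alpha - beta), which requires alpha + beta < 1.
    """
    sigma2 = [omega / (1 - alpha - beta)]
    for r in returns:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2


# Illustrative daily returns and parameters (not estimated from data).
returns = [0.001, -0.02, 0.015, -0.03, 0.002]
path = garch_variance_path(returns, omega=2e-6, alpha=0.1, beta=0.88)
print([round(v, 8) for v in path])
```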

In recent years, machine learning models have been shown to forecast market volatility accurately. Reference [1] explores how machine learning can be used in this context.

Findings

-Machine learning models can accurately forecast stock return volatility using a small set of key predictors: realized volatility, idiosyncratic volatility, bid-ask spread, and returns.

-These predictors align with existing empirical findings, reinforcing the traditional risk-return trade-off in finance.

-ML methods effectively capture both the magnitude and direction of predictor impacts, along with their interactions, without requiring pre-specified model assumptions.

-Large current-period volatility values strongly predict higher future volatility; small values have a muted or negative impact.

-LSTM models outperform feedforward neural networks and regression trees by leveraging temporal patterns in historical data.

-An LSTM using only volatility and return history over one year performs comparably to more complex models with additional predictors.

-LSTM models function as distribution-free alternatives to traditional econometric models like GARCH.

-Optimal lag length remains critical in LSTM performance and must be selected through model training.

-The study reports an average predicted realized volatility of 44.1%, closely matching the actual value of 43.8%.

-Out-of-sample R² values achieved are significantly higher than those typically reported in related volatility forecasting literature.

In short, the paper aimed to demonstrate the potential of machine learning for modeling market volatility. In particular, the authors have shown how the LSTM model can be used to predict market volatility and manage risks. The results suggest that this is a promising alternative approach to traditional econometric models like GARCH.
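Two quantities appear repeatedly in the findings above: realized volatility and out-of-sample R². The sketch below shows one common way to compute each. These are assumed definitions, since the paper's exact formulas are not reproduced in this post. Realized volatility is taken as the annualized root mean squared daily return, and out-of-sample R² compares the model's squared errors with those of a benchmark forecast. All numbers are invented.

```python
import math


def realized_volatility(daily_returns, trading_days=252):
    """Annualized realized volatility from daily returns (zero-mean form)."""
    mean_sq = sum(r * r for r in daily_returns) / len(daily_returns)
    return math.sqrt(mean_sq * trading_days)


def out_of_sample_r2(actual, predicted, benchmark):
    """1 - SSE(model) / SSE(benchmark); positive means the model wins."""
    sse_model = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sse_bench = sum((a - b) ** 2 for a, b in zip(actual, benchmark))
    return 1 - sse_model / sse_bench


# Invented realized volatilities and two competing forecasts.
actual = [0.20, 0.25, 0.40, 0.30]
model_forecast = [0.22, 0.24, 0.35, 0.33]
benchmark_forecast = [0.28, 0.28, 0.28, 0.28]  # e.g. a historical average

r2 = out_of_sample_r2(actual, model_forecast, benchmark_forecast)
rv = realized_volatility([0.01, -0.01, 0.01, -0.01, 0.01])
print(round(r2, 3), round(rv, 4))
```

An LSTM or any other model would slot in as the source of `model_forecast`. The evaluation step stays the same, which makes it easy to compare machine learning models with a GARCH baseline on equal terms.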

Reference

[1] Filipovic, Damir and Khalilzadeh, Amir, Machine Learning for Predicting Stock Return Volatility (2021). Swiss Finance Institute Research Paper No. 21-95

Machine Learning Models for Predicting Implied Volatility Surfaces

The Implied Volatility Surface (IVS) represents the variation of implied volatility across different strike prices and maturities for options on the same underlying asset. It provides a three-dimensional view where implied volatility is plotted against strike price (moneyness) and time to expiration, capturing market sentiment about expected future volatility.

Reference [2] examines five methods for forecasting the Implied Volatility Surface of short-dated options. These methods are applied to forecast the level, slope, and curvature of the IVS.
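As a rough illustration of what level, slope, and curvature mean, the sketch below extracts them from a single volatility smile using finite differences around the at-the-money point. These definitions are my own assumption for illustration, and the thesis may define the three characteristics differently. The implied volatilities are invented and `smile_characteristics` is a hypothetical helper.

```python
def smile_characteristics(iv_low, iv_atm, iv_high, step=0.05):
    """Finite-difference level, slope, and curvature of one volatility smile.

    iv_low, iv_atm, and iv_high are implied volatilities at moneyness
    1 - step, 1, and 1 + step. These definitions are illustrative.
    """
    level = iv_atm
    slope = (iv_high - iv_low) / (2 * step)
    curvature = (iv_high - 2 * iv_atm + iv_low) / step**2
    return level, slope, curvature


# Invented smile: 22% at 0.95 moneyness, 18% at the money, 17% at 1.05.
level, slope, curvature = smile_characteristics(0.22, 0.18, 0.17)
print(level, round(slope, 4), round(curvature, 4))
```

A negative slope reflects the familiar equity index skew, where downside strikes trade at higher implied volatility. Computing these three numbers each day produces the time series that the forecasting models are then asked to predict.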

Findings

-The study evaluates five methods—OLS, AR(1), Elastic Net, Random Forest, and Neural Network—to forecast the implied volatility surface (IVS) of weekly S&P 500 options.

-Forecasts focus on three IVS characteristics: level, slope, and curvature.

-Random Forest consistently outperforms all other models across these three IVS dimensions.

-Non-learning-based models (OLS, AR(1)) perform comparably to some machine learning methods, highlighting their continued relevance.

-Neural Networks forecast the IVS level reasonably well but perform poorly in predicting slope and curvature.

-Elastic Net, a linear machine learning model, is consistently outperformed by the non-linear models (Random Forest and Neural Network) for the level characteristic.

-The study emphasizes the importance of model selection based on the specific IVS characteristic being forecasted.

-Performance evaluation is supported using the cumulative sum of squared error difference (CSSED) and permutation variable importance (VI) metrics.

-The research highlights the utility of Random Forest in capturing complex, non-linear patterns in IVS dynamics.

-Accurate IVS forecasting is valuable for derivative pricing, hedging, and risk management strategies.

This research highlights the potential of machine learning in forecasting the implied volatility surface, a key element in options pricing and risk management. Among the five methods studied, Random Forest stands out as the most consistent and accurate across multiple IVS features.
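Two of the building blocks mentioned above, the AR(1) model and the CSSED metric, are easy to sketch. The code below fits y[t] = a + b * y[t-1] by ordinary least squares and computes the cumulative sum of squared error differences between a benchmark and a model. It is a generic illustration, not the thesis's code. The series is synthetic, and the sign convention (benchmark error minus model error, so a rising line favors the model) is an assumption.

```python
def fit_ar1(series):
    """OLS estimates (intercept, slope) of y[t] = a + b * y[t-1]."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    slope = cov / var
    return mean_y - slope * mean_x, slope


def cssed(actual, benchmark_forecast, model_forecast):
    """Cumulative sum of squared error differences (benchmark minus model)."""
    total, path = 0.0, []
    for a, b, m in zip(actual, benchmark_forecast, model_forecast):
        total += (a - b) ** 2 - (a - m) ** 2
        path.append(total)
    return path


# Synthetic IVS level series generated by an exact AR(1) process.
series = [0.4]
for _ in range(5):
    series.append(0.1 + 0.5 * series[-1])

intercept, slope = fit_ar1(series)
next_forecast = intercept + slope * series[-1]
print(round(intercept, 4), round(slope, 4), round(next_forecast, 4))
print(cssed([1.0, 2.0], [0.0, 0.0], [1.0, 1.0]))
```

Plotting the CSSED path over time shows not only which model wins overall but also when it gains or loses ground, which a single average error number hides.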

Reference

[2] Tim van de Noort, Forecasting the Characteristics of the Implied Volatility Surface for Weekly Options: How do Machine Learning Methods Perform? Erasmus University, 2024

Closing Thoughts

These studies highlight the growing effectiveness of machine learning in financial forecasting, particularly for market volatility and implied volatility surfaces. Models like LSTM and Random Forest demonstrate clear advantages over traditional methods by capturing complex patterns and dependencies. As financial markets evolve, leveraging such tools offers a promising path for enhancing predictive accuracy and risk management.