Overfitting and Parameter Selection in Trading Strategies

Overfitting is a serious risk in strategy development and can lead to significant losses. We have discussed it in previous posts; in this edition, we revisit the topic, given its continued relevance to quantitative strategy development.

Formal Study of Overfitting in Trading System Design

A serious problem when designing a trading system is the overfitting phenomenon, wherein the system is excessively tuned to historical data. Overfitting occurs when a trading strategy performs exceptionally well on past data but fails to generalize to new, unseen data. This can lead to false positives and inflated expectations, as the system may appear profitable due to chance rather than true predictive power.

Reference [1] formally studied this issue, using analytical approximations for the in-sample and out-of-sample Sharpe ratios of portfolios.

Findings

-The paper analyzes how the in-sample performance of trading strategies based on linear predictive models deteriorates out-of-sample due to overfitting.

-It develops closed-form approximations for both in-sample and out-of-sample Sharpe ratios by modeling the means and variances of strategy PnLs.

-The results show that strategies using a large number of assets and weak signals experience a significant decline in out-of-sample performance.

-In contrast, strategies relying on fewer but stronger signals tend to exhibit more stable and replicable results.

-Increasing the size of the training dataset improves the out-of-sample replication ratio and reduces overfitting risk.

-Signals with low true Sharpe ratios are particularly prone to overfitting, leading to inflated in-sample performance that does not persist.

-Simulation and empirical studies, including applications to commodity futures, confirm the magnitude and robustness of these effects.

-The findings also show that incorporating more realistic signal dynamics does not materially alter the main conclusions.

-The replication ratio is largely determined by the true out-of-sample Sharpe ratio rather than specific model assumptions.

-Overall, the study suggests that controlling model complexity and maximizing data usage are key to mitigating overfitting in predictive trading strategies.
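To make the in-sample versus out-of-sample gap concrete, here is a minimal simulation sketch. It is not the paper's model or its closed-form approximations: it assumes independent assets, one standard-normal signal per asset, a single common true coefficient `true_beta`, and a no-intercept univariate regression per asset. All parameter values are hypothetical.

```python
import random
import statistics


def sharpe(pnl):
    """Per-period Sharpe ratio: mean PnL divided by its standard deviation."""
    return statistics.fmean(pnl) / statistics.stdev(pnl)


def simulate(n_assets, n_obs, true_beta, seed=0):
    """Return (in-sample Sharpe, out-of-sample Sharpe) for a toy linear strategy.

    Each asset has return = true_beta * signal + noise (an assumed data model).
    The coefficient is estimated on the first n_obs periods and then traded,
    unchanged, on the next n_obs periods.
    """
    rng = random.Random(seed)
    pnl_in = [0.0] * n_obs
    pnl_out = [0.0] * n_obs
    for _ in range(n_assets):
        sig = [rng.gauss(0.0, 1.0) for _ in range(2 * n_obs)]
        ret = [true_beta * s + rng.gauss(0.0, 1.0) for s in sig]
        # No-intercept least-squares slope, fitted on the training half only.
        num = sum(s * r for s, r in zip(sig[:n_obs], ret[:n_obs]))
        den = sum(s * s for s in sig[:n_obs])
        beta_hat = num / den
        for t in range(n_obs):
            # Position is beta_hat * signal; PnL is position * realized return.
            pnl_in[t] += beta_hat * sig[t] * ret[t]
            pnl_out[t] += beta_hat * sig[n_obs + t] * ret[n_obs + t]
    return sharpe(pnl_in), sharpe(pnl_out)


# Many assets, a weak signal, and a short training sample (hypothetical values).
is_sr, oos_sr = simulate(n_assets=100, n_obs=250, true_beta=0.02)
print(f"in-sample Sharpe {is_sr:.2f}, out-of-sample Sharpe {oos_sr:.2f}")
```

In this setup the in-sample mean PnL is a sum of squared estimated coefficients, so it is positive by construction even when the true signal is close to zero. Rerunning with a larger `n_obs` or a larger `true_beta` is a quick way to see how the gap responds to more data or stronger signals.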

In summary, the paper formally demonstrated that, to minimize the risk of overfitting, one should:

  1. Keep models as simple as possible,
  2. Use the longest sensible backtest period available,
  3. Develop systems with high Sharpe ratios, and
  4. Rely on fewer signals.

From our experience, we have reservations about points #3 and #4, while agreeing with points #1 and #2. What do you think?

Reference

[1] Antoine Jacquier, Johannes Muhle-Karbe, Joseph Mulligan, In-Sample and Out-of-Sample Sharpe Ratios for Linear Predictive Models, 2025, arXiv:2501.03938

Avoiding Overfitting: Searching for Parameter Plateau

To mitigate the risk of overfitting, system developers often employ techniques such as cross-validation and out-of-sample testing to ensure that their strategies remain robust across various market conditions and time periods.

Another technique to prevent overfitting involves selecting a parameter region, often referred to as a “plateau,” where the trading system maintains stable performance. Reference [2] introduced a method for quantifying this plateau and utilized particle-swarm optimization to search for it.

Findings

-The study highlights that quantitative trading performance depends heavily on parameter selection and is vulnerable to overfitting.

-It introduces the concept of a parameter plateau to identify stable and robust parameter regions rather than single optimal points.

-A plateau score algorithm is developed to replace the conventional approach of selecting the best in-sample parameters.

-The results show that parameters with high plateau scores exhibit more stable and consistent out-of-sample performance.

-The approach helps avoid “parameter islands” that perform well in-sample but fail out-of-sample.

-To improve search efficiency, the study applies particle swarm optimization instead of brute-force methods.

-Particle swarm optimization enables faster exploration of high-dimensional parameter spaces.

-Experiments demonstrate that the combined plateau and optimization approach improves both robustness and profitability.

-The method remains effective as strategy complexity increases from low- to high-dimensional parameter settings.

-The study also proposes suitable hyperparameter ranges for particle swarm optimization in this framework.

In short, the extent of plateau stability is quantified, and an efficient optimization algorithm is utilized to search for it. The out-of-sample test results show promise.
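The difference between a parameter "island" and a plateau can be illustrated with a small sketch. The scoring rule below (neighbourhood mean minus a dispersion penalty) is an assumption made for illustration and is not the plateau score from [2]; the grid of Sharpe ratios is made up, and the search is a plain exhaustive scan rather than particle swarm optimization.

```python
import statistics


def plateau_score(grid, i, j, radius=1, penalty=1.0):
    """Neighbourhood mean around (i, j) minus a penalty on its dispersion.

    Illustrative scoring rule only (an assumption, not the formula in [2]).
    """
    rows, cols = len(grid), len(grid[0])
    neigh = [
        grid[a][b]
        for a in range(max(0, i - radius), min(rows, i + radius + 1))
        for b in range(max(0, j - radius), min(cols, j + radius + 1))
    ]
    return statistics.fmean(neigh) - penalty * statistics.pstdev(neigh)


def best_cell(score_fn, grid):
    """Return the (row, col) index that maximises score_fn over the grid."""
    cells = [(i, j) for i in range(len(grid)) for j in range(len(grid[0]))]
    return max(cells, key=lambda c: score_fn(grid, c[0], c[1]))


# Hypothetical in-sample Sharpe ratios on a 6x6 grid of two strategy parameters.
# Cell (0, 5) is an isolated island; the block centred on (3, 2) is a plateau.
sharpe_grid = [
    [0.1, 0.2, 0.1, 0.2, 0.1, 2.5],
    [0.2, 0.1, 0.2, 0.1, 0.2, 0.1],
    [0.1, 1.2, 1.2, 1.2, 0.1, 0.2],
    [0.2, 1.2, 1.3, 1.2, 0.2, 0.1],
    [0.1, 1.2, 1.2, 1.2, 0.1, 0.2],
    [0.2, 0.1, 0.2, 0.1, 0.2, 0.1],
]

peak = best_cell(lambda g, i, j: g[i][j], sharpe_grid)  # best single point
plateau = best_cell(plateau_score, sharpe_grid)         # best stable region
print("best in-sample point:", peak, "best plateau centre:", plateau)
```

Picking the best single point selects the island, while the neighbourhood-based score selects the centre of the stable block. On a realistic, higher-dimensional grid the exhaustive scan becomes expensive, which is the motivation the post gives for a swarm-based search.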

Reference

[2] Jimmy Ming-Tai Wu, Wen-Yu Lin, Ko-Wei Huang, Mu-En Wu, On the design of searching algorithm for parameter plateau in quantitative trading strategies using particle swarm optimization, Knowledge-Based Systems, Volume 293, 7 June 2024, 111630

Closing Thoughts

Taken together, these studies highlight that both model design and parameter selection are key sources of fragility in quantitative strategies. Overfitting arises not only from using too many weak signals but also from selecting unstable parameter configurations that fail to generalize out-of-sample. Approaches such as reducing model complexity, increasing data, and focusing on stable parameter regions through the concept of parameter plateaus offer practical ways to improve robustness. Overall, the evidence suggests that consistent performance depends less on optimizing in-sample results and more on ensuring stability across regimes and datasets.

Volatility Risk Premium and Clustering: Intraday vs Overnight Dynamics

The decomposition of risks and returns into overnight and intraday components is an emerging area of research. In this post, we examine how these components differ in terms of volatility clustering and the variance risk premium, and what this implies for forecasting, risk management, and strategy design.

Breaking Down the Volatility Risk Premium: Overnight vs. Intraday Returns

A growing body of work decomposes the volatility risk premium (VRP) into overnight and intraday components. Most studies indicate that the VRP mainly compensates investors for bearing overnight risk.

Reference [1] continues this line of research, with its main contribution being the decomposition of the variance risk premium into overnight and intraday components using a variance swap approach. The study also tests the predictive ability of these components and examines the seasonality (day-of-week effects) of the VRP.

Findings

-The paper decomposes the variance risk premium into overnight and intraday components across the US, Europe, and Asia.

-It finds that the variance risk premium is significantly negative during the overnight non-trading period.

-During the intraday trading period, the variance risk premium becomes positive and often insignificant.

-The results show that the overall negative variance risk premium documented in prior studies is largely driven by the overnight component.

-The study uses the P&L of a hypothetical variance swap to analyze these components.

-The intraday variance risk premium captures short-term risk and has predictive power over 1 to 3-month horizons.

-The overnight variance risk premium reflects longer-term risk and shows predictive ability over 6 to 12-month horizons.

-The findings highlight the importance of non-trading periods in explaining the behavior of the variance risk premium.
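As a rough sketch of the mechanics, realized variance can be split into an overnight part (previous close to open) and an intraday part (open to close), and a long variance swap earns realized minus strike variance. The function names, prices, and strike below are hypothetical, and the sketch does not reproduce the paper's method for splitting the implied (strike) leg between the two periods.

```python
import math


def split_realized_variance(opens, closes):
    """Split realized variance into overnight and intraday parts.

    opens[t] and closes[t] are the open and close prices of day t.
    Overnight return: log(open_t / close_{t-1}).
    Intraday return:  log(close_t / open_t).
    Returns (overnight_var, intraday_var) as sums of squared log returns.
    """
    days = range(1, len(opens))
    overnight = [math.log(opens[t] / closes[t - 1]) for t in days]
    intraday = [math.log(closes[t] / opens[t]) for t in days]
    return sum(r * r for r in overnight), sum(r * r for r in intraday)


def long_variance_swap_pnl(realized_var, strike_var, notional=1.0):
    """P&L of a long variance swap: notional * (realized - strike)."""
    return notional * (realized_var - strike_var)


# Hypothetical open and close prices for four trading days.
opens = [100.0, 101.5, 100.2, 102.0]
closes = [101.0, 100.5, 101.8, 101.2]
ov_var, id_var = split_realized_variance(opens, closes)
print(f"overnight variance {ov_var:.6f}, intraday variance {id_var:.6f}")
```

Under the sign convention assumed here (premium = realized minus implied variance), a negative premium means a long variance swap loses money on average over that period.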

In summary, the study reaffirms that the variance risk premium is significantly negative during the non-trading overnight period, while it becomes positive and often insignificant during the intraday trading period.

An interesting finding is the day-of-week seasonality. For instance, going long volatility at the open and closing the position at the close tends to be profitable on most days, except Fridays.

Reference

[1] Papagelis, Lucas and Dotsis, George, The Variance Risk Premium Over Trading and Non-Trading Periods (2024). SSRN 4954623

Intraday and Overnight Volatility Clustering Effect

Volatility clustering is the tendency for large price changes to be followed by large changes and small changes by small ones, so that high-volatility and low-volatility periods each come in runs. Extreme price movements are therefore not randomly distributed over time but occur in clusters.

Volatility clustering has been studied extensively at the daily frequency. Reference [2] examines it in intraday and overnight returns, investigating clustering both within each component and between the two.

Findings

-The paper studies volatility clustering in intraday and overnight returns across 15 global equity markets.

-It finds that volatility clustering is present in both intraday and overnight returns across multiple time scales, from daily to long-term horizons.

-The results show that volatility clustering is generally stronger in overnight returns than in intraday returns.

-Cross clustering between intraday and overnight volatility is relatively weak within each market, especially at shorter time scales.

-The findings are consistent across both developed and emerging markets, indicating a universal pattern.

-The study highlights the importance of considering both short-term and long-term risks in equity markets.

-The results suggest that volatility dynamics differ between trading and non-trading periods.

-The paper provides implications for trading and risk management strategies based on volatility clustering behavior.

In short, the paper shows that volatility clustering is a universal feature of both intraday and overnight returns across multiple time scales. It also finds that clustering is stronger overnight, while cross-effects between intraday and overnight volatility remain weak, with consistent patterns across global markets.
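A common first check for clustering is the autocorrelation of absolute returns, and a lagged cross-correlation gives a simple view of cross clustering between the two components. The sketch below uses these basic statistics as a stand-in; it is not the multiscale method used in [2], and the return series are made up.

```python
import math


def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def crosscorr(x, y, lag):
    """Sample correlation between x[t] and y[t + lag] (equal-length series)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((x[t] - mx) * (y[t + lag] - my) for t in range(n - lag))
    sx = sum((v - mx) ** 2 for v in x)
    sy = sum((v - my) ** 2 for v in y)
    return cov / math.sqrt(sx * sy)


# Hypothetical daily returns: a calm stretch followed by a turbulent one.
intraday = [0.002, -0.001, 0.003, -0.002, 0.015, -0.020, 0.018, -0.012]
overnight = [0.001, 0.000, -0.001, 0.002, -0.001, 0.001, 0.000, -0.002]

abs_intraday = [abs(r) for r in intraday]
abs_overnight = [abs(r) for r in overnight]
print("intraday clustering (lag 1):", round(autocorr(abs_intraday, 1), 3))
print("cross clustering (lag 1):", round(crosscorr(abs_intraday, abs_overnight, 1), 3))
```

A clearly positive autocorrelation of absolute returns indicates clustering within a component; a cross-correlation near zero indicates weak cross clustering.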

Reference

[2] Xiaojun Zhao, Na Zhang, Yali Zhang, Chao Xu, Pengjian Shang, Equity markets volatility clustering: A multiscale analysis of intraday and overnight returns, Journal of Empirical Finance 77 (2024) 101487

Closing Thoughts

Taken together, these studies show that volatility dynamics differ significantly between intraday and overnight periods, both in terms of risk pricing and clustering behavior. The variance risk premium is largely driven by the overnight component, while intraday and overnight volatility exhibit distinct clustering patterns with limited interaction. These findings highlight the importance of separating trading and non-trading periods in both forecasting and risk management, as each captures different horizons and sources of risk, offering more refined inputs for portfolio construction and strategy design.

Large Language Models in Trading: Models and Market Dynamics

I just returned from a two-day conference in New York, FutureAlpha (formerly QuantStrats). This year, the theme focused largely on data, machine learning, and AI. While some speakers were very enthusiastic about the potential of AI to generate alpha, our panel was more conservative. The consensus among the panelists was to use ML and AI to enhance and improve risk management. Along this theme, in this post, I discuss the use of generative AI in trading.

Integrating Structured and Unstructured Data with LLMs and RAG

Traditional quantitative methods often rely on structured data, such as time series. With the emergence of Large Language Models (LLMs), it is now possible to process unstructured data. A new line of research focuses on integrating unstructured data analysis into traditional frameworks.

Along this line, Reference [1] proposed the use of LLMs together with retrieval-augmented generation (RAG) to process both structured and unstructured data concurrently. Specifically, the authors developed a system that first applies LLMs to detect regime shifts using time-series techniques, then employs RAG to integrate external knowledge into the model’s decision-making process. By retrieving relevant information from a vector database and combining it with the model’s capabilities, RAG improves both the interpretability and effectiveness of trading strategies.

Findings

-The paper studies methods for fine-tuning open-source Large Language Models to enhance quantitative trading strategies.

-It integrates numerical data, such as prices and technical indicators, with textual data, including news and sentiment.

-The approach uses Retrieval-Augmented Generation with a vector database to process and contextualize textual information.

-The study focuses on fully fine-tuning smaller models to achieve cost efficiency and scalability.

-It proposes a hybrid framework that combines LLM capabilities with traditional quantitative methods.

-The framework incorporates real-time data pipelines and adaptive model tuning.

-The results show improvements in predictive accuracy and risk-adjusted returns.

-The integration of multimodal data helps address challenges in combining structured and unstructured information.

-Fine-tuned smaller models improve regime detection and trading decision accuracy while maintaining efficiency.

-Additional techniques enhance model performance and robustness, supporting practical applications in quantitative finance.

In short, incorporating RAG into the framework enhances the model’s ability to understand complex macroeconomic environments and adapt trading strategies as conditions evolve. Experimental results show significant gains in predictive accuracy and risk-adjusted returns, demonstrating the practical value of these fine-tuning methods in finance.
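The retrieval step can be sketched in a few lines. This toy version is an assumption-heavy stand-in: it uses word counts instead of learned embeddings, an in-memory list instead of a vector database, and a hypothetical `build_prompt` helper; no LLM is called and nothing here reflects the authors' actual implementation.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding': lower-cased word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(regime, query, documents, k=2):
    """Combine a detected regime label with retrieved text into one prompt."""
    context = "\n".join("- " + d for d in retrieve(query, documents, k))
    return f"Detected regime: {regime}\nContext:\n{context}\nQuestion: {query}"


# Hypothetical news snippets standing in for a vector database.
news = [
    "central bank raises interest rates",
    "oil prices fall on weak demand",
    "tech earnings beat expectations",
]
print(build_prompt("high volatility", "interest rates outlook", news, k=1))
```

The design point is the separation of concerns: the structured side supplies a regime label, the retrieval side supplies relevant text, and only the combined prompt is handed to the language model.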

Reference

[1] Li, C., Chan, C.H.R., Huang, S.H., Choi, P.M.S. (2025). Integrating LLM-Based Time Series and Regime Detection with RAG for Adaptive Trading Strategies and Portfolio Management. In: Choi, P.M.S., Huang, S.H. (eds) Finance and Large Language Models. Blockchain Technologies. Springer, Singapore.

Can AI Trade? Modeling Investors with Large Language Models

The previous paper focuses on improving trading performance by integrating LLMs with quantitative models and data, while another line of research explores how LLMs behave as autonomous agents within market environments.

Reference [2] utilized LLMs to construct trading agents in the financial markets. Specifically, the author used LLMs to emulate various types of investors: value investors, momentum traders, market makers, retail traders, etc.

Findings

-The paper develops a simulated stock market in which large language models act as heterogeneous trading agents.

-The framework includes realistic market features such as an order book, market and limit orders, partial fills, dividends, and equilibrium clearing.

-Agents operate with different strategies, information sets, and endowments, and communicate decisions using structured outputs while explaining reasoning in natural language.

-The results show that LLMs can consistently follow instructions and implement strategies such as value investing, momentum trading, and market making.

-LLM agents process market information and respond meaningfully to prices, dividends, and historical data.

-The simulated market exhibits realistic dynamics, including price discovery, bubbles, underreaction, and liquidity provision.

-The framework enables controlled analysis of agent behavior under different market conditions, similar to interpretability methods in machine learning.

-It provides a cost-effective way to test financial theories that lack closed-form solutions.

-The study highlights that LLM behavior is highly sensitive to prompts, which can lead to correlated actions across agents.

-This correlation may amplify volatility and introduce systemic risks, emphasizing the need for careful testing before real-world deployment.

In short, the article concludes that LLM-based agents can implement trading strategies effectively, but that they could introduce new systemic risks to financial markets because they may act in a correlated manner.

Reference

[2] Alejandro Lopez-Lira, Can Large Language Models Trade? Testing Financial Theories with LLM Agents in Market Simulations, arXiv:2504.10789

Closing Thoughts

In this issue, the discussion highlights two complementary directions in applying LLMs to finance. On one hand, integrating LLMs with quantitative models and multimodal data can improve predictive accuracy and risk-adjusted returns. On the other hand, treating LLMs as autonomous trading agents reveals how their behavior can shape market dynamics, including liquidity, price discovery, and potential instability. Taken together, the results suggest that while LLMs offer meaningful opportunities in trading and risk management, their impact depends critically on implementation, prompting, and control of system-wide behavior.

Evaluating Option-Based Strategies and Dollar-Cost Averaging

In past issues, we discussed popular investment strategies such as covered calls and collars. In this post, we continue by examining other strategies, focusing on their performance, limitations, and how they behave under different market conditions.

Reexamining the Performance of Passive Options Strategies

More than 40 years ago, Merton et al. published two papers [1,2] examining the performance of passive options strategies. They concluded that these strategies outperformed the traditional buy-and-hold approach. At the time, options data was not widely available, so they simulated the impact of options on two portfolios: a broad market proxy of 136 equities and the Dow Jones 30 index. The backtest covered a twelve-year period and priced the options with the Black–Scholes–Merton model, using historical volatility as the volatility input.
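Pricing options from historical volatility, as in those early studies, can be sketched as follows. This is the textbook Black-Scholes formula for European options on a non-dividend-paying asset; the annualization factor and all inputs are assumptions, and the sketch does not reproduce the original papers' data handling or any dividend treatment.

```python
import math
import statistics


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_price(spot, strike, rate, vol, t, kind="call"):
    """Black-Scholes price of a European call or put (no dividends)."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * t) / (vol * math.sqrt(t))
    d2 = d1 - vol * math.sqrt(t)
    if kind == "call":
        return spot * norm_cdf(d1) - strike * math.exp(-rate * t) * norm_cdf(d2)
    return strike * math.exp(-rate * t) * norm_cdf(-d2) - spot * norm_cdf(-d1)


def historical_vol(prices, periods_per_year=252):
    """Annualized standard deviation of log returns."""
    rets = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return statistics.stdev(rets) * math.sqrt(periods_per_year)


# Hypothetical daily closes used to estimate the volatility input.
closes = [100.0, 101.2, 100.4, 102.1, 101.5, 103.0, 102.2]
vol = historical_vol(closes)
call = bs_price(spot=closes[-1], strike=105.0, rate=0.03, vol=vol, t=0.25)
print(f"historical vol {vol:.2%}, 3-month 105-strike call {call:.2f}")
```

Because historical volatility usually differs from the implied volatility at which options actually trade, prices simulated this way can misstate the premium a strategy would really have collected or paid, which is one reason to revisit the results with actual options data.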

Since then, the options market has become highly liquid, with significant structural changes. A recent article [3] reexamines the strategies studied by Merton et al., along with additional strategies, using actual options data from the period 2012 to 2023. The strategies studied include Call-Write strategies (with seven variants), Put-Write strategies (with two variants), and the Protective Put (PPUT) strategy.

Findings

-Early studies showed that passive option strategies could outperform the underlying index on a risk-return basis.

-The options market has evolved significantly, from open outcry and single listings to a high-frequency, electronic environment.

-The findings suggest that the original strategies no longer provide favorable risk-adjusted returns and that earlier results may have been driven by simplifying assumptions.

-Recent evidence indicates that simple option strategies generally do not add value to portfolios.

-However, certain dynamic option strategies can still outperform the S&P 500 on a risk–return basis.

-Incorporating simple market regime signals can improve the performance of these strategies.

-The PPUT strategy consistently outperforms the S&P 500 on a risk-adjusted basis.

-A modified PPUT strategy, which avoids puts after a one-standard-deviation drawdown, delivers higher returns with lower risk.

-The outperformance may be driven by the widespread use of covered call strategies, which suppress implied volatility and underprice tail risk.

In short, the simple options strategies generally failed to outperform the S&P 500. The notable exception is the PPUT strategy, which outperforms the buy-and-hold approach on a risk-adjusted basis, and the VIX is shown to be an effective regime filter.

Reference

[1] Merton, Robert C., Myron S. Scholes, and Mathew L. Gladstein. 1978. The Returns and Risk of Alternative Call Option Portfolio Investment Strategies. Journal of Business 51: 183–242.

[2] Merton, Robert C., Myron S. Scholes, and Mathew L. Gladstein. 1982. The Returns and Risks of Alternative Put-Option Portfolio Investment Strategies. Journal of Business 55: 1–55.

[3] Andrew Kumiega, Greg Sterijevski, and Eric Wills, Black–Scholes 50 Years Later: Has the Outperformance of Passive Option Strategies Finally Faded?, International Journal of Financial Studies 12: 114.

The Effectiveness of Dollar Cost Averaging Under Varying Market Conditions

Dollar-cost averaging (DCA) is an investment strategy where you invest a fixed amount of money at regular intervals, regardless of market conditions. By consistently buying over time, you smooth out entry prices, reduce the impact of short-term volatility, and avoid the risk of mistiming the market with a single large purchase.

DCA has often been presented as an effective portfolio management technique, and financial advisors and brokers encourage clients to adopt it. But is it truly effective, or merely a marketing scheme?

Reference [4] critically examined this question. The study employed Monte Carlo simulation, rather than historical backtesting, to explore the issue. Specifically, the authors utilized Geometric Brownian Motion (GBM) to simulate stock prices under various market conditions.

Findings

-DCA involves investing a fixed amount at regular intervals and is commonly used for risk mitigation.

-The analysis uses Monte Carlo simulations based on geometric Brownian motion to generate price paths.

-The study compares DCA with a Buy-and-Hold (B&H) strategy across varying levels of market drift and volatility.

-The results show that DCA underperforms B&H in steadily rising and stable markets.

-DCA provides better risk-adjusted performance in highly volatile market environments.

-Market volatility and transaction frequency are key drivers of DCA performance.

-Lower transaction frequency improves the effectiveness of the DCA strategy.

-The study highlights the importance of adjusting DCA parameters based on market conditions.
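A Monte Carlo comparison of this kind can be sketched with a few functions. The setup below is an assumption, not the paper's design: a fixed budget is either invested in full at the first price (lump-sum buy-and-hold) or split equally across all but the last price (DCA), uninvested cash earns nothing, and there are no transaction costs. Drift, volatility, and horizon are hypothetical.

```python
import math
import random
import statistics


def gbm_path(s0, mu, sigma, n_steps, dt, rng):
    """Simulate one geometric Brownian motion path with n_steps + 1 prices."""
    path = [s0]
    drift = (mu - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    for _ in range(n_steps):
        path.append(path[-1] * math.exp(drift + vol * rng.gauss(0.0, 1.0)))
    return path


def terminal_wealth(path, budget=1.0):
    """Return (lump-sum wealth, DCA wealth) valued at the last price."""
    n = len(path) - 1
    lump = budget * path[-1] / path[0]
    # DCA buys budget / n worth of units at each price except the last.
    units = sum((budget / n) / price for price in path[:-1])
    return lump, units * path[-1]


def compare(mu, sigma, n_paths=2000, n_steps=12, horizon_years=1.0, seed=0):
    """Mean and standard deviation of terminal wealth for both strategies."""
    rng = random.Random(seed)
    dt = horizon_years / n_steps
    results = [
        terminal_wealth(gbm_path(100.0, mu, sigma, n_steps, dt, rng))
        for _ in range(n_paths)
    ]
    lump = [r[0] for r in results]
    dca = [r[1] for r in results]
    return {
        "lump_mean": statistics.fmean(lump),
        "lump_stdev": statistics.stdev(lump),
        "dca_mean": statistics.fmean(dca),
        "dca_stdev": statistics.stdev(dca),
    }


for sigma in (0.10, 0.40):
    print(sigma, compare(mu=0.08, sigma=sigma))
```

Varying `mu`, `sigma`, and `n_steps` (the purchase frequency) reproduces the kind of experiment described above; how the two strategies rank on a risk-adjusted basis depends on the chosen risk measure, which this sketch leaves open.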

In brief, DCA is effective when volatility is high; otherwise, it underperforms buy-and-hold. Further, the paper leads to interesting questions about the validity of position-sizing techniques such as scaling in and out.

Reference

[4] Siyuan Sang, Ru Bai, Haibo Li, The Dynamic Relationship Between Market Volatility and Dollar Cost Averaging Strategy Returns: An Empirical Investigation, in Proceedings of the 2025 3rd International Academic Conference on Management Innovation and Economic Development (MIED 2025)

Closing Thoughts

In this issue, two strands of research on investment strategies are discussed. The first revisits option-based strategies and shows that simple passive approaches no longer deliver attractive risk-adjusted returns, although more dynamic strategies, especially those incorporating regime signals, can still add value. The second examines Dollar-Cost Averaging and shows that its effectiveness is highly dependent on market conditions, underperforming in stable markets but offering advantages in volatile environments. Taken together, the results suggest that simple, static strategies are no longer sufficient, and that performance increasingly depends on adapting to market regimes and implementation details.

Machine Learning for Derivative Pricing and Crash Prediction

Applications of machine learning in finance continue to evolve rapidly. In previous posts, we discussed both the uses and the challenges of applying machine learning in financial markets. In this installment, we continue that discussion by highlighting new research on machine learning approaches for pricing complex derivatives and identifying signals that may precede major market downturns.

Speeding Up Derivatives Pricing Using Machine Learning

A financial derivative is a contract whose value depends on the price of an underlying asset such as a stock, bond, commodity, or index. Accurate valuation of financial derivatives and their associated sensitivity factors is important for both investment and hedging purposes. However, many complex derivatives exhibit path-dependency and early-exercise features, which means that closed-form solutions rarely exist, and numerical methods must be used.

The issue with numerical methods is that they are often slow. As a result, efforts are being made to improve the efficiency of numerical techniques for valuing financial derivatives. Reference [1] proposed a fast valuation method based on machine learning. It developed a hybrid two-stage valuation framework that applies a machine learning algorithm to highly accurate derivative valuations incorporating full volatility surfaces. The volatility surface is parameterized, and a Gaussian Process Regressor (GPR) is trained to learn the nonlinear mapping from the complete set of pricing inputs directly to the valuation outputs. Once trained, the GPR delivers near-instantaneous valuation results.

Findings

-The study develops a machine learning framework for pricing derivative products whose valuation depends on volatility surfaces.

-Volatility surfaces are parameterized using the five-parameter SVI model with a one-factor term structure adjustment to generate realistic synthetic market scenarios.

-High-accuracy valuations for variance swaps and American put options are computed using conventional numerical methods and used to create training and testing datasets.

-A Gaussian Process Regressor is trained to learn the nonlinear relationship between input risk factors, such as volatility surface parameters, strike, and interest rate, and valuation outputs including prices and Greeks.

-The trained model achieves high accuracy, with approximately 0.5% relative error for variance swap fair strikes and 1.7–3.5% relative error for American put prices and first-order Greeks.

-The model is less accurate for the Gamma Greek due to discontinuities in the strike dimension.

-After training, the machine learning model produces valuations almost instantly, achieving a speed improvement of three to four orders of magnitude compared with traditional numerical methods.

-The results demonstrate that machine learning can enable real-time risk analytics, dynamic hedging, and large-scale scenario analysis for derivatives.

-The framework is general and can be extended to other path-dependent derivatives with early exercise features.
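For readers unfamiliar with SVI, the sketch below evaluates the standard five-parameter "raw" SVI formula for total implied variance at a single maturity, w(k) = a + b(ρ(k − m) + sqrt((k − m)² + σ²)), where k is log-moneyness. The parameter values are hypothetical, and the paper's term-structure adjustment and its Gaussian Process Regressor are not reproduced here.

```python
import math


def svi_total_variance(k, a, b, rho, m, sigma):
    """Raw SVI total implied variance at log-moneyness k.

    a: overall level, b: wing slope, rho: skew (-1 < rho < 1),
    m: horizontal shift, sigma: smoothness of the smile's vertex.
    """
    return a + b * (rho * (k - m) + math.sqrt((k - m) ** 2 + sigma ** 2))


def svi_implied_vol(k, t, a, b, rho, m, sigma):
    """Implied volatility for maturity t (years) from SVI total variance."""
    return math.sqrt(svi_total_variance(k, a, b, rho, m, sigma) / t)


# Hypothetical parameters for a one-year equity-like smile with negative skew.
params = dict(a=0.04, b=0.10, rho=-0.5, m=0.0, sigma=0.2)
for k in (-0.5, -0.25, 0.0, 0.25, 0.5):
    print(f"k={k:+.2f}  implied vol={svi_implied_vol(k, 1.0, **params):.4f}")
```

Parameterizing the surface this way compresses each smile into a handful of numbers, which is what makes it practical to feed a whole volatility surface into a regressor as input features.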

In summary, the authors developed an efficient method for pricing complex financial derivatives with machine learning. However, the GPR is noticeably less accurate for higher-order Greeks. Additionally, the study was conducted on synthetic data, so it would be useful to see the method applied to real market data.

Reference

[1] Lijie Ding, Egang Lu, Kin Cheung, Fast Derivative Valuation from Volatility Surfaces using Machine Learning, arXiv:2505.22957

Forecasting Market Crashes with Machine Learning Techniques

Reference [2] examines how machine learning can be used to predict market crashes within the Adaptive Market Hypothesis framework.

The study considers three categories of factors:

  1. Internal factors, such as technical indicators designed to capture endogenous market dynamics, including momentum, trend strength, and money flow arising from investor behavior and adaptive learning;
  2. External factors, including macroeconomic and commodity variables that proxy for systematic, exogenous risks affecting fundamental valuations; and
  3. Volatility features that quantify market fear and uncertainty.

The author evaluates the performance of three predictive models—logistic regression, random forest, and a long short-term memory (LSTM) network.

Findings

-While the Efficient Market Hypothesis suggests crashes cannot be predicted, the Adaptive Market Hypothesis allows for temporary periods of predictability as market conditions evolve.

-The analysis compares a traditional econometric model, Logistic Regression, with machine learning approaches, including Random Forest and LSTM.

-The models use a feature set combining technical, macroeconomic, and volatility-based indicators.

-Model performance is evaluated using metrics designed for imbalanced classification problems, where crash events are rare but economically significant.

-Empirical results show that the LSTM provides the best balance between precision and recall, although Logistic Regression remains competitive.

-The findings highlight that simpler models can still perform effectively, supporting the value of model parsimony in turbulent market environments.

-The results also support the Adaptive Market Hypothesis by showing that market predictability evolves over time and depends on changing conditions.

-Logistic Regression performs well as an early-warning system due to its high recall, although it generates many false positives.

-The LSTM model improves precision while maintaining strong recall, suggesting that capturing temporal patterns in financial data enhances predictive performance.

-Overall, the study concludes that market crashes are not entirely random, but their prediction depends on the appropriate balance between model complexity and practical application.
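The trade-off between recall and false positives is easiest to see with the metrics written out. The sketch below computes precision, recall, and F1 for a binary crash label; the labels and predictions are invented for illustration and are not results from the study.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = crash, 0 = no crash)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical sample: 20 periods, 2 of which are crashes.
y_true = [0] * 18 + [1] * 2
never_alarm = [0] * 20                     # 90% accurate, but catches nothing
eager_alarm = [0] * 14 + [1] * 4 + [1] * 2  # catches both crashes, 4 false alarms

print("never alarm:", precision_recall_f1(y_true, never_alarm))
print("eager alarm:", precision_recall_f1(y_true, eager_alarm))
```

A model that never signals a crash is right most of the time yet useless as a warning system, which is why accuracy is a poor metric when crashes are rare and why precision and recall are reported instead.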

In short, the study concludes that market crashes are difficult to forecast but not entirely random, and different models capture different aspects of predictability. Logistic regression functions well as a high-recall early warning tool, while LSTM models provide more balanced signals.

Reference

[2] Michele Della Mura, Predicting Stock Market Crashes, A Comparative Analysis of Econometric and Machine Learning Models, Politecnico di Torino, 2025

Closing Thoughts

Taken together, these studies illustrate the expanding role of machine learning in modern quantitative finance. One line of research demonstrates how machine learning models can dramatically accelerate the pricing of complex derivatives while maintaining high accuracy, enabling real-time risk management and large-scale scenario analysis. Another line of work explores the ability of both traditional econometric methods and advanced machine learning models to identify signals that may precede market crashes. Collectively, these findings show that machine learning is reshaping financial modeling, though simpler approaches can still play a meaningful role.

Do Options Exhibit Momentum?

Momentum has been studied extensively across equities, commodities, and other asset classes, with well-documented evidence of cross-sectional and time-series continuation effects. More recently, an emerging line of research has shifted attention to momentum in option returns, examining whether derivative markets exhibit their own systematic return patterns.

In this post, we review the latest evidence on option return momentum across both monthly and intraday horizons and assess the economic mechanisms that may explain these persistent dynamics.

Momentum in the Option Market

In financial markets, momentum is the tendency of assets to keep moving in the same direction. It reflects the underlying strength or weakness of an asset’s price action and can be used to identify trends. Momentum is one of the most pervasive market phenomena and has been observed in nearly all stock markets around the world.

Does this anomaly exist in other asset classes?

Reference [1] studied momentum in the options market. It examined the returns of delta-neutral straddles on individual equities.

Findings

-The study finds strong evidence of momentum in option returns, as options that performed well over the previous 6 to 36 months tend to generate high returns in the subsequent month.

-Momentum is observed under both cross-sectional and time-series definitions of past performance.

-The strategy is profitable across all five-year subsamples and carries significantly lower risk than short straddle positions on the S&P 500 Index or individual stocks.

-There is no evidence of momentum crashes in option returns, although the sample length may limit detection of such events.

-The authors find limited evidence of short-term cross-sectional reversal, where options that outperform in one month may underperform in the following month.

-There is no evidence of long-run reversal in option returns, in contrast to equities, and momentum persists even at 2- to 3-year horizons.

-Option momentum differs from stock momentum because the results are based on delta-hedged positions and remain robust after controlling for stock momentum effects.

-Momentum profits are unaffected by controls for implied versus historical volatility and other option characteristics, and remain significant after factor risk adjustments.

-The study shows that high historical-return options significantly outperform low-return options across multiple horizons, including when using out-of-the-money options or delta-hedged returns.

In short, options exhibit momentum, as equities do. Option returns show limited evidence of reversal at the one-month horizon and continuation at longer horizons, with no long-run reversal.
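To make the cross-sectional version of the signal concrete, the sketch below ranks a handful of hypothetical straddles by their average return over a formation window, then selects the top group as winners and the bottom group as losers. The function name, the equal weighting, the 20% cutoff, and the toy data are our own assumptions for illustration; they are not the paper's exact procedure.

```python
from statistics import mean

def momentum_portfolio(past_returns, fraction=0.2):
    """Rank assets by average formation-period return.

    past_returns: dict mapping a ticker to a list of past monthly
    straddle returns (the formation window).
    Returns (winners, losers), each a list of tickers.
    """
    scores = {name: mean(rets) for name, rets in past_returns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, int(len(ranked) * fraction))  # size of each leg
    return ranked[:k], ranked[-k:]

# Toy data: five hypothetical straddles with three months of returns each.
history = {
    "AAA": [0.04, 0.02, 0.03],
    "BBB": [-0.05, -0.01, -0.03],
    "CCC": [0.00, 0.01, -0.01],
    "DDD": [0.01, 0.02, 0.00],
    "EEE": [-0.02, 0.00, -0.01],
}
winners, losers = momentum_portfolio(history)
print("long:", winners, "short:", losers)
```

A time-series variant would instead compare each straddle's past return with zero (or with its own history) rather than with the other straddles.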

Reference

[1] Heston, Steven L. and Jones, Christopher S. and Khorram, Mehdi and Li, Shuaiqi and Mo, Haitao, Option Momentum (2022), SSRN 4113680

Momentum in the Option Market, Intraday Case

While the previous article examines momentum in option returns across monthly horizons, Reference [2] extends this line of research by focusing on intraday option return dynamics.

Findings

-The paper documents novel seasonal patterns in intraday returns of individual stock option straddles.

-Despite being delta-neutral, straddle returns exhibit the same persistent intraday seasonality as their underlying stocks.

-Returns in a given half-hour interval predict returns in the same interval on the following trading day.

-This continuation effect is strongest at the market open and close, referred to as morning and afternoon momentum.

-Morning momentum is attributed to investors’ underreaction to volatility shocks.

-Afternoon momentum is driven by persistent inventory management behavior by option market makers.

In summary, the paper shows that a straddle’s return during a particular 30-minute trading interval today positively predicts its return during the same interval on subsequent days. Morning momentum reflects a continued underreaction to overnight volatility news. Afternoon momentum, on the other hand, is attributed to persistent price pressure caused by inventory management by option market makers.
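One simple way to check for this kind of same-interval continuation is to regress the return in a given half-hour interval on the return in the same interval one day earlier. The sketch below does this with a univariate least-squares slope on synthetic data. The function names, the 13-interval trading day, and the persistence injected into the first interval are assumptions for illustration, not the paper's specification.

```python
import random

def ols_slope(x, y):
    """Slope of a univariate least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def same_interval_slope(returns_by_day, interval):
    """Regress the return in `interval` on day t on the return in the
    same interval on day t-1.

    returns_by_day: list of days, each a list of half-hour returns.
    """
    series = [day[interval] for day in returns_by_day]
    return ols_slope(series[:-1], series[1:])

# Synthetic data: 13 half-hour intervals per day. Only the first interval
# (index 0) is given day-to-day persistence, purely for illustration.
random.seed(0)
days = []
prev_open = 0.0
for _ in range(2000):
    day = [random.gauss(0, 0.01) for _ in range(13)]
    day[0] = 0.5 * prev_open + random.gauss(0, 0.01)
    prev_open = day[0]
    days.append(day)

print("open interval slope:", round(same_interval_slope(days, 0), 2))
print("midday interval slope:", round(same_interval_slope(days, 6), 2))
```

On real data, a positive slope concentrated in the first and last intervals of the day would be consistent with the morning and afternoon momentum described above.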

Reference

[2] Da, Zhi and Goyenko, Ruslan and Zhang, Chengyu, Intraday Option Return: A Tale of Two Momentum (2024), SSRN 5018430

Closing Thoughts

Taken together, these studies show that option markets exhibit systematic return patterns across both monthly and intraday horizons. Momentum persists over 6 to 36 months without the long-run reversals observed in equities, while intraday straddle returns display predictable continuation at the market open and close.

The evidence suggests that option return dynamics are driven by distinct forces, including behavioral underreaction, inventory management by market makers, and structural features of volatility trading. Collectively, these findings reinforce the view that options are not merely derivatives of stocks, but markets with their own persistent and economically meaningful return patterns.

Herding in Commodities and Cryptocurrencies

Herding behavior has been extensively studied and is well understood in equity markets, but far less so in other asset classes such as commodities and cryptocurrencies. In this post, we explore key aspects of herding behavior in crypto and commodity markets.

Investor Behavior in Crypto During Geopolitical Shocks

Herd behavior refers to the tendency of investors to follow the actions of a larger group, often ignoring their own analysis or information. This collective movement can lead to asset bubbles during bull markets and sharp sell-offs during downturns. Understanding herd behavior is essential for identifying potential mispricings and avoiding emotionally driven decisions.

Herding behavior has been well studied in the equity markets, but less so in the cryptocurrency market. One might expect stronger herding in crypto due to the prevalence of young, inexperienced traders and the fact that crypto markets are under-regulated, less transparent, and highly volatile. However, existing studies have produced inconclusive results.

Reference [1] extends the research on herding in the crypto space by examining behavior during major geopolitical events, such as the COVID-19 pandemic and the Russia–Ukraine war.

Findings

-The study finds strong evidence of market-wide herding behavior in cryptocurrency markets by analyzing the relationship between return dispersion and market returns.

-Geopolitical risk (GPR) significantly amplifies herding, with severe herding detected across nearly all model specifications.

-The GPR Threat index has a stronger impact on herding than the GPR Act index, indicating that perceived geopolitical threats matter more than realized events.

-Herding behavior is asymmetric, occurring more intensely during bearish market conditions than bullish ones.

-Imitative trading is particularly pronounced during periods of market stress, confirming the presence of asymmetric herding.

-The strongest herding effects are observed during extreme geopolitical and global events, notably the COVID-19 pandemic and the Russia–Ukraine war.

-The findings suggest that herding in cryptocurrency markets is largely intentional, reflecting low information symmetry, weak disclosure, and limited information quality.

-Actual geopolitical events (GPR Act) tend to lose explanatory power because market participants rapidly process and price in the information once it is released.

-When realized geopolitical shocks exceed investor expectations, uncertainty rises sharply and herding intensifies.

In short, the authors find that herding is present in cryptocurrency markets and intensifies during major geopolitical events.
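A common way to study the relationship between return dispersion and market returns is to regress the cross-sectional absolute deviation (CSAD) on the absolute and squared market return. The sketch below is a minimal version of that test. Whether the paper uses exactly this specification is an assumption on our part, as are the equal-weighted market return, the function names, and the synthetic panel.

```python
def csad(asset_returns):
    """Cross-sectional absolute deviation for one period.

    The market return is taken as the equal-weighted average (an assumption).
    """
    market = sum(asset_returns) / len(asset_returns)
    dispersion = sum(abs(r - market) for r in asset_returns) / len(asset_returns)
    return market, dispersion

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(i + 1, 3):
            f = m[r][i] / m[i][i]
            for c in range(i, 4):
                m[r][c] -= f * m[i][c]
    x = [0.0] * 3
    for i in (2, 1, 0):
        x[i] = (m[i][3] - sum(m[i][c] * x[c] for c in range(i + 1, 3))) / m[i][i]
    return x

def herding_regression(panel):
    """Fit CSAD_t = a + b*|Rm_t| + c*Rm_t**2 by least squares.

    panel: list of periods, each a list of asset returns.
    A significantly negative c is usually read as evidence of herding.
    """
    rows, ys = [], []
    for period in panel:
        market, disp = csad(period)
        rows.append([1.0, abs(market), market ** 2])
        ys.append(disp)
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve3(xtx, xty)

# Synthetic panel built so that dispersion grows less than linearly
# with the size of the market move (c = -2 by construction).
panel = []
for m in [-0.05, -0.03, -0.01, 0.0, 0.02, 0.04, 0.06]:
    d = 0.01 + 0.5 * abs(m) - 2.0 * m ** 2
    panel.append([m + d, m - d])
a, b, c = herding_regression(panel)
print("c =", round(c, 3))
```

Asymmetric herding can be examined with the same regression by fitting it separately on up-market and down-market periods.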

Reference

[1] Phasin Wanidwaranan, Jutamas Wongkantarakorn, Chaiyuth Padungsaksawasdi, Geopolitical risk, herd behavior, and cryptocurrency market, The North American Journal of Economics and Finance Volume 80, September 2025, 102487

Does Herding Behavior Exist in the Commodity Markets?

Herding behavior has been shown to exist in equity markets. Reference [2] examines the herding behavior in the commodity markets.

Findings

-The study investigates herding behavior in commodity ETFs using high-frequency microstructure data and a GARCH model that incorporates cross-sectional and market volatility at 15-, 30-, 45-, and 60-minute intervals.

-During periods of market instability and the COVID-19 pandemic, agricultural and metal-based ETFs generally exhibit weaker herding behavior, while energy-based ETFs tend to herd more.

-Under normal market conditions, herding typically emerges at frequencies longer than 30 minutes.

-Broad basket commodity ETFs and energy-based ETFs display herding behavior across multiple frequencies rather than at a single time scale.

-A notable exception is agricultural ETFs during the COVID-19 pandemic, where herding is observed across all frequencies, representing a key and unusual finding.

-Correlation analysis shows that commodity ETFs become less correlated with each other as time progresses.

-Lower observation frequencies are associated with weaker correlations across ETFs, except in the energy sector.

-The results suggest that herding behavior varies significantly by commodity type, market regime, and observation frequency.

The findings provide insights for investors, economists, and policymakers, particularly for designing diversification and hedging strategies and for mitigating risks such as asset price bubbles and financial instability.

Reference

[2] Ah Mand, Abdollah and Sifat, Imtiaz and Ang, Wei Kee and Choo, Jian Jing, Herding Behavior in Commodity Markets. SSRN 4502804

Closing Thoughts

Taken together, these two studies show that herding behavior extends well beyond equity markets and plays a meaningful role in both cryptocurrencies and commodity ETFs, particularly under stress. In crypto markets, herding is strongly amplified by geopolitical risk, bearish conditions, and extreme events. In commodity ETFs, herding is more nuanced and highly dependent on asset class, market regime, and trading frequency, with energy and broad commodity baskets exhibiting persistent herding, while agricultural and metal ETFs remain relatively resilient except during extreme volatility.

Overall, the evidence suggests that herding is regime-dependent, frequency-specific, and asset-class-specific, with important implications for risk management, diversification, and the design of trading and hedging strategies during periods of market stress.

Modern Pairs Trading: What Still Works and Why

Pairs trading, or statistical arbitrage (stat arb), is a classic, well-established quantitative trading strategy, and it is still in use today. I discussed its profitability in a previous post, and in this installment, we continue that discussion.

Pairs Selection Methods

Reference [1] provides a thorough review of the pairs trading literature between 2016 and 2023.

Pair selection is a critical step in pairs trading, and the paper offers a comprehensive review of the various pair selection methods used in practice. They are:

1-Distance Methods

Use SSE/SAE of normalized price differences to identify co-moving assets. Simple, intuitive, and historically profitable across markets, even after costs.

2-Cointegration Methods

Exploit long-run equilibrium relationships. Strong empirical support across equities and bonds, with advances in regime switching and external-factor integration.

3-Stochastic Control Methods

Model pairs trading as a continuous-time optimization problem. Incorporate jumps, regime changes, and stochastic volatility, showing strong performance but facing practical frictions.

4-Time Series Methods

Use GARCH, OU, and fractional OU to model short-term dynamics and volatility clustering. Adaptive thresholds improve returns; hybrid models are an emerging area.

5-Other Methods

Copulas capture tail dependence; Hurst exponent methods capture long memory; entropic approaches address model uncertainty. These improve robustness under nonlinear dynamics.

Overall, the review helps practitioners adapt stat-arb techniques to new markets and regimes. While simple methods once worked well, today’s competitive environment often requires more sophisticated approaches, though success still depends on model design, data quality, and market regime.
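As an illustration of the distance method, the sketch below normalizes each price series to start at 1, computes the sum of squared differences between every pair of normalized series, and picks the closest pair. The tickers and prices are made up, and the function names are our own; this is a minimal sketch of the idea rather than any specific implementation from the survey.

```python
from itertools import combinations

def normalize(prices):
    """Rescale a price series so that it starts at 1."""
    return [p / prices[0] for p in prices]

def sse_distance(prices_a, prices_b):
    """Sum of squared differences between two normalized price series."""
    na, nb = normalize(prices_a), normalize(prices_b)
    return sum((x - y) ** 2 for x, y in zip(na, nb))

def closest_pair(price_table):
    """Return the pair of tickers with the smallest SSE distance."""
    return min(
        combinations(sorted(price_table), 2),
        key=lambda pair: sse_distance(price_table[pair[0]], price_table[pair[1]]),
    )

# Toy formation-period prices for three hypothetical tickers.
prices = {
    "AAA": [100, 102, 101, 104, 106],
    "BBB": [50, 51, 50.4, 52.1, 53.0],
    "CCC": [20, 19, 21, 18, 22],
}
print(closest_pair(prices))
```

In a full strategy, the selected pair would then be traded when the normalized spread diverges beyond a chosen threshold during the trading period.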

Profitability of Pairs Trading

There is an ongoing debate in the literature: some argue that “pairs trading is dead,” while others maintain that it remains profitable. From this review paper [1], we learn the following.

1- Pairs trading remains profitable, but returns are weaker and more conditional

The survey explicitly notes that profitability persists, but is not uniform and depends on market conditions, costs, and implementation details:

Empirical evidence consistently shows that distance-based pairs trading can be profitable across different markets, asset classes, and time horizons.

However, this is tempered elsewhere by evidence of unstable performance:

Performance is not uniform over time: profitability tends to vary with market volatility, and Sharpe ratios decline in certain subperiods.

2- Transaction costs and competition materially erode profits

Modern profitability survives only after careful cost control, unlike the early 2000s results:

Even after accounting for realistic transaction costs, the strategy remains profitable in several markets.

3- Advanced methods outperform naïve approaches

The paper makes clear that simple Gatev-style [2] implementations are no longer sufficient:

The apparent simplicity of GGR’s strategy becomes less evident as more sophisticated models and techniques have been introduced.

And later:

Regime-switching structures … demonstrate superior performance, particularly under frequent or pronounced regime shifts.

In short, the paper does not argue that pairs trading has stopped working, but it makes clear that the simple, mechanical versions that worked in the 1990s and early 2000s no longer deliver robust returns. Profitability today is weaker, highly dependent on market regimes, and much more sensitive to transaction costs and execution. What survives is not the original Gatev–Goetzmann–Rouwenhorst method, but more adaptive, model-driven implementations that account for changing volatility, correlations, and liquidity.

Reference

[1] Sun, Y. (2025). A survey of statistical arbitrage pairs trading strategies with non-machine learning methods, 2016-2023. WNE Working Papers, 19/2025 (482). Faculty of Economic Sciences, University of Warsaw

[2] Gatev, E., Goetzmann, W., & Rouwenhorst, K. G. (2006). Pairs trading: Performance of a relative-value arbitrage rule. The Review of Financial Studies, 19(3), 797–827.

Closing Thoughts

The paper provides a thorough review of pair selection methods, which are critical to pairs trading. It also concludes that current profitability is weaker, highly dependent on market regimes, and significantly more sensitive to transaction costs and execution.

Implied vs. Realized Volatility in Delta Hedging Strategies

Delta hedging is a fundamental topic in portfolio and risk management. In this post, we discuss which volatility measure should be used in the delta hedging process, while a future edition will examine the appropriate hedging frequency and time horizon.

Which Free Lunch Would You Like Today Sir?

Reference [1] is a classic article on delta hedging. It addresses the following question: if an investor has an accurate estimate of future realized volatility that differs from current implied volatility, a position can be initiated to exploit this discrepancy and then dynamically hedged, but which volatility should be used as the input in the hedging process?

Hedging with Actual Volatility

Pros

-Hedging with actual volatility guarantees the final profit at expiration, equal to the difference between theoretical option values under actual and implied volatility.

-The final profit has zero variance, making it attractive from a long-term, global risk–reward perspective.

-Expected profit is often insensitive to small errors in the volatility used for hedging, providing some robustness to estimation error.

Cons

-Mark-to-market P&L during the life of the option can fluctuate significantly, which is problematic for short-term risk management.

-Interim P&L depends on the true drift of the underlying asset, introducing uncertainty before expiration.

-In practice, traders are rarely confident in their estimate of actual volatility, weakening the appeal of this approach.
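The guaranteed profit from hedging with actual volatility can be illustrated with a small Black-Scholes calculation: it is the option value computed at actual volatility minus the value computed at implied volatility. The sketch below assumes a European call with no dividends, that the actual volatility is known, and illustrative inputs of our own choosing.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(spot, strike, rate, vol, maturity):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt(maturity))
    d2 = d1 - vol * sqrt(maturity)
    return spot * norm_cdf(d1) - strike * exp(-rate * maturity) * norm_cdf(d2)

def guaranteed_profit(spot, strike, rate, actual_vol, implied_vol, maturity):
    """Value today of the profit from buying the option at implied vol and
    hedging with the (assumed known) actual vol until expiration."""
    return (bs_call(spot, strike, rate, actual_vol, maturity)
            - bs_call(spot, strike, rate, implied_vol, maturity))

# Illustrative inputs: at-the-money one-year call, implied 20%, actual 30%.
profit = guaranteed_profit(100, 100, 0.05, 0.30, 0.20, 1.0)
print(round(profit, 2))
```

The total is fixed in advance under these assumptions, but the calculation says nothing about the mark-to-market path taken to get there, which is the main drawback listed above.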

Hedging with Implied Volatility

Pros

-Mark-to-market P&L evolves smoothly with no random fluctuations, which is advantageous for daily risk monitoring.

-The trader only needs to be directionally correct about volatility (i.e., actual > implied or vice versa), not to estimate actual volatility precisely.

-Implied volatility is directly observable from the market, simplifying implementation.

Cons

-The final profit is path-dependent and therefore uncertain at inception.

-Profits are positive whenever the volatility view is directionally correct, but their magnitude cannot be known in advance.

-Profitability depends on the realized price path, particularly whether the underlying remains near regions of high gamma.
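The path dependence of hedging with implied volatility can be seen in a short simulation. Under Black-Scholes assumptions, the hedged position accrues profit at a rate of one half times the difference in variances times the squared spot price times gamma, with gamma evaluated at implied volatility. The sketch below accumulates that quantity along simulated price paths. The parameter values, the daily time step, and the function names are illustrative assumptions.

```python
import random
from math import exp, log, pi, sqrt

def bs_gamma(spot, strike, rate, vol, tau):
    """Black-Scholes gamma of a European option with time to expiry tau."""
    d1 = (log(spot / strike) + (rate + 0.5 * vol ** 2) * tau) / (vol * sqrt(tau))
    return exp(-0.5 * d1 ** 2) / (sqrt(2.0 * pi) * spot * vol * sqrt(tau))

def implied_hedge_pnl(seed, spot=100.0, strike=100.0, rate=0.05, drift=0.10,
                      actual_vol=0.30, implied_vol=0.20, maturity=1.0, steps=252):
    """Discounted P&L of a long option delta-hedged at implied volatility.

    Accumulates 0.5 * (actual^2 - implied^2) * S^2 * gamma * dt along one
    simulated path, with gamma evaluated at implied volatility.
    """
    rng = random.Random(seed)
    dt = maturity / steps
    pnl = 0.0
    for i in range(steps):
        t = i * dt
        gamma = bs_gamma(spot, strike, rate, implied_vol, maturity - t)
        pnl += (exp(-rate * t) * 0.5 * (actual_vol ** 2 - implied_vol ** 2)
                * spot ** 2 * gamma * dt)
        # Advance the underlying one step under geometric Brownian motion.
        z = rng.gauss(0.0, 1.0)
        spot *= exp((drift - 0.5 * actual_vol ** 2) * dt + actual_vol * sqrt(dt) * z)
    return pnl

pnls = [implied_hedge_pnl(seed) for seed in range(5)]
print([round(p, 2) for p in pnls])
```

Each path produces a different total because gamma is largest when the underlying stays near the strike, so paths that drift away from the strike accumulate less profit.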

Reference

[1] R Ahmad, P Wilmott, Which Free Lunch Would You Like Today Sir?, Wilmott, 2005

Delta Hedging with Implied vs. Historical Volatility

Similar to the previous paper, Reference [2] examines the effectiveness of hedging using implied versus realized volatility. The study is based on empirical analysis using index ETF options, specifically the Nasdaq-100 ETF (QQQ).

Findings

-The study examines the role of volatility estimation in delta-neutral hedging, with a focus on short-term options trading and risk management.

-It empirically compares implied volatility (IV) and historical volatility (HV) using Nasdaq-100 ETF (QQQ) options over several months of daily data.

-The analysis evaluates hedging performance, return stability, transaction costs, hedging errors, and sensitivity under varying market volatility conditions.

-Results show that IV-based hedging delivers more stable returns, lower return volatility, and better risk mitigation, making it more suitable for conservative and risk-averse investors.

-IV-based strategies benefit from forward-looking, market-implied inputs, which improve delta accuracy, reduce rebalancing frequency, and lower transaction costs.

-HV-based hedging can generate higher potential returns but exhibits greater variability, larger hedging errors, and higher portfolio risk, particularly during volatile markets.

-Sensitivity tests confirm that IV adapts more effectively to changing market conditions than HV.

-The study highlights a clear trade-off between stability and return potential, emphasizing that volatility measure selection should depend on market conditions and risk preferences.

The findings provide practical guidance for traders and risk managers and contribute to the literature on optimal volatility modeling under real-world constraints. Though the paper has some limitations, notably the small sample size, this research direction is worth pursuing, particularly in establishing a delta band and determining the optimal hedging frequency.

Reference

[2] Yimao Zhao, Implied Volatility vs. Historical Volatility: Evaluating the Effectiveness of Delta-Neutral Hedging Strategies, Proceedings of the 2025 5th International Conference on Enterprise Management and Economic Development (ICEMED 2025)

Closing Thoughts

Taken together, these two studies highlight that the choice of volatility input is an important decision in delta hedging, rather than a technical detail. Both papers show that implied volatility, with its forward-looking and market-based nature, generally delivers more stable hedging performance, lower tracking errors, and better risk control, particularly in short-term and actively rebalanced strategies.

Historical or realized volatility, while simpler and sometimes effective in calmer market regimes, tends to lag during volatility shifts and leads to larger hedging errors. The broader implication for practitioners is that effective delta hedging requires aligning the volatility measure with market conditions, risk tolerance, and trading horizon, rather than relying on a one-size-fits-all approach.

Risk, Leverage, and Optimal Betting in Financial Markets

Most research in portfolio management focuses on alpha generation; however, another critical component of portfolio construction is position sizing. In this post, we examine key considerations in position sizing, including the Kelly criterion and the martingale betting system.

Does Kelly Portfolio Outperform the Market?

One method for capital allocation and position sizing is the Kelly criterion. The Kelly criterion maximizes the expected growth rate of capital, which is equivalent to maximizing the expected logarithm of wealth. The approach originates in John Kelly’s paper, “A New Interpretation of Information Rate.” According to Kelly, in repeated bets, a bettor should act to maximize the expected growth rate of capital, that is, the expected logarithm of terminal wealth rather than expected wealth itself.

Reference [1] applies Thorp’s approach, as outlined in “The Kelly Criterion in Blackjack, Sports Betting and the Stock Market” [2], to construct a portfolio in the Norwegian stock market. The formula computes the optimal investment fractions for a set of assets from their expected excess returns and the inverse of the variance-covariance matrix.
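For intuition, the sketch below applies this formula to two assets: the vector of Kelly fractions is the inverse of the variance-covariance matrix multiplied by the vector of expected excess returns. The inputs are hypothetical annualized numbers chosen for illustration, and the hand-written 2x2 inverse is used only to keep the example self-contained.

```python
def kelly_fractions_2x2(excess_returns, cov):
    """Kelly weights f = inverse(cov) * excess_returns for two assets."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    return [
        inv[0][0] * excess_returns[0] + inv[0][1] * excess_returns[1],
        inv[1][0] * excess_returns[0] + inv[1][1] * excess_returns[1],
    ]

# Hypothetical annualized inputs, for illustration only.
mu = [0.06, 0.04]                      # expected excess returns
sigma = [[0.04, 0.01], [0.01, 0.09]]   # variance-covariance matrix
weights = kelly_fractions_2x2(mu, sigma)
print([round(w, 3) for w in weights])
```

With these inputs the fractions sum to more than 1, meaning the full Kelly allocation is leveraged. Because the weights are very sensitive to the estimated returns and covariances, practitioners often scale them down.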

Findings

-The study evaluates the performance of a growth-optimal Kelly portfolio in the Norwegian stock market over the period February 2003 to December 2022.

-It assesses abnormal performance using the CAPM, Fama–French three-factor model, and Carhart four-factor model.

-The Kelly portfolio achieves a higher compound annual growth rate (14.1%) and higher ending wealth than the benchmark index, which grows at 12%.

-It also outperforms a Markowitz portfolio, which delivers lower growth and final wealth.

-The Kelly portfolio and the benchmark exhibit similar Sharpe ratios (0.58), while the Kelly portfolio attains a higher Sortino ratio (0.95).

-Factor regressions indicate an annualized alpha of 16.8% for the Kelly portfolio, statistically significant at the 1% level before transaction costs.

-However, the factor models display very low explanatory power, suggesting that the estimated alpha may be overstated.

-Once transaction costs are incorporated, the Kelly portfolio no longer outperforms the benchmark in terms of final wealth.

-After costs, the alpha remains only marginally significant at the 10% level, implying limited real-world risk-adjusted excess returns.
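The gap between the Sharpe and Sortino ratios reported above comes from how each ratio measures risk: the Sharpe ratio divides by total volatility, while the Sortino ratio divides only by downside deviation. The sketch below shows one common way to compute both. The monthly annualization, the zero target return, and the sample returns are assumptions, and the thesis may use slightly different definitions.

```python
from math import sqrt
from statistics import mean, stdev

def sharpe_ratio(excess_returns, periods_per_year=12):
    """Annualized Sharpe ratio from periodic excess returns."""
    return mean(excess_returns) / stdev(excess_returns) * sqrt(periods_per_year)

def sortino_ratio(excess_returns, periods_per_year=12):
    """Annualized Sortino ratio: mean excess return over downside
    deviation, measured against a target of zero."""
    downside = sqrt(mean([min(r, 0.0) ** 2 for r in excess_returns]))
    return mean(excess_returns) / downside * sqrt(periods_per_year)

# Hypothetical monthly excess returns.
returns = [0.03, -0.01, 0.02, 0.04, -0.02, 0.01]
sharpe = sharpe_ratio(returns)
sortino = sortino_ratio(returns)
print(round(sharpe, 2), round(sortino, 2))
```

A portfolio whose volatility comes mostly from upside moves will show a Sortino ratio well above its Sharpe ratio, which is consistent with the pattern reported for the Kelly portfolio.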

This paper presents several interesting findings:

-First, the correlation of the Kelly portfolio with the market is nearly zero.

-Second, the performance is sensitive to transaction costs. We believe that with lower transaction costs, the Kelly portfolio has the potential to outperform the market and display zero correlation with it.

-Third, the Kelly portfolio surpasses the Markowitz mean-variance portfolio in performance.

We also concur with the author that the utilization of options can further enhance the risk-adjusted return.

Reference

[1] Jon Endresen and Erik Grødem, The Kelly criterion, an empirical study of the growth optimal Kelly portfolio, backtested on the Oslo Stock Exchange, 2023, Norwegian School of Economics.

[2] Thorp, E. O., The Kelly Criterion in Blackjack Sports Betting and the Stock Market, in: Zenios, S.A. & Ziemba, W.T., Handbook of Asset and Liability Management, Volume 1, 387–428, 2006

Enhanced Martingale Betting System with Stop Policy

The martingale betting system is a popular gambling strategy that involves doubling one’s wager after each loss in the pursuit of recovering previous losses and securing a profit equal to the original bet. The underlying idea is that, statistically, a win will eventually occur, allowing the player to recoup losses and gain a net profit equal to the initial stake. While simple in concept, the martingale system carries inherent risks, as it assumes unlimited funds for doubling bets and disregards the fact that losing streaks can persist longer than expected. With finite capital, a sufficiently long losing streak will eventually result in bankruptcy.
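The arithmetic behind this risk is easy to check. After k consecutive losses, a bettor who doubles each time has lost the base bet times (2^k - 1) and must stake the base bet times 2^k on the next round. The sketch below counts how many doubled bets a finite bankroll can fund; the bankroll and base bet are arbitrary illustrative values.

```python
def martingale_losing_streak(base_bet, bankroll):
    """Number of consecutive losing bets a doubling bettor can fund.

    After k straight losses the cumulative loss is base_bet * (2**k - 1),
    and the next required bet is base_bet * 2**k.
    """
    losses, k = 0.0, 0
    while losses + base_bet * 2 ** k <= bankroll:
        losses += base_bet * 2 ** k
        k += 1
    return k, losses

streak, lost = martingale_losing_streak(base_bet=1, bankroll=1000)
print(streak, lost)
```

With a bankroll of 1,000 units and a base bet of 1, the bettor can fund only nine doubled bets. In a fair coin-flip game, nine straight losses occur with probability 1/512 per sequence, which becomes likely over many repetitions.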

Reference [3], however, notes that views differ on whether stock price movements adhere strictly to a random walk, which is often modeled as a geometric Brownian motion. This leaves room for enhancing the martingale betting system. The authors introduce an enhanced martingale betting system that includes a stop policy.

Findings

-The paper proposes an Improved Martingale Betting System (IMBS) by modifying the traditional martingale strategy with a stop policy and adapting it from casino gambling to intraday trading.

-The IMBS is empirically tested using TAIEX (TX) futures across three intraday trading strategies.

-Results show that the IMBS delivers strong performance and is applicable to TX intraday trading and related markets.

-The study finds that returns increase with leverage up to a certain threshold, beyond which traditional martingale strategies face a high probability of bankruptcy.

-By controlling key parameters—specifically leverage scaling (a), the number of steps (n), and total leverage—the IMBS significantly outperforms both the Equal-Weight Betting System (EWBS) and the traditional Martingale Betting System (MBS).

-The inclusion of a stop-loss mechanism further improves performance and risk control.

-Empirical tests indicate that IMBS performs particularly well when combined with price breakout strategies, which are identified as the most profitable approach for TX intraday trading.

In short, after testing on real data, the article concludes that:

-The conventional martingale betting system inevitably leads to bankruptcy.

-With the integration of a stop policy, the improved martingale betting system performs better.
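To see why a stop policy and a smaller scaling factor limit risk, the sketch below computes the bet schedule and worst-case loss for a scaled martingale that stops after a fixed number of steps. Mapping the paper's leverage scaling (a) and number of steps (n) onto `scale` and `max_steps` is our own simplification; the paper's exact trading rules are not reproduced here.

```python
def bet_schedule(base_bet, scale, max_steps):
    """Bet sizes for a scaled martingale that stops after max_steps losses.

    scale plays the role of the leverage scaling a and max_steps the
    number of steps n; scale=2 with no cap is the classical martingale.
    """
    return [base_bet * scale ** k for k in range(max_steps)]

def worst_case_loss(base_bet, scale, max_steps):
    """Total loss if every bet in one capped sequence loses."""
    return sum(bet_schedule(base_bet, scale, max_steps))

print(bet_schedule(1, 1.5, 4))
print(worst_case_loss(1, 1.5, 4))
print(worst_case_loss(1, 2, 4))
```

Capping the sequence turns an unbounded exposure into a known maximum loss per sequence, which can then be sized against total capital and leverage limits.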

Reference

[3] Ting-Yuan Chen, and Szu-Lang Liao, Improved Martingale Betting System for Intraday Trading in Index Futures—Evidence of TAIEX Futures, Asian Journal of Economics and Business, Year:2023, Vol.4 (2), PP.339-366

Closing Thoughts

Taken together, the two studies highlight the trade-off between growth maximization and risk control in position sizing. The Kelly-based approach demonstrates strong theoretical and empirical growth performance, but its apparent alpha weakens once transaction costs and model limitations are accounted for, raising questions about real-world applicability. By contrast, the Improved Martingale Betting System shows that disciplined leverage control and stop policies can materially improve intraday trading outcomes relative to naive martingale schemes, especially when combined with breakout strategies. Overall, both strands of research suggest that position sizing is as critical as signal generation, and that practical constraints, parameter calibration, and market frictions ultimately determine whether theoretically attractive sizing rules translate into sustainable performance.