Exploring the Factor Zoo With a Machine-Learning Portfolio

Uncovering key factor patterns in stock return prediction with ML.

Hi! Here's Iván with this week's exciting newsletter, brimming with insights and discoveries on building robust investment strategies and risk models using Machine Learning.

In this edition, I am presenting the following sections:

  • 🕹️ AI-Finance Insights: I summarize two must-read academic papers that mix cutting-edge ML/DL with Asset Pricing & Quant Finance:

    • Exploring the Factor Zoo With a Machine-Learning Portfolio

    • Identifying Factors via Automatic Debiased Machine Learning

  • 💊 AI Essentials: The section on top AI & Quant Finance learning resources: Today, I’m introducing the Artificial Intelligence Full Course 2024 by Simplilearn. This in-depth, 7-hour video covers essential AI and machine learning concepts, making it a perfect resource for mastering AI fundamentals, from neural networks to real-world applications.

  • 🥐 Asset Pricing Insights: In this edition, I introduce a great paper that examines the question: How Global is ML Predictability? Discover how global models outperform local ones in stock return prediction, offering a powerful new approach to optimize asset pricing strategies across different markets.

“Exploring the Factor Zoo With a Machine-Learning Portfolio”

📢 Uncovering key patterns in stock return prediction with ML.

This paper presents a novel approach to understanding the factor zoo by combining ML and portfolio analysis to predict stock returns with greater accuracy.

The main contributions and findings are as follows:

👉 Machine Learning Portfolio Construction: The paper uses ML models trained on 106 characteristics to predict stock returns and reverse-engineer the important, yet often unobservable, characteristics driving returns.

👉 Significant Alpha Generation: The ML portfolio outperforms entrenched factor models like Fama-French (FF3, FF5) and Carhart (C4), generating an annualized alpha (αml) between 17.16% and 29.76%, highlighting its superiority in forecasting stock returns.

👉 Dominant Factors Identified: Despite training on 106 characteristics, only two small subsets (3 to 4 characteristics) alternate in dominance within the ML portfolio. These subsets serve as proxies for investor arbitrage constraints and firm financial constraints, revealing key drivers behind stock return performance.

👉 Robust Performance Across Models: Even when benchmarked against traditional factor models and mimicking portfolios, the ML portfolio maintains significant positive alpha. It consistently outperforms, driven by time-varying portfolio weights in dominant characteristics.
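To make the alpha numbers above more concrete, here is a minimal sketch of how an annualized alpha against a factor model is typically estimated: regress the portfolio's excess returns on the factor returns and scale the intercept. This is not the paper's code. The data are synthetic, the function name `annualized_alpha` is hypothetical, monthly returns are assumed, and the annualization is the simple "multiply by 12" convention.

```python
import numpy as np

def annualized_alpha(portfolio_excess, factor_returns, periods_per_year=12):
    """OLS of portfolio excess returns on factor returns.

    Returns (annualized intercept, factor loadings).
    Annualization is a simple multiplication (an assumption, not compounding).
    """
    y = np.asarray(portfolio_excess, dtype=float)
    F = np.asarray(factor_returns, dtype=float)
    if F.ndim == 1:
        F = F[:, None]
    # Design matrix: a constant (for alpha) plus one column per factor
    X = np.column_stack([np.ones(len(y)), F])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0] * periods_per_year, coef[1:]

# Synthetic illustration: 600 months, three made-up factors
rng = np.random.default_rng(0)
T = 600
factors = rng.normal(0.005, 0.03, size=(T, 3))
true_betas = np.array([1.0, 0.3, -0.2])
true_monthly_alpha = 0.015  # 18% per year under the simple convention
ml_portfolio = true_monthly_alpha + factors @ true_betas + rng.normal(0, 0.01, T)

alpha_annual, betas = annualized_alpha(ml_portfolio, factors)
print(f"annualized alpha: {alpha_annual:.2%}")
print("factor loadings:", np.round(betas, 2))
```

With real data, `factors` would hold the FF3, FF5, or C4 factor returns, and a significant positive intercept is what "outperforming the factor model" means in practice.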

“Identifying Factors via Automatic Debiased Machine Learning”

📢 How to Find Useful Factors from the Factor Zoo. Keep reading! 🔻

This paper introduces an approach based on Automatic Debiased Machine Learning (ADML) to robustly identify the risk factors that drive cross-sectional asset returns, addressing the bias and overfitting issues found in traditional ML methods.

The main contributions and findings are as follows:

👉 Machine Learning for Factor Identification: The ADML method robustly estimates the partial pricing effect of each factor, controlling for over 150 confounding factors under a nonlinear Stochastic Discount Factor (SDF) model, identifying about 30 to 50 significant factors from the factor zoo.

👉 Superior Factor Identification: ADML uncovers factors like net debt finance, operating leverage, and 36-month momentum as key drivers of returns, outperforming both Fama-French models and other factor selection methods like LASSO, especially under nonlinear market conditions.

👉 Robust Performance in Asset Pricing: ADML significantly reduces estimation bias and provides asymptotically normal results, identifying more significant factors (30-50) compared to traditional linear models that typically identify fewer than 15.

👉 International Robustness: ADML proves effective beyond U.S. markets, identifying sentiment-driven factors in China’s stock market, highlighting its potential across diverse global markets.

👉 Spanning Test Success: ADML-based models outperform Fama-French models in factor explanation, especially excelling in categories like intangibles, demonstrating ADML’s superior ability to explain risk factors in asset pricing.
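The idea of estimating a factor's partial pricing effect while controlling for many confounders can be illustrated with a small sketch. Note that this is the standard cross-fitted "partialling-out" form of double/debiased ML with a ridge learner, swapped in for illustration. It is not the ADML estimator from the paper, and it uses a linear setup rather than a nonlinear SDF. The data are synthetic and the function names are hypothetical.

```python
import numpy as np

def ridge_fit_predict(X_train, y_train, X_test, penalty=1.0):
    """Fit ridge regression (no intercept) on the training fold, predict on the test fold."""
    k = X_train.shape[1]
    coef = np.linalg.solve(X_train.T @ X_train + penalty * np.eye(k),
                           X_train.T @ y_train)
    return X_test @ coef

def cross_fitted_partial_effect(y, d, controls, n_folds=5, penalty=1.0, seed=0):
    """Partial effect of d on y after removing what the controls explain.

    Nuisance predictions are always made out of fold (cross-fitting), and the
    effect is the slope of residualized y on residualized d.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        y_res[test_idx] = y[test_idx] - ridge_fit_predict(
            controls[train], y[train], controls[test_idx], penalty)
        d_res[test_idx] = d[test_idx] - ridge_fit_predict(
            controls[train], d[train], controls[test_idx], penalty)
    theta = (d_res @ y_res) / (d_res @ d_res)
    # Heteroskedasticity-robust standard error for the residual-on-residual slope
    eps = y_res - theta * d_res
    se = np.sqrt(np.sum(d_res**2 * eps**2)) / (d_res @ d_res)
    return theta, se

# Synthetic illustration: one candidate factor, 50 correlated controls
rng = np.random.default_rng(42)
n, p = 2000, 50
controls = rng.normal(size=(n, p))
gamma = rng.normal(0, 0.2, p)   # how controls drive the candidate factor
beta = rng.normal(0, 0.2, p)    # how controls drive returns
candidate = controls @ gamma + rng.normal(size=n)
returns = 0.5 * candidate + controls @ beta + rng.normal(size=n)

theta, se = cross_fitted_partial_effect(returns, candidate, controls)
print(f"partial effect: {theta:.3f} (s.e. {se:.3f}), true value 0.5")
```

The out-of-fold residualization is what keeps the estimate approximately normal even when the nuisance models are fitted with regularized ML, which is the property that makes t-statistics on each candidate factor usable.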

AI-Essentials

In this video, you'll dive into the Artificial Intelligence Full Course 2024 by Simplilearn. Covering over 7 hours of content, this comprehensive course explores AI and machine learning in depth, providing a step-by-step guide to mastering these cutting-edge technologies.

Asset Pricing Insights

“The Power of Financial Transfer Learning”

📢 How Global is ML Predictability? Discover how global models outperform local ones in stock return prediction. Keep reading! 🔻

This paper introduces a breakthrough in using financial transfer learning to predict stock returns more effectively by utilizing global models, even in the absence of local data.

The main contributions and findings are as follows:

👉 The paper demonstrates that a common global model predicts stock returns more accurately than models estimated individually for each country. The global component accounts for roughly 94% of the predictive power across countries and time, suggesting a high degree of global predictability in asset pricing.

👉 The authors develop a "generalized elastic net" (GENet), which efficiently transfers global information to local countries while also detecting unique local elements. The GENet model significantly enhances return prediction accuracy compared to purely local or global models.

👉 The global model achieves a much higher out-of-sample R-squared and better Sharpe ratios than local models. Even a model trained on U.S. data alone outperforms country-specific models, including in countries with limited data.

👉 The predictive power of the global model has remained stable over the past century: the predictive function is 63% correlated across five distinct time periods, which supports the consistency of global factors in asset pricing.

👉 While global models dominate, local effects do exist but add only modest improvement. The global component makes up approximately 94% of the predictive power, leaving little room for purely local predictability.
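To give a feel for how global information can be transferred to a country with little data, here is a simplified sketch: a ridge regression that shrinks the local coefficients toward a global estimate instead of toward zero. This is a stand-in for illustration only and is not the paper's GENet. The data are synthetic, and the function name `shrink_toward_global` and the penalty value are assumptions.

```python
import numpy as np

def shrink_toward_global(X, y, global_coef, penalty):
    """Solve argmin_b ||y - X b||^2 + penalty * ||b - global_coef||^2.

    penalty = 0 gives the purely local OLS fit; a very large penalty
    returns (almost exactly) the global coefficients.
    """
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + penalty * np.eye(k),
                           X.T @ y + penalty * global_coef)

rng = np.random.default_rng(1)
k = 20
global_true = rng.normal(0.0, 0.1, k)

# Large pooled "global" sample used to estimate the global coefficients
X_glob = rng.normal(size=(5000, k))
y_glob = X_glob @ global_true + rng.normal(size=5000)
global_coef = np.linalg.lstsq(X_glob, y_glob, rcond=None)[0]

# Small local sample whose true coefficients deviate slightly from the global ones
local_true = global_true + rng.normal(0.0, 0.05, k)
X_loc = rng.normal(size=(40, k))
y_loc = X_loc @ local_true + rng.normal(size=40)
X_test = rng.normal(size=(2000, k))
y_test = X_test @ local_true + rng.normal(size=2000)

local_only = shrink_toward_global(X_loc, y_loc, global_coef, penalty=0.0)
transfer = shrink_toward_global(X_loc, y_loc, global_coef, penalty=50.0)

def mse(coef):
    """Out-of-sample mean squared error on the local test set."""
    return float(np.mean((y_test - X_test @ coef) ** 2))

print(f"local-only test MSE: {mse(local_only):.3f}")
print(f"transfer   test MSE: {mse(transfer):.3f}")
print(f"global-only test MSE: {mse(global_coef):.3f}")
```

In this toy setup the local deviations are small relative to the noise, so leaning on the global estimate helps the data-poor country, which mirrors the newsletter's point that local effects exist but add only modest improvement.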

If you're enjoying our newsletter and want to support us, please recommend it to anyone you know who's interested in AI and Finance. Your referrals are the biggest compliment and help us grow! 🌟🤖💼