Maximizing Portfolio Predictability with Machine Learning

Applying Machine Learning predictions to optimize stock selection.

Hi! Here's Iván with this week's exciting newsletter, brimming with insights and discoveries on building robust investment strategies and risk models using Machine Learning.

In this edition, I am presenting the following sections:

  • 🕹️ AI-Finance Insights: I summarize three must-read academic papers that mix cutting-edge ML/DL with Quant Finance:

    • Maximizing Portfolio Predictability with Machine Learning

    • Supervised Autoencoder MLP for Financial Time Series Forecasting

    • Quantifying Credit Portfolio sensitivity to asset correlations with interpretable generative neural networks

  • 💊 AI Essentials: The section on top AI & Quant Finance learning resources: Today, I'm excited to share a 1-hour video that explains how Large Language Models (LLMs) work! This talk is particularly beneficial for those of you with no prior knowledge of LLMs' technicalities.

  • 🥐 Methodological Insights: In this edition, I recommend a paper that addresses the potential misuse of well-known cross-validation techniques in developing trading strategies.

Today’s Sponsor

How did @TheArbFather make $10,000+ profit in February?

His secret sauce: Arbitrage betting.

Instead of getting betting tips from slick-talking handicappers, TheArbFather bets on both sides of an outcome to guarantee a risk-free return. An example of this is betting on a game total to be over 224.5 AND under 224.5.

This might sound too good to be true, but the reason this can happen is sportsbooks set their lines for games independently. Sometimes, they make mistakes and there are situations where FanDuel’s odds are different from DraftKings’ odds.

Unless you are a PhD-wielding, Python-coding, Excel-Wizard… finding arbitrage opportunities consistently has been out of reach. Until now…

OddsJam scans millions of odds every second and finds these needle-in-a-haystack opportunities that you can bet on to secure a risk-free profit.

AI-Finance Insights

“Maximizing Portfolio Predictability with Machine Learning”

This paper innovates by using machine learning to create highly predictable stock portfolios (MPPs) from the S&P 500, aiming to boost returns by optimizing stock choices and portfolio weights.

👇 My key takeaways from the paper are:

➡ The research introduces a way to construct the portfolio whose returns are most predictable, called the maximally predictable portfolio (MPP), from S&P 500 stocks using machine learning. Specifically, it uses the return forecasts produced by the machine learning models to drive stock selection.

➡ By applying machine learning models like Elastic Net, Random Forest, and Support Vector Regression, the study tests various approaches to forecasting stock returns and evaluates their effectiveness in outperforming the S&P 500 index across different periods.

➡ A significant part of the study is dedicated to optimizing portfolio weights under realistic constraints, using an efficient algorithm. This optimization is aimed at maximizing the predictability of returns, thereby potentially increasing investment gains.

➡ The paper finds that portfolios designed around the concept of MPP, especially those employing strategies like the Kelly criterion for leveraging predictability, consistently outperform the benchmark index.

➡ Interestingly, the study revisits a foundational concept by Lo and MacKinlay (1997) on portfolio predictability and extends it by incorporating advanced machine learning models to forecast returns and optimize portfolio weights more effectively.

➡ Two types of MPP-based portfolios are highlighted: one that focuses on stocks predicted to have the highest returns and another that combines the MPP with risk-free assets for optimized reward-risk timing. Both approaches show superior performance compared to the broader market index.
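To make the MPP idea concrete, here is a minimal Python sketch of the unconstrained Lo and MacKinlay (1997) setup: pick weights w that maximize the ratio of forecast variance to realized-return variance, (w' Cov_pred w) / (w' Cov_ret w), which reduces to a generalized eigenvalue problem. Treat it as an illustration under my own assumptions, not the paper's algorithm: the paper optimizes under realistic weight constraints, while the function names (`predictability`, `mpp_weights`) and the simulated data below are hypothetical.

```python
import numpy as np

def predictability(w, cov_pred, cov_ret):
    """Share of portfolio return variance explained by the forecasts."""
    return float(w @ cov_pred @ w) / float(w @ cov_ret @ w)

def mpp_weights(cov_pred, cov_ret):
    """Unconstrained maximally predictable portfolio (illustrative only).

    Solves cov_pred w = lambda * cov_ret w for the largest lambda by
    whitening with the Cholesky factor of cov_ret.
    """
    chol = np.linalg.cholesky(cov_ret)           # cov_ret = chol @ chol.T
    chol_inv = np.linalg.inv(chol)
    whitened = chol_inv @ cov_pred @ chol_inv.T  # symmetric by construction
    whitened = (whitened + whitened.T) / 2       # remove rounding asymmetry
    eigvals, eigvecs = np.linalg.eigh(whitened)  # eigenvalues in ascending order
    w = chol_inv.T @ eigvecs[:, -1]              # map back to asset space
    w = w / np.abs(w).sum()                      # scale to gross exposure of 1
    return w, float(eigvals[-1])

# Simulated example (assumption): four assets with decreasing signal strength.
rng = np.random.default_rng(0)
forecasts = rng.normal(size=(500, 4)) * np.array([0.02, 0.01, 0.005, 0.001])
returns = forecasts + rng.normal(scale=0.02, size=(500, 4))

cov_pred = np.cov(forecasts, rowvar=False)
cov_ret = np.cov(returns, rowvar=False)
weights, max_r2 = mpp_weights(cov_pred, cov_ret)
print("MPP weights:", np.round(weights, 3))
print("Maximal predictability:", round(max_r2, 3))
```

The predictability ratio does not depend on the scale of the weights, so the normalization at the end is only a convention. Adding long-only or position-size constraints turns this into a harder optimization problem, which is where the paper's algorithm comes in.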

“Supervised Autoencoder MLP for Financial Time Series Forecasting”

It is uncommon to see papers using intraday data in academia; this is an exception with an interesting approach to the use of supervised autoencoders.

👇 Key takeaways from the paper are:

➡ The paper shows the improvement of forecasting for investment strategies using supervised autoencoders, focusing on the impact of noise augmentation and triple barrier labeling on risk-adjusted returns. The analysis includes the S&P 500, EUR/USD, and BTC/USD.

➡ Results show that optimal noise levels and bottleneck sizes significantly boost strategy effectiveness, while excessive noise and oversized bottlenecks deteriorate performance, highlighting the importance of precise hyperparameter adjustments.

➡ The research introduces a new optimization metric developed for use with triple barrier labeling, which improves classifier performance beyond simple direction classifications, suggesting a refined approach to algorithmic investment strategies that utilize high-frequency data.

➡ The findings have important implications for financial institutions and regulators, suggesting that the adoption of these methods could improve market stability and protect investors, while also encouraging more knowledgeable and strategic investment decisions across various financial sectors.

➡ The methodology shifts from using traditional daily price models to minute-by-minute data, improving the accuracy and relevance of investment signals. This approach is supported by empirical testing and sensitivity analysis, which confirm the robustness of the results.
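If triple barrier labeling is new to you, the short Python sketch below shows the basic mechanics: for each bar, look ahead a fixed number of bars and label it +1 if an upper profit barrier is touched first, -1 if a lower stop barrier is touched first, and 0 if the time limit expires before either. It is a simplified illustration, not the paper's implementation: I assume fixed percentage barriers, and the function name `triple_barrier_labels` and its parameters are hypothetical.

```python
import numpy as np

def triple_barrier_labels(prices, up=0.02, down=0.02, horizon=5):
    """Label each bar by which barrier is touched first (illustrative).

    +1: upper barrier (+up) hit first
    -1: lower barrier (-down) hit first
     0: neither hit within `horizon` bars (vertical barrier)
    Bars near the end of the series use a truncated look-ahead window.
    """
    prices = np.asarray(prices, dtype=float)
    n = len(prices)
    labels = np.zeros(n, dtype=int)
    for t in range(n):
        window = prices[t + 1 : t + 1 + horizon]
        rets = window / prices[t] - 1.0              # forward returns from bar t
        hit_up = np.flatnonzero(rets >= up)
        hit_down = np.flatnonzero(rets <= -down)
        first_up = hit_up[0] if hit_up.size else horizon
        first_down = hit_down[0] if hit_down.size else horizon
        if first_up < first_down:
            labels[t] = 1
        elif first_down < first_up:
            labels[t] = -1
    return labels

prices = [100, 101, 103, 99, 97, 100]
print(triple_barrier_labels(prices, up=0.02, down=0.02, horizon=2))
```

In practice the barrier widths are often scaled by recent volatility rather than fixed, and the three-class labels then replace a plain up/down target when training the classifier.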

“Quantifying Credit Portfolio sensitivity to asset correlations with interpretable generative neural networks”

I like to see how ML/DL is beginning to gain a lot of traction in credit risk models. This paper is a great example of that. In my experience, the performance of ML/DL in risk applications is outstanding. Unfortunately, in investing, I'd say the improvement is only incremental. 🙂

The paper in a nutshell here 👇

✔ The paper introduces a new method to measure the sensitivity of credit portfolio Value-at-Risk (VaR) to asset correlations, utilizing synthetic financial correlation matrices created by deep learning.

✔ Unlike previous methods using Generative Adversarial Networks (GANs) for matrix generation, this research applies Variational Autoencoders (VAE) to produce matrices with a more interpretable latent space.

✔ The findings highlight VAE's effectiveness in identifying key diversification factors, especially in how credit portfolio risk sensitivity responds to changes in asset correlations.
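To give a feel for what "VaR sensitivity to asset correlations" means, here is a small Monte Carlo sketch in Python. It is not the paper's method: instead of VAE-generated matrices, I assume simple equicorrelation matrices, a Gaussian latent-asset default model with equal exposures, and 100% loss given default. The names `credit_var` and `equicorrelation` are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def equicorrelation(n, rho):
    """n x n correlation matrix with the same off-diagonal value rho."""
    return np.full((n, n), rho) + (1.0 - rho) * np.eye(n)

def credit_var(corr, pd=0.02, n_sims=20000, level=0.99, seed=0):
    """Simulated VaR of the default fraction for a given asset correlation matrix.

    An obligor defaults when its standard normal latent asset value falls
    below the threshold implied by the default probability `pd`.
    """
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(corr)
    # Correlated latent asset values: each row is one simulated scenario.
    assets = rng.standard_normal((n_sims, corr.shape[0])) @ chol.T
    threshold = NormalDist().inv_cdf(pd)
    loss = (assets < threshold).mean(axis=1)   # fraction of obligors defaulting
    return float(np.quantile(loss, level))

for rho in (0.05, 0.20, 0.40):
    var = credit_var(equicorrelation(50, rho))
    print(f"rho={rho:.2f}  99% VaR of default fraction: {var:.3f}")
```

Swapping `equicorrelation` for a set of realistic synthetic matrices, and tracking how VaR moves across them, is the kind of exercise the paper carries out with its generative model.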

AI-Essentials

Discover an engaging 1-hour video that explains how Large Language Models (LLMs) work! This talk is especially beneficial for those of you with no previous knowledge of LLMs' technicalities. Whether you're just starting out and curious about the fundamentals or seeking to grasp the basic concepts, this video is the perfect introduction for you. 👇

Methodological Insights

“Backtest Overfitting in the Machine Learning Era”

Timely paper on the potential misuse of well-known cross-validation techniques in developing trading strategies.

In summary, Combinatorial Purged Cross-Validation (CPCV) appears to be the least flawed method among the analyzed CV approaches in the paper.

Here are the insights from the paper in under 2 minutes: 👇

➡ The study addresses a critical need in quantitative finance for more robust model evaluation and out-of-sample testing, especially cross-validation methods designed for financial markets.

➡ It introduces a detailed framework that deals with the unique features of financial data, such as non-stationarity, autocorrelation, and regime shifts, recommending an improved validation approach.

➡ One key finding is the effectiveness of the CPCV method in reducing the risk of overfitting, that is, fitting models too closely to past data. This method outperforms older methods like K-Fold and Walk-Forward by showing a lower probability of backtest overfitting and a better Deflated Sharpe Ratio test statistic.

➡ The paper advises caution when choosing between Purged K-Fold and K-Fold: the two perform similarly, and the choice can affect the robustness of the training data in out-of-sample tests.

➡ Using a Synthetic Controlled Environment with complex models to mimic market conditions, the study offers new perspectives on evaluating cross-validation techniques.

➡ It emphasizes the urgent need for specialized validation methods in finance, linking theory with practice, and highlights the importance of strong financial modeling in the face of changing market conditions and regulatory requirements.

➡ The paper points out the importance of, and need for, advanced cross-validation techniques like CPCV, which improve the trustworthiness and usefulness of financial models in decision-making.
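For readers who have not seen CPCV before, the Python sketch below shows how the splits are generated: the sample is cut into contiguous groups, every combination of test groups becomes one split, and training observations adjacent to a test group are dropped. It is a simplified illustration with assumptions of my own: a fixed symmetric gap stands in for proper label-overlap purging and embargo, and the function name `cpcv_splits` and its parameters are hypothetical.

```python
from itertools import combinations
import numpy as np

def cpcv_splits(n_samples, n_groups=6, n_test_groups=2, purge=0):
    """Yield (train_idx, test_idx) pairs for combinatorial purged CV (illustrative).

    The sample is split into `n_groups` contiguous groups; every combination
    of `n_test_groups` groups serves once as the test set. Training points
    within `purge` observations of a test group are removed (simplified
    stand-in for purging and embargo).
    """
    groups = np.array_split(np.arange(n_samples), n_groups)
    for test_ids in combinations(range(n_groups), n_test_groups):
        test_idx = np.concatenate([groups[g] for g in test_ids])
        train_mask = np.ones(n_samples, dtype=bool)
        for g in test_ids:
            start = max(groups[g][0] - purge, 0)
            stop = min(groups[g][-1] + purge + 1, n_samples)
            train_mask[start:stop] = False       # drop test group plus gap
        yield np.flatnonzero(train_mask), test_idx

splits = list(cpcv_splits(n_samples=60, n_groups=6, n_test_groups=2, purge=2))
print("number of splits:", len(splits))
train_idx, test_idx = splits[0]
print("first split: train size", len(train_idx), "test size", len(test_idx))
```

Because each group appears in the test set of several splits, the out-of-sample predictions can be recombined into multiple backtest paths instead of the single path that Walk-Forward produces, which is what allows a distribution of performance statistics rather than a single estimate.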

AI won’t take your job, but a person using AI might. That’s why 500,000+ professionals read The Rundown – the free newsletter that keeps you updated on the latest AI news and teaches you how to apply it in just 5 minutes a day.

If you're enjoying our newsletter and want to support us, please recommend it to anyone you know who's interested in AI and Finance. Your referrals are the biggest compliment and help us grow! 🌟🤖💼