How to Build a Multi-Factor Stock Screen in Python (Value + Momentum + Quality)

What’s the question?

Can combining multiple investment factors into a single composite score produce better stock selection than relying on any single factor alone? Single-factor screens — ranking stocks by price-to-earnings ratio, or by trailing returns, or by profitability — each capture one dimension of a stock’s attractiveness. But a stock that is cheap may be cheap for a reason (declining business), and a stock with strong momentum may be overvalued. Multi-factor models, widely used by quantitative hedge funds since the early 1990s, attempt to identify companies that score well across multiple independent dimensions simultaneously.

The approach

We construct a three-factor composite score using value, momentum, and quality:

  1. Value is measured by earnings yield (the inverse of the price-to-earnings ratio), where a higher yield indicates a cheaper stock relative to its earnings.
  2. Momentum is the compounded 6-month total return, capturing the persistence of recent price trends.
  3. Quality is measured by return on equity (ROE), defined as net income divided by shareholders’ equity, which quantifies how effectively a company converts equity capital into profit.

Each factor is z-score normalized across the universe of 15 stocks (subtract the mean, divide by the standard deviation) so that all three factors are on the same scale regardless of their native units. The composite score is the weighted average of the three z-scores, with approximately equal weighting: 33% value, 34% momentum, 33% quality. Stocks are ranked by the composite score from highest (most attractive across all three dimensions) to lowest.

import xfinlink as xfl
import pandas as pd

xfl.set_api_key("your_key")  # free at https://xfinlink.com/signup

tickers = [
    "AAPL", "MSFT", "NVDA", "AMZN", "META", "GOOGL",
    "JPM", "JNJ", "XOM", "PG", "HD", "COST", "UNH", "LLY", "ABBV",
]

# Value and quality: keep the most recent annual observation for each ticker.
metrics = xfl.metrics(tickers, period_type="annual",
                      fields=["earnings_yield", "roe"], period="3y")
val = (
    metrics.sort_values("period_end")
    .groupby("ticker")
    .tail(1)[["ticker", "earnings_yield", "roe"]]
    .set_index("ticker")
)

# Momentum: compound the daily returns over the last six months.
prices = xfl.prices(tickers, period="6mo", fields=["return_daily"])
mom = prices.sort_values("date").groupby("ticker")["return_daily"].apply(
    lambda x: (1 + x).prod() - 1
).rename("momentum_6mo")

# Drop tickers missing any factor, then z-score each factor across the universe.
combined = val.join(mom, how="inner").dropna()
for col in ["earnings_yield", "momentum_6mo", "roe"]:
    combined[f"{col}_z"] = (combined[col] - combined[col].mean()) / combined[col].std()

# Composite: weighted average of the z-scores (33% value, 34% momentum, 33% quality).
combined["composite_score"] = (
    combined["earnings_yield_z"] * 0.33 + combined["momentum_6mo_z"] * 0.34 + combined["roe_z"] * 0.33
)
combined = combined.sort_values("composite_score", ascending=False)

print("=== 3-Factor Composite: Value + Momentum + Quality ===")
for ticker, r in combined.iterrows():
    print(f"  {ticker:5s}  yield={r['earnings_yield']:.3f}  mom6m={r['momentum_6mo']:>+6.1%}  roe={r['roe']:.3f}  score={r['composite_score']:>+5.2f}")
print(f"\nTop: {combined.index[0]} ({combined.iloc[0]['composite_score']:+.2f})")
print(f"Bottom: {combined.index[-1]} ({combined.iloc[-1]['composite_score']:+.2f})")

Output:

=== 3-Factor Composite: Value + Momentum + Quality ===

  XOM    yield=0.046  mom6m=+26.6%  roe=0.111  score=+0.73
  JNJ    yield=0.050  mom6m=+17.5%  roe=0.329  score=+0.69
  AAPL   yield=0.026  mom6m= +8.6%  roe=1.519  score=+0.58
  JPM    yield=0.071  mom6m= -5.3%  roe=0.157  score=+0.42
  UNH    yield=0.035  mom6m=+19.5%  roe=0.120  score=+0.29
  NVDA   yield=0.023  mom6m=+10.2%  roe=0.763  score=+0.14
  HD     yield=0.046  mom6m=-15.9%  roe=1.105  score=+0.12
  PG     yield=0.048  mom6m= -1.5%  roe=0.306  score=+0.11
  META   yield=0.046  mom6m= -5.2%  roe=0.278  score=-0.03
  LLY    yield=0.023  mom6m= +0.0%  roe=0.778  score=-0.12
  AMZN   yield=0.027  mom6m= +8.3%  roe=0.189  score=-0.13
  GOOG   yield=0.028  mom6m= +0.5%  roe=0.318  score=-0.24
  COST   yield=0.018  mom6m= +9.2%  roe=0.278  score=-0.24
  MSFT   yield=0.033  mom6m=-18.4%  roe=0.296  score=-0.65
  ABBV   yield=0.012  mom6m= -7.3%  roe=-1.292  score=-1.67

Top: XOM (+0.73)
Bottom: ABBV (-1.67)

What this tells us

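XOM ranks first mainly on momentum and value: it has the strongest 6-month momentum in the group (+26.6%) and an above-average earnings yield (0.046), which together outweigh a positive but below-average ROE (0.111). JNJ, in second place, has the more balanced profile, with above-average yield and momentum and an ROE close to the group mean. The core principle of multi-factor investing is to identify stocks that are consistently above average rather than extreme on one dimension, but XOM shows that a single large z-score can still carry a ranking.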
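To see which factor drives a given ranking, the composite can be split into per-factor contributions. The sketch below re-enters the rounded values from the output above into a DataFrame and applies the same z-score and weights as the main script. Because the inputs are rounded, the scores can differ from the printed ones in the second decimal. The names `rows`, `weights`, and `contrib` are illustrative and not part of the original script.

```python
import pandas as pd

# Rounded factor values copied from the output above (momentum as a decimal).
rows = {
    "XOM":  (0.046,  0.266,  0.111),
    "JNJ":  (0.050,  0.175,  0.329),
    "AAPL": (0.026,  0.086,  1.519),
    "JPM":  (0.071, -0.053,  0.157),
    "UNH":  (0.035,  0.195,  0.120),
    "NVDA": (0.023,  0.102,  0.763),
    "HD":   (0.046, -0.159,  1.105),
    "PG":   (0.048, -0.015,  0.306),
    "META": (0.046, -0.052,  0.278),
    "LLY":  (0.023,  0.000,  0.778),
    "AMZN": (0.027,  0.083,  0.189),
    "GOOG": (0.028,  0.005,  0.318),
    "COST": (0.018,  0.092,  0.278),
    "MSFT": (0.033, -0.184,  0.296),
    "ABBV": (0.012, -0.073, -1.292),
}
factors = pd.DataFrame.from_dict(
    rows, orient="index", columns=["earnings_yield", "momentum_6mo", "roe"]
)
weights = pd.Series({"earnings_yield": 0.33, "momentum_6mo": 0.34, "roe": 0.33})

# Same normalization as the main script: sample standard deviation (pandas default).
z = (factors - factors.mean()) / factors.std()

# Weighted contribution of each factor; the row sum is the composite score.
contrib = z * weights
contrib["composite_score"] = contrib.sum(axis=1)

print(contrib.loc[["XOM", "ABBV"]].round(2))
```

With these rounded inputs, XOM's momentum contribution is the largest of its three terms and its ROE contribution is negative, while ABBV's ROE term is the largest single drag on its score.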

Two important caveats emerge from the data. First, AAPL’s ROE of 1.519 (152%) and HD’s ROE of 1.105 (111%) are mathematically correct but economically misleading. Both companies have aggressively repurchased shares, shrinking their book equity relative to earnings. When net income of $112B (AAPL) is divided by book equity of approximately $74B, the resulting ROE is inflated by capital structure decisions rather than operating performance. A more robust quality metric for companies with buyback-depleted equity would be return on invested capital (ROIC), which uses total capital (debt plus equity) as the denominator.
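The effect of buybacks on the two ratios can be illustrated with a toy example. The figures below are hypothetical, not taken from any company's filings, and `roic` is a simplified version that divides after-tax operating profit (NOPAT) by debt plus equity. The sketch compares the same business before and after a debt-funded buyback.

```python
def roe(net_income, equity):
    """Return on equity: net income divided by book equity."""
    return net_income / equity


def roic(nopat, debt, equity):
    """Simplified ROIC: after-tax operating profit over debt plus equity."""
    return nopat / (debt + equity)


# Hypothetical company, figures in $B. The buyback swaps 700 of equity for debt;
# operating profit is unchanged, net income falls slightly on higher interest.
before = dict(net_income=95.0, nopat=100.0, debt=200.0, equity=800.0)
after = dict(net_income=80.0, nopat=100.0, debt=900.0, equity=100.0)

for label, s in (("before", before), ("after", after)):
    print(f"{label:6s}  ROE={roe(s['net_income'], s['equity']):.3f}  "
          f"ROIC={roic(s['nopat'], s['debt'], s['equity']):.3f}")
```

In this example ROE rises from about 0.12 to 0.80 even though the operating business has not changed, while the simplified ROIC stays at 0.10 because total capital is the same.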

Second, ABBV’s last-place composite score (-1.67) is driven mainly by its ROE of -1.292, which accounts for roughly half of the score; the lowest earnings yield in the group (0.012) and negative momentum (-7.3%) account for the rest. The negative ROE results from negative shareholders’ equity following the Allergan acquisition: dividing positive net income by a negative denominator flips the sign. This is an artifact of acquisition accounting, not an indication of operating quality. MSFT scores poorly because its 6-month momentum is the weakest in the group at -18.4%, dragging its composite score down despite value and quality metrics close to the group average.
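One way to keep a negative-equity case like this from distorting the quality factor is to treat ROE as undefined when book equity is not positive. The sketch below is a minimal illustration with hypothetical tickers and figures; `guarded_roe` and its `min_equity` parameter are assumptions introduced here, not part of xfinlink.

```python
import pandas as pd


def guarded_roe(net_income, equity, min_equity=0.0):
    """ROE where book equity exceeds min_equity; NaN otherwise."""
    return (net_income / equity).where(equity > min_equity)


# Hypothetical figures in $B, for illustration only.
fundamentals = pd.DataFrame(
    {"net_income": [10.0, 4.0, 9.0], "equity": [50.0, -3.0, 30.0]},
    index=["AAA", "BBB", "CCC"],
)
fundamentals["roe"] = guarded_roe(fundamentals["net_income"], fundamentals["equity"])

# pandas skips NaN in mean() and std(), so BBB does not affect the other z-scores.
roe = fundamentals["roe"]
fundamentals["roe_z"] = (roe - roe.mean()) / roe.std()

print(fundamentals)
```

If this were dropped into the main script, the existing `dropna()` call would remove such tickers from the ranking entirely. Whether to exclude them or score them on the remaining factors is a design choice.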

So what?

Multi-factor screens are a starting point for analysis, not a finished investment process. The model presented here uses equal weighting and a small universe — both limitations that a production implementation would address. More critically, the ROE-based quality factor breaks down for companies with negative or near-zero book equity, which is increasingly common among large-cap companies with aggressive buyback programs. Practitioners should consider alternative quality metrics (ROIC, operating margin stability, accrual ratios) and test whether the composite score has predictive power out of sample before allocating capital based on the rankings.
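One common way to check predictive power is the rank information coefficient: the Spearman correlation between the scores on a given date and the returns that follow. The sketch below computes it as the Pearson correlation of ranks, which needs only pandas. The tickers, scores, and forward returns are hypothetical, and `rank_ic` is an illustrative helper, not a library function.

```python
import pandas as pd


def rank_ic(scores, forward_returns):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    aligned = pd.concat([scores, forward_returns], axis=1, join="inner").dropna()
    return aligned.iloc[:, 0].rank().corr(aligned.iloc[:, 1].rank())


# Hypothetical composite scores and subsequent returns, for illustration only.
scores = pd.Series({"AAA": 0.8, "BBB": 0.3, "CCC": -0.1, "DDD": -0.9})
forward_returns = pd.Series({"AAA": 0.04, "BBB": 0.06, "CCC": -0.01, "DDD": -0.03})

print(f"rank IC: {rank_ic(scores, forward_returns):+.2f}")
```

A single cross-section of 15 stocks says very little on its own. A meaningful test would repeat this over many rebalance dates, using only data that was available on each date.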

Built with xfinlink — free financial data API for Python. pip install xfinlink