Leveraging Python and the Yahoo Finance API to automate historical data retrieval for comparative equity analysis and statistical hypothesis testing.
Python (Pandas, NumPy, SciPy)
yfinance API
Google Colab / Jupyter
Data Cleaning, Statistical Modeling, API Integration
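import yfinance as yf
import pandas as pd

tickers = ["DE", "CAT"]

# auto_adjust=True makes 'Close' the split- and dividend-adjusted price.
# Note: yfinance treats `end` as exclusive, so 2025-12-31 itself is not included.
data = yf.download(tickers, start="2023-01-01", end="2025-12-31", auto_adjust=True)

# Extract adjusted close prices, keeping only dates where every ticker has a price
close_data = data["Close"].dropna()

# Compute daily simple returns; the first row has no prior price and is dropped
daily_returns = close_data.pct_change().dropna()
daily_returns.to_csv("daily_returns.csv")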
Manually exporting CSVs from financial portals is inefficient for large-scale portfolio management. This script creates a repeatable pipeline that can be scaled from two tickers to an entire index (e.g., S&P 500) with a single variable change.
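As a rough illustration of that scaling claim, the pipeline can be wrapped in functions that take the ticker list as a parameter. This is a minimal sketch, not the original implementation: the function names `returns_from_prices` and `build_returns` are hypothetical, and `auto_adjust=True` is an assumption about which price series is wanted. The demonstration uses made-up prices so it runs without a network call.

```python
import pandas as pd


def returns_from_prices(prices: pd.DataFrame) -> pd.DataFrame:
    """Daily simple returns from a date-indexed frame of prices (one column per ticker)."""
    # Keep only dates on which every ticker has a price, then take percentage changes.
    return prices.dropna().pct_change().dropna()


def build_returns(tickers, start, end):
    """Hypothetical wrapper: download adjusted closes and convert them to returns."""
    import yfinance as yf  # imported here so the pure-pandas helper works without it

    data = yf.download(tickers, start=start, end=end, auto_adjust=True)
    return returns_from_prices(data["Close"])


# Offline demonstration with made-up prices (illustrative values, not market data).
demo_prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 99.0], "BBB": [50.0, 50.0, 55.0]},
    index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"]),
)
demo_returns = returns_from_prices(demo_prices)
print(demo_returns)
```

Under these assumptions, moving from two tickers to a larger universe would only mean passing a longer list to `build_returns`; the return calculation itself does not change.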
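Using the daily_returns dataset, we can run a paired t-test to determine whether the mean daily returns of two stocks (e.g., DE vs. CAT) differ significantly. The test is paired because both return series are observed on the same trading days.
The resulting p-value tells us whether the gap in average daily return is distinguishable from noise at the 5% significance level (95% confidence level): either one stock significantly outperforms or underperforms the other, or the two are statistically indistinguishable, as would be expected if both are driven largely by shared market exposure.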
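The test described above could be run with SciPy's `ttest_rel`, which performs a paired t-test on two equal-length samples. The sketch below is illustrative only: it uses synthetic returns generated with NumPy rather than the actual DE and CAT data, and the `alpha` value of 0.05 is the assumed significance level.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for daily_returns (random numbers, not market data).
rng = np.random.default_rng(0)
common = rng.normal(0.0005, 0.01, size=500)  # shared "market" component
daily_returns = pd.DataFrame(
    {
        "DE": common + rng.normal(0.0, 0.005, size=500),
        "CAT": common + rng.normal(0.0, 0.005, size=500),
    }
)

# Paired t-test: the null hypothesis is that the mean of (DE - CAT) is zero.
result = stats.ttest_rel(daily_returns["DE"], daily_returns["CAT"])
t_stat, p_value = result.statistic, result.pvalue

alpha = 0.05  # assumed significance level
reject_null = p_value < alpha
print(f"t = {t_stat:.3f}, p = {p_value:.3f}, reject H0: {reject_null}")
```

To apply this to real data, the synthetic frame would be replaced by the saved output, for example `pd.read_csv("daily_returns.csv", index_col=0)`.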
Beyond simple price action, the yfinance library allows for deep-dive fundamental analysis:
The calculated t-statistic of 1.83 falls below the two-tailed critical value of 1.96 at the 5% significance level (95% confidence). Consequently, we fail to reject the null hypothesis that the mean daily returns are equal.
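For reference, the paired t-statistic is computed from the daily differences between the two return series. The notation below ($d_i$, $\bar{d}$, $s_d$, $n$) is introduced here for illustration, and the direction of the difference (DE minus CAT) is an assumption; it does not affect a two-tailed test.

```latex
d_i = r_{\mathrm{DE},i} - r_{\mathrm{CAT},i}, \qquad
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad
s_d = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(d_i - \bar{d}\right)^2}, \qquad
t = \frac{\bar{d}}{s_d/\sqrt{n}}
```

With several hundred daily observations, the t distribution with $n-1$ degrees of freedom is close to the standard normal, which is why 1.96 serves as the two-tailed 5% critical value. Since $|t| = 1.83 < 1.96$, the null hypothesis that the true mean difference is zero is not rejected.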
This indicates that the difference in mean daily returns between the two assets is not statistically significant. From a quantitative perspective, the data give no evidence that either stock's average return differs from the other's over the observed period, which is consistent with the two "tracking" each other, likely due to shared sector exposure and high correlation.
The finding that these returns do not significantly differ supports considering a Relative Value / Pairs Trading strategy. If the returns are statistically tied, a short-term divergence in price is more likely to be a temporary anomaly than a fundamental shift.
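One way to make "short-term divergence" concrete is to track the spread between the two stocks' cumulative log returns and flag days on which it sits unusually far from its recent average. The sketch below is only an illustration under stated assumptions: the data are synthetic, and the 60-day `window` and the `threshold` of 2.0 standard deviations are hypothetical choices, not values from the original analysis.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns with a shared component (random numbers, not market data).
rng = np.random.default_rng(1)
n = 300
common = rng.normal(0.0, 0.01, size=n)
returns = pd.DataFrame(
    {
        "DE": common + rng.normal(0.0, 0.003, size=n),
        "CAT": common + rng.normal(0.0, 0.003, size=n),
    }
)

# Spread between the cumulative log returns of the two series.
log_growth = np.log1p(returns).cumsum()
spread = log_growth["DE"] - log_growth["CAT"]

# Rolling z-score of the spread; the first (window - 1) values are NaN.
window = 60  # hypothetical look-back length in trading days
rolling = spread.rolling(window)
z = (spread - rolling.mean()) / rolling.std()

# Flag days where the spread is more than `threshold` standard deviations from its mean.
threshold = 2.0  # hypothetical divergence threshold
divergence = z.abs() > threshold
print(int(divergence.sum()), "days flagged out of", int(z.notna().sum()))
```

A flagged day only marks a statistical outlier in the spread; whether it is tradable depends on factors this sketch does not model, such as transaction costs and whether the spread actually reverts.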
Execution Logic: