Comparative Statistical Analysis of Stock Prices: Delta Airlines vs. American Airlines

Date: May 8, 2024
Tags: R, Statistics, Data Science, Data Analysis

Description

This project is a comparative statistical analysis of the stock price performance of Delta Airlines (DAL) and American Airlines (AAL) from January 2019 to February 2024. It explores trends, volatility, and the dependence between the two stocks to provide insight into the market behavior of both companies. Several statistical techniques were applied to the daily closing prices and log returns of these stocks.
Data Source: Yahoo Finance
Analysis Period: January 10, 2019 – February 7, 2024
Stocks Analyzed: Delta Airlines (DAL, listed on the NYSE) and American Airlines (AAL, listed on the NASDAQ)

Methodology

  • Data Collection and Preparation
    • Stock data was retrieved using the quantmod library in R.
    • The adjusted daily closing prices were extracted and converted to log returns, which gives an approximately stationary series for analysis.
  • Exploratory Data Analysis (EDA)
    • Time series plots of daily closing prices revealed non-stationarity, likely influenced by significant events such as the COVID-19 pandemic. Seasonal patterns and fluctuations in volatility were observed over time.
    • Log return series plots were more stationary and centered around zero, with fluctuations largely within the range of -0.1 to 0.1.
  • Density Estimation
    • Kernel Density Estimates (KDE), normal density estimates, and Student-t density estimates were computed for the log returns of both stocks.
    • The Student-t distribution better captured the heavy tails observed in the log-return data compared to the normal distribution.
  • Logistic Regression Analysis
    • A logistic regression model was used to predict the direction of daily stock returns (up or down) based on lagged returns and trading volume.
    • The model demonstrated limited predictive accuracy (approximately 49.61% for DAL and 54.83% for AAL), close to what random guessing would achieve, suggesting the need for further feature engineering or advanced modeling techniques.
  • Linear Discriminant Analysis (LDA)
    • LDA was applied as an alternative to logistic regression for directional prediction.
    • Performance was comparable, with a slight improvement in accuracy for Delta Airlines (50%, versus approximately 49.61% with logistic regression).
  • Correlation and Copula Analysis
    • Pearson and Kendall’s tau correlation coefficients were calculated, showing a strong positive correlation between the log returns of DAL and AAL (approximately 0.86).
    • Copula models, including t-Copula, Normal, Clayton, and Joe Copulas, were fitted to explore the dependence structure. The t-Copula provided the most nuanced representation of tail dependence.
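The log-return transformation in the data preparation step can be sketched as follows. The project itself used R with quantmod; this is a minimal illustration in Python using only the standard library, and the function name `log_returns` and the example prices are hypothetical, not taken from the project.

```python
import math


def log_returns(prices):
    """Return r_t = ln(P_t / P_{t-1}) for each pair of consecutive prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


# Hypothetical adjusted closing prices, for illustration only (not project data).
example_prices = [50.0, 51.0, 49.5, 49.5]
example_returns = log_returns(example_prices)
```

A series of n prices yields n - 1 log returns, and the returns sum to the log of the overall price ratio, which is one reason log returns are convenient for this kind of analysis.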

Results and Findings

  • Stock Behavior and Volatility:
    • Both stocks exhibited notable volatility during the COVID-19 pandemic. Delta Airlines showed a recovery trend post-2020, while American Airlines returned to a downward trajectory after an initial correction.
  • Distribution Characteristics:
    • Log returns demonstrated heavy-tailed behavior, with the Student-t distribution providing a better fit than the normal distribution.
  • Directional Predictive Models:
    • Logistic regression and LDA showed limited accuracy, likely due to the inherent noise and high randomness in daily stock movements.
  • Dependence Structure:
    • High correlation was observed between the stocks, with a detailed dependence captured through the t-Copula.
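To put the reported accuracies (roughly 50% to 55%) in context, a directional classifier can be compared with a naive baseline that always predicts the more frequent direction. The sketch below is in Python rather than the project's R, the helper names are hypothetical, and it does not reproduce the project's models or data.

```python
def directions(returns):
    """Label each return as 1 (up) if strictly positive, otherwise 0 (down or flat)."""
    return [1 if r > 0 else 0 for r in returns]


def accuracy(predicted, actual):
    """Fraction of predictions that match the actual labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def majority_baseline(actual):
    """Accuracy obtained by always predicting the most frequent class."""
    if not actual:
        raise ValueError("actual must be non-empty")
    ups = sum(1 for a in actual if a == 1)
    return max(ups, len(actual) - ups) / len(actual)
```

A model whose accuracy does not exceed `majority_baseline` on held-out data adds little beyond knowing which direction was more common.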

Visualization and Interpretation

  • Time series and log-return plots visualized stock performance and stability over time.
Interpretation
Neither daily closing price series looks stationary. The most noticeable movement is the sharp drop in both stock prices around the first quarter of 2020, which stands out from the rest of the series and most likely reflects the COVID-19 pandemic. The plots may also suggest seasonality, since some patterns recur from quarter to quarter across the years observed: daily closing prices for both stocks seem to be higher at the beginning and middle of the year and lower otherwise. In addition, DAL appears to trend upward before January 2020 and to have recovered steadily after the drop in the first quarter of 2020.
AAL, on the other hand, trends downward up to the first quarter of 2020 and begins its correction afterward. Although there was a large upward movement during the correction from the end of 2020 into 2021, the daily closing price of AAL seems to resume its downward movement and then remain within the price range of the initial correction. The volatility of both series seems to fluctuate over time.
The two log-return series look more stationary than the daily closing price series, because taking log differences removes the trend in the price level. When the DAL and AAL log returns are plotted together in a single plot, they show similar movements throughout the period. Both fluctuate around zero without a clear trend, and most values fall between -0.1 and 0.1, which indicates relatively low volatility outside of crisis periods. This suggests that daily returns revert toward a mean close to zero. The log returns show no drastic changes except in the first half of 2020. The large movements in that period are potential outliers that could represent significant events or anomalies in the price movements of both stocks; given the events of 2020, they are most likely the result of the COVID-19 pandemic.
[Figures: time series and log-return plots for DAL and AAL]
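One common way to make the observation that volatility fluctuates over time concrete is a rolling standard deviation of the log returns. The sketch below is in Python rather than the project's R; `rolling_volatility` and its `window` parameter are hypothetical names, and nothing here is taken from the project's code.

```python
import statistics


def rolling_volatility(returns, window):
    """Sample standard deviation of returns over each trailing window of fixed size."""
    if window < 2 or window > len(returns):
        raise ValueError("window must be between 2 and len(returns)")
    return [
        statistics.stdev(returns[end - window:end])
        for end in range(window, len(returns) + 1)
    ]
```

With daily data, a window of about 21 trading days gives a roughly monthly volatility estimate; a spike in this series around early 2020 would be consistent with the pattern described above.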
  • Density plots highlighted the heavy-tailed nature of stock returns.
Interpretation
Based on the shape of the kernel density estimates (KDEs), the Student-t distribution seems to fit both the DAL logdiff data and the AAL logdiff data better than the normal distribution. DAL’s KDE has a left tail extending to approximately -0.3 and a right tail stretching to around 0.2, while AAL’s KDE stretches from -0.3 to a little over 0.3. There is also a small bulge on the right side of the DAL KDE before it flattens. These observations suggest that the data contain outliers or higher variability, which call for a distribution that can accommodate heavier tails. Both KDEs are tall and narrow, with pronounced peaks, which indicates that most observations are concentrated in a narrow range around a central value. The normal distribution has thin tails and cannot fully capture the extended tail behavior observed in the data, whereas the Student-t distribution can model heavier tails and therefore aligns more closely with the empirical distribution represented by the KDEs.
[Figures: density plots of the DAL and AAL log returns]
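The difference in tail behavior between the two parametric families can be illustrated by evaluating their densities directly. This is a minimal sketch in Python (the project used R); the function names and the parameter values in the example are hypothetical and are not fitted values from the project.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def student_t_pdf(x, nu, mu=0.0, scale=1.0):
    """Density of a location-scale Student-t with nu degrees of freedom."""
    z = (x - mu) / scale
    # Work on the log scale so large nu does not overflow the gamma function.
    log_norm = (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(nu * math.pi)
        - math.log(scale)
    )
    return math.exp(log_norm - 0.5 * (nu + 1.0) * math.log1p(z * z / nu))


# Illustrative comparison five scale units from the center, with nu = 3 chosen
# arbitrarily. Both densities share location 0 and scale 1; variances are not matched.
tail_ratio = student_t_pdf(5.0, nu=3.0) / normal_pdf(5.0)
```

The ratio is far above 1, which is the sense in which the Student-t assigns more probability to extreme returns; as `nu` grows, the Student-t density approaches the normal density.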
  • Scatterplots of log returns illustrated the linear relationship between the two stocks.
Interpretation
Most data points cluster in the region where both DALdifflog and AALdifflog fall between -0.1 and 0.1. The remaining points lie around this cluster, and only a few scatter far from it, such as the point at the bottom left of the plot. These distant points could be outliers, since they do not follow the general pattern of the scatterplot. Based on the shape and direction formed by the data points, DAL and AAL seem to have a positive linear relationship: beyond the central cluster, the points form a diagonal band that rises from the bottom left to the top right of the plot.
[Figure: scatterplot of DAL versus AAL log returns]
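The two dependence measures named in the methodology, Pearson's correlation and Kendall's tau, can be sketched as below. This is a Python illustration rather than the project's R code; the function names are hypothetical, and the Kendall version is the simple tau-a variant, which does not correct for ties.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient: linear association between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs, no tie correction."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

Pearson's coefficient measures the linear relationship visible in the scatterplot, while Kendall's tau depends only on ranks and is therefore less sensitive to the few extreme points far from the main cluster. The pairwise loop is quadratic in the sample size, which is acceptable for a few thousand daily observations.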
  • Copula contour plots provided insights into tail dependencies and overall correlation structure.
Interpretation
Since the t-copula is more flexible in capturing heavy-tailed dependence, its diagram displays density estimates with fatter tails and a more nuanced dependence structure. The Normal copula describes dependence through a single correlation parameter and has no tail dependence; its diagram displays symmetric, elliptical contours. The Clayton copula emphasizes lower tail dependence and therefore exhibits steeper contours in the lower tail region of the density estimate. The Joe copula, on the other hand, emphasizes upper tail dependence and exhibits steeper contours in the upper tail region. The KDE captures non-parametric features of the data and highlights deviations from the parametric assumptions of the copula models. Differences within each pair of diagrams appear in regions where the copula assumptions are violated or where non-linear dependence exists. For instance, the KDE shows higher density than the parametric estimates in regions where the data exhibit heavy tails and non-normality. Similarities appear in regions where the copula assumptions hold, which provides a point of reference for evaluating the adequacy of the chosen copula models.
[Figures: copula contour plots]
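The tail behavior that distinguishes these copula families can be made concrete with a small sketch. It is written in Python rather than the project's R and all names are hypothetical. It covers the rank transform to pseudo-observations that copula fitting typically starts from, the closed-form tail dependence coefficients of the Clayton and Joe families, and a simple empirical check of lower-tail concentration. It does not fit any copula and uses no project data.

```python
def pseudo_observations(values):
    """Rank-transform a sample to (0, 1) via rank / (n + 1). Assumes no ties."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u


def clayton_lower_tail_dependence(theta):
    """Lower tail dependence of a Clayton copula with theta > 0: 2^(-1/theta)."""
    return 2.0 ** (-1.0 / theta)


def joe_upper_tail_dependence(theta):
    """Upper tail dependence of a Joe copula with theta >= 1: 2 - 2^(1/theta)."""
    return 2.0 - 2.0 ** (1.0 / theta)


def empirical_lower_tail(u, v, q):
    """Estimate P(V <= q | U <= q) from paired pseudo-observations."""
    below = [i for i in range(len(u)) if u[i] <= q]
    if not below:
        raise ValueError("no observations with u <= q")
    return sum(1 for i in below if v[i] <= q) / len(below)
```

Evaluating `empirical_lower_tail` at small `q` (and the mirror-image quantity in the upper tail) gives a rough, model-free view of whether joint crashes or joint rallies are more common, which is the asymmetry the Clayton and Joe contours emphasize.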

Conclusion

This analysis examined the statistical properties of Delta Airlines and American Airlines stocks over a period of roughly five years. While basic predictive models showed limited success, the heavy-tailed nature of the returns and the strong correlation between the stocks were effectively captured by the Student-t distribution and copula modeling, respectively. Future analyses could incorporate machine learning approaches for improved predictive accuracy.
This project exemplifies the application of statistical tools in financial data analysis and highlights the challenges of modeling stock price movements.