Predicting interest rate swap spreads beyond the linear regression ECM and the yield curve, via machine learning

  • Luís Miguel Ribeiro Teixeira (Student)

Student thesis: Master's Thesis


This dissertation forecasts US 10-year interest rate swap spreads out of sample using the last 20 years of data, a period that includes the 2008 financial crisis, the puzzle of negative swap spreads, liquidity shortages, and the COVID-19 pandemic. It departs from the traditional theory-driven approach to swap spreads and takes a statistical perspective, in line with investment banking practice, that prioritizes model performance and forecasting accuracy. Building on Kobor et al. (2005) and Cortez (2003), it extends their linear regression Error Correction Model (ECM) to machine learning algorithms (Lasso, XGBoost, and Decision Tree Regressor), covers a longer time frame, and integrates new features to capture variations behind Treasury supply-related features. The main findings indicate that US dollar swap spreads are cointegrated with the supply of US Treasury bonds, supporting prior evidence, while short-term deviations from the trend are associated with the AA spread, the repo rate, and the TED spread, as well as with news, sentiment, and uncertainty features. A more surprising result is that one further variable appears to be cointegrated with US dollar swap spreads: Google Trends search interest for the term ‘Interest rate swap’. The machine learning models outperformed the linear regression ECM in predicting swap spreads, underscoring their potential in financial applications.
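
The two-step logic of a linear regression ECM, as summarized in the abstract, can be illustrated with a minimal sketch. This is not the thesis's specification or data: the series are simulated, and the names `supply`, `spread`, and `ols` are hypothetical and chosen only for illustration. Step 1 fits the long-run (cointegrating) relation in levels; step 2 regresses the change in the spread on the lagged step-1 residual and the change in supply.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Simulated data (assumption): supply is a random walk, and the spread
# equals 2.0 + 0.8 * supply plus a stationary AR(1) deviation.
random.seed(0)
n = 500
supply, dev = [0.0], [0.0]
for _ in range(n - 1):
    supply.append(supply[-1] + random.gauss(0, 1))
    dev.append(0.5 * dev[-1] + random.gauss(0, 0.5))
spread = [2.0 + 0.8 * s + d for s, d in zip(supply, dev)]

# Step 1: long-run relation in levels, spread_t = a + b * supply_t + u_t
a, b = ols([[1.0, s] for s in supply], spread)
resid = [y - a - b * s for y, s in zip(spread, supply)]

# Step 2: short-run dynamics,
# d_spread_t = c + gamma * u_{t-1} + delta * d_supply_t + e_t
rows = [[1.0, resid[t - 1], supply[t] - supply[t - 1]] for t in range(1, n)]
d_spread = [spread[t] - spread[t - 1] for t in range(1, n)]
c, gamma, delta = ols(rows, d_spread)

# One-step-ahead change in the spread, assuming no change in supply
forecast_change = c + gamma * resid[-1]
print(f"b={b:.3f} gamma={gamma:.3f} delta={delta:.3f}")
```

A negative `gamma` is the error-correction signature: when the spread sits above its long-run level, the model expects it to move back down. In a machine learning extension of the kind the abstract describes, the step-2 regression would be the part replaced by another estimator, although the exact setup used in the thesis is not stated here.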
Date of Award: 22 Jan 2024
Original language: English
Awarding Institution
  • Universidade Católica Portuguesa
Supervisor: Dan Tran


Keywords
  • Swap spreads
  • Lasso
  • Decision tree regressor
  • XGBoost
  • Machine learning


  • Master's in Finance (Mestrado em Finanças)
