Decoding Crude: Building an Oil Price Forecasting Model with Machine Learning

Written by:

Introduction

Let’s face it: oil prices are a rollercoaster. From geopolitical tensions to economic booms and busts, countless factors send those numbers soaring and plummeting. And trying to predict where they’ll go next? That’s a puzzle that has challenged experts for ages. But what if we could use the power of machine learning to get a clearer view of the road ahead? That’s precisely what we set out to do in our oil price forecasting project. We aimed to build a model that could analyze historical data and give us more accurate predictions, and in this blog post, I’ll walk you through the key steps.

Project Resources

For those wanting to dig deeper, here’s a concise overview of where to find the project’s resources:

Understanding the Fuel: The Dataset

To build a solid forecasting model, you need solid data. We used data from Yahoo Finance and FRED (Federal Reserve Economic Data), giving us a rich dataset with daily price information and other economic indicators.  

We analyzed a range of energy commodities:

  • Brent Crude: A major global benchmark.
  • WTI Crude: The US benchmark.
  • Natural Gas: Essential for heating and power.
  • Heating Oil: A refined distillate used mainly for heating.

Our dataset spanned roughly 2000 to 2023, capturing long-term price fluctuations. Key data included:

  • OHLCV Data: Open, High, Low, Close, Volume.
  • Technical Indicators: SMA (Simple Moving Average), RSI (Relative Strength Index).
  • Lagged Values: Past prices.
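To make these inputs concrete, here is a minimal sketch of the three feature types in plain Python. The function names (`sma`, `rsi`, `lag`) and the 14-period RSI default are illustrative assumptions, not the project's actual code, and the RSI here uses simple averages of gains and losses rather than Wilder's smoothing.

```python
from typing import List, Optional


def sma(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until a full window is available."""
    out: List[Optional[float]] = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window : i + 1]) / window)
    return out


def rsi(prices: List[float], window: int = 14) -> List[Optional[float]]:
    """RSI from simple averages of gains and losses over `window` price changes."""
    out: List[Optional[float]] = [None] * len(prices)
    # changes[k] is the move from prices[k] to prices[k + 1]
    changes = [prices[i] - prices[i - 1] for i in range(1, len(prices))]
    for i in range(window, len(prices)):
        recent = changes[i - window : i]  # the `window` changes ending at price i
        avg_gain = sum(max(c, 0.0) for c in recent) / window
        avg_loss = sum(max(-c, 0.0) for c in recent) / window
        if avg_loss == 0:
            out[i] = 100.0  # no losses in the window
        else:
            rs = avg_gain / avg_loss
            out[i] = 100.0 - 100.0 / (1.0 + rs)
    return out


def lag(prices: List[float], k: int) -> List[Optional[float]]:
    """Shift the series back by k steps (k >= 1) so row t sees the price at t - k."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return [None] * min(k, len(prices)) + list(prices[:-k])
```

Rows where any feature is still `None` (the warm-up period) would normally be dropped before training.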

Our Secret Sauce: The Methodology

Our methodology involved:

  1. Feature Engineering: Creating “hints” for the model (e.g., volatility measures, moving averages).  
  2. Machine Learning: Using the XGBoost regressor (a gradient-boosted decision tree algorithm).
  3. Evaluation: Rigorous testing to measure prediction accuracy.
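The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual pipeline: the helper names, the 80/20 chronological split, and the XGBoost hyperparameters are all hypothetical. The one point worth stressing is that time series data should be split by date order, never shuffled, so the model is always tested on data from after its training period.

```python
import math
from typing import List, Sequence, Tuple


def chronological_split(rows: Sequence, test_fraction: float = 0.2) -> Tuple[list, list]:
    """Split time-ordered rows into train and test sets without shuffling."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    n_test = max(1, round(len(rows) * test_fraction))
    cut = len(rows) - n_test
    return list(rows[:cut]), list(rows[cut:])


def mae(y_true: List[float], y_pred: List[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: List[float], y_pred: List[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_xgboost(X_train, y_train):
    """Fit an XGBoost regressor. Requires numpy and xgboost to be installed.

    The hyperparameters below are illustrative placeholders, not tuned values.
    """
    import numpy as np  # imported lazily so the helpers above work without it
    from xgboost import XGBRegressor

    model = XGBRegressor(n_estimators=300, learning_rate=0.05, max_depth=4)
    model.fit(np.asarray(X_train, dtype=float), np.asarray(y_train, dtype=float))
    return model
```

In use, you would split the feature rows and targets with `chronological_split`, call `fit_xgboost` on the training part, and score `model.predict(...)` on the held-out part with `mae` and `rmse`.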

A Closer Look at the Results

Here’s a glimpse into our commodity-specific findings:

  • Brent Crude Oil: Long-term price shift (~$20/barrel in 2000 to ~$40-60 post-2004), major event impact (2008 crisis: ~$140 to ~$40).
  • WTI Crude Oil: Similar trends to Brent, but with key differences in spread and volatility.  
  • Natural Gas: High volatility, structural shifts (e.g., “Lower Forever” regime post-2010), weather-driven spikes.  
  • Heating Oil: Strong seasonality (winter premiums), influenced by crude oil and refining costs.  

Model Diagnostics: Checking Under the Hood

We validated the model by analyzing residuals (errors) for patterns, heteroscedasticity (unequal variance), and autocorrelation.
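As a rough illustration of what such checks can look like, the sketch below computes three simple residual statistics in plain Python. These are stand-ins chosen for clarity, and the function names are hypothetical; the project may well have used different tests. The half-versus-half variance ratio in particular is only a crude proxy for a formal heteroscedasticity test.

```python
from typing import List


def autocorrelation(residuals: List[float], lag: int = 1) -> float:
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    num = sum((residuals[t] - mean) * (residuals[t - lag] - mean) for t in range(lag, n))
    den = sum((r - mean) ** 2 for r in residuals)
    return num / den


def durbin_watson(residuals: List[float]) -> float:
    """Durbin-Watson statistic: sum of squared successive differences over sum of squares."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(r ** 2 for r in residuals)
    return num / den


def variance_ratio(residuals: List[float]) -> float:
    """Variance of the later half of the residuals divided by that of the earlier half."""

    def variance(xs: List[float]) -> float:
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    mid = len(residuals) // 2
    first, second = variance(residuals[:mid]), variance(residuals[mid:])
    if first == 0:
        raise ValueError("earlier half has zero variance")
    return second / first
```

A Durbin-Watson value near 2 suggests little lag-1 autocorrelation, while values toward 0 or 4 point to positive or negative autocorrelation respectively. A variance ratio far from 1 hints that the error spread changes over time.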

The Big Picture: Project Recap

This project successfully developed an XGBoost model to forecast oil prices across multiple commodities, demonstrating strong short-term prediction and valuable insights into market dynamics.

Where to Go Next

Future work could involve building a real-time API, an interactive dashboard, and incorporating more data sources.

