Conformal prediction is a machine learning framework that provides statistically rigorous uncertainty quantification for predictions, making it particularly valuable in high-stakes business decision-making. Unlike traditional point-prediction models, conformal prediction produces prediction intervals that convey the level of uncertainty, enabling informed decision-making. It is a user-friendly, model-agnostic approach that offers explicit, non-asymptotic guarantees without relying on distributional or model assumptions.

Conformal prediction offers several benefits in machine learning:

- **Uncertainty Quantification**: It provides a principled approach to quantifying the uncertainty in machine learning predictions, allowing users to assess the confidence level of the predictions.
- **Distribution-Free**: Unlike Bayesian methods, conformal prediction is distribution-free, making it more robust to model misspecification.
- **Wide Range of Applications**: Conformal prediction is widely used in regression, classification, anomaly detection, and time series forecasting, making it a versatile tool across domains.
- **Model-Agnostic Approach**: It can be applied to any pre-trained model, such as random forests, CatBoost, XGBoost, or convolutional neural networks, without relying on the underlying model's distributional assumptions.
- **Statistically Valid Prediction Regions**: Conformal prediction produces prediction regions that contain the true target value with a specified probability, offering a more comprehensive view of prediction uncertainty than traditional point predictions.
- **User-Defined Significance Level**: Users can set the desired significance level, which controls the size of the prediction region, allowing them to tailor the level of confidence in the predictions to specific requirements.
- **Reliable Confidence Measures**: It provides a reliable measure of confidence in predictions, which is particularly valuable in risk-sensitive applications such as medical diagnosis, face recognition, demand forecasting, and financial risk prediction.

Conformal prediction intervals help you gauge how accurate a prediction is and how confident you can be that the interval contains the actual value.

In their tutorial on conformal prediction, Glenn Shafer and Vladimir Vovk describe the idea as follows:

“One of the disadvantages of machine learning as a discipline is the lack of a reasonable confidence measure for any given prediction. Conformal prediction utilizes past experience to establish accurate, reliable prediction intervals for new predictions. It produces prediction regions that contain the true value with a given error probability ε.”

In this Epi-log post, we show how conformal prediction works through a regression application.

For the regression setting, the (split) conformal regression procedure is as follows:

1. Split the data into a training set and a calibration set.

2. Train the model on the training data.

3. Compute non-conformity scores on the calibration set (e.g., MAE: the absolute error between predicted and actual values) and sort the scores in ascending order (from certain to uncertain).

4. Compute the threshold q, the quantile below which (1 − ε) · 100% of the non-conformity scores fall. The threshold is chosen so that the resulting intervals cover (1 − ε) · 100% of the actual values.

5. Create prediction intervals for new data as the point prediction ± q.
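The steps above can be sketched in a few lines of Python. This is a minimal illustration using scikit-learn and synthetic data; the model, data, and variable names are stand-ins for illustration only, not the setup used in the application below:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic regression data (illustrative stand-in for real data)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(1000, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.3, size=1000)

# 1. Split the data into training and calibration sets
X_train, X_cal, y_train, y_cal = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# 2. Train any point-prediction model on the training data
model = GradientBoostingRegressor().fit(X_train, y_train)

# 3. Non-conformity scores: absolute errors on the calibration set
scores = np.abs(y_cal - model.predict(X_cal))

# 4. Threshold q: the (1 - eps) empirical quantile of the scores,
#    with the standard finite-sample correction (n + 1 in the numerator)
eps = 0.2  # target 80% coverage
n = len(scores)
q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - eps)) / n))

# 5. Prediction intervals for new data: point prediction +/- q
X_new = rng.uniform(-3, 3, size=(200, 1))
pred = model.predict(X_new)
lower, upper = pred - q, pred + q
```

Note that any pre-trained regressor could replace the gradient boosting model here; the conformal step only needs its point predictions on the calibration set.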

Figure 1: Point predictions are turned into prediction intervals employing conformal prediction (From Christoph Molnar: Introduction To Conformal Prediction With Python)

To illustrate, we present a forecasting application, since time series forecasting can be treated as regression with temporal dependencies. However, one of conformal prediction's requirements is the exchangeability assumption, which does not hold in time series forecasting because data points have sequential dependencies. Therefore, the exchangeability assumption is relaxed for forecasting applications.

We applied an ML model, LightGBM, to historical electricity demand data in Turkey from 2023 and forecast future demand for the next 30 days. We calculated non-conformity scores for each forecasting horizon using a backtesting procedure with 250 windows, corresponding to a calibration set with 250 data points for each horizon. Our forecast for one day ahead indicates a demand of 864,549 MWh, with the model's one-step-ahead forecast uncertainty, represented by conformal prediction intervals, suggesting that the true electricity consumption will fall within the range of 838,402 to 890,695 MWh with an 80% coverage probability. In other words, 80% of the time, the conformal prediction interval encompasses the actual value of electricity demand on the forecasted day. The line graph in Figure 2 illustrates the effectiveness of conformal prediction intervals in capturing the majority of daily true demand values for the forecasted time horizon.
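The per-horizon calibration described above can be sketched as follows. This is a hedged illustration, not our production pipeline: the backtest errors are simulated stand-ins, and the point forecast and error scale are placeholder values chosen only to mirror the shapes in the text (250 backtest windows, 30 horizons, 80% coverage):

```python
import numpy as np

# Stand-in for |actual - forecast| collected per horizon during a
# backtest with 250 windows (in practice these come from the backtest,
# not from a random generator)
rng = np.random.default_rng(42)
n_windows, horizons = 250, 30
backtest_errors = np.abs(rng.normal(0, 15_000, size=(n_windows, horizons)))

eps = 0.2  # target 80% coverage
# Finite-sample corrected quantile level of each horizon's score sample
level = min(1.0, np.ceil((n_windows + 1) * (1 - eps)) / n_windows)
q = np.quantile(backtest_errors, level, axis=0)  # one threshold per horizon

# Placeholder point forecasts (MWh); intervals are forecast +/- q per horizon
point_forecast = np.full(horizons, 864_549.0)
lower, upper = point_forecast - q, point_forecast + q
```

Because each horizon gets its own threshold, the intervals naturally widen for horizons where the backtest errors are larger.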

Figure 2: Prediction intervals for 30-day-ahead electricity consumption forecasts.

This method provides decision-makers with an 80% probability assurance that the actual electricity consumption will lie within these bounds, thus alleviating concerns regarding the opacity of machine learning models.

The primary objective is to establish the narrowest interval possible while achieving the desired coverage. In this example, the prediction interval coverage probability is 90%, exceeding the targeted coverage ratio of 80%. To elaborate, the conformal prediction intervals with an 80% coverage ratio for the 30-day forecasts encapsulate the true value for 27 out of the 30 days.

In conclusion, Epifai utilizes conformal prediction intervals to bolster decision-making by providing statistically rigorous uncertainty quantification in predictions. In this blog, we explored the key benefits of conformal prediction, including its distribution-free nature, model-agnostic approach, and user-defined significance levels. We demonstrated its application in a time series forecasting context, where the exchangeability assumption is relaxed to accommodate sequential dependencies.

We extend our sincere gratitude to Valeriy Manokhin for generously promoting Conformal Prediction on LinkedIn and leading us to explore its benefits. This exploration has allowed us to incorporate Conformal Prediction into our machine learning solutions for our partners.

*Utilized References and Recommended Resources for Conformal Prediction:*

Introduction To Conformal Prediction with Python (Molnar, 2023)

Practical Guide to Applied Conformal Prediction in Python: Learn and apply the best uncertainty frameworks to your industry applications (Manokhin, 2023)

A Tutorial on Conformal Prediction (Shafer and Vovk, 2008)

__Uncertainty Quantification-1__ by Mahdi Torabi Rad (2023)

__Uncertainty Quantification-2__ by Mahdi Torabi Rad (2023)

__Uncertainty Quantification-3__ by Mahdi Torabi Rad (2023)

__Uncertainty Quantification-4A__ by Mahdi Torabi Rad (2023)

__Uncertainty Quantification-4B__ by Mahdi Torabi Rad (2023)
