Basic univariate forecasting
This example shows how to use Prophetverse to perform univariate forecasting with a time series dataset, using sktime-style interface
Because of this compatibility, you can benefit from all the features of sktime, such as hierarchical reconciliation, ensemble models, pipelines, etc. There are two main methods to use Prophetverse with sktime:
-
fit(y, X=None)
: This method is used to fit the model. It takes as input a time seriesy
and an optional exogenous variableX
. They
time series must be apd.Series
or apd.DataFrame
. TheX
variable must be apd.DataFrame
orNone
. -
predict(fh, X=None)
: This method is used to make predictions. It takes as input a forecast horizonfh
and an optional exogenous variableX
. Thefh
forecast horizon can be a relative or an absolute forecast horizon. TheX
variable must be apd.DataFrame
orNone
, according to theX
variable used in thefit
method.
Later in this example, we will also show additional methods to make predictions, such as
predict_quantiles
and predict_components
.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from numpyro import distributions as dist
Import dataset
We import a dataset from Prophet's original repository. We then put it into sktime-friendly format, where the index is a pd.PeriodIndex
and the colums are the time series.
from prophetverse.datasets.loaders import load_peyton_manning
y = load_peyton_manning()
display(y.head())
Output: [3]
y | |
---|---|
ds | |
2007-12-10 | 9.590761 |
2007-12-11 | 8.519590 |
2007-12-12 | 8.183677 |
2007-12-13 | 8.072467 |
2007-12-14 | 7.893572 |
The full dataset looks like this:
Output: [4]
Fit model
Here, we will show how you can fit a simple model with Prophetverse. We first fit a model without seasonal components, and then fit a full model. We also show how easy it is to switch between Maximum A Posteriori (MAP) inference and Markov Chain Monte Carlo (MCMC).
No seasonality
from prophetverse.effects.trend import PiecewiseLinearTrend
from prophetverse.engine import MAPInferenceEngine
from prophetverse.sktime import Prophetverse
from prophetverse.utils import no_input_columns
model = Prophetverse(
trend=PiecewiseLinearTrend(
changepoint_interval=500,
changepoint_prior_scale=0.00001,
changepoint_range=-250,
),
inference_engine=MAPInferenceEngine(),
)
model.fit(y=y)
Output: [5]
Prophetverse(inference_engine=MAPInferenceEngine(), trend=PiecewiseLinearTrend(changepoint_interval=500, changepoint_prior_scale=1e-05, changepoint_range=-250))Please rerun this cell to show the HTML repr or trust the notebook.
Prophetverse(inference_engine=MAPInferenceEngine(), trend=PiecewiseLinearTrend(changepoint_interval=500, changepoint_prior_scale=1e-05, changepoint_range=-250))
PiecewiseLinearTrend(changepoint_interval=500, changepoint_prior_scale=1e-05, changepoint_range=-250)
PiecewiseLinearTrend(changepoint_interval=500, changepoint_prior_scale=1e-05, changepoint_range=-250)
forecast_horizon = pd.period_range("2007-01-01", "2018-01-01", freq="D")
fig, ax = plt.subplots(figsize=(10, 5))
preds = model.predict(fh=forecast_horizon)
preds.plot.line(ax=ax)
ax.scatter(y.index, y, marker="o", color="k", s=2, alpha=0.5)
Output: [6]
<matplotlib.collections.PathCollection at 0x307832c90>
With seasonality
Here, we fit the univariate Prophet and pass an exogenous effect as hyperparameter.
The exogenous_effects
parameter let us add new components to the model and control
the relationship between exogenous variables and the target variable.
In this case, the LinearFourierSeasonality
effect creates sinusoidal and cosine
terms to model the seasonality of the time series, which are then multiplied by
linear coefficients and added to the model.
This argument is a list of tuples of the form (effect_name, effect, regex_to_filter_relevant_columns)
,
where effect_name
is a string and effect
is an instance of a subclass of
prophetverse.effects.BaseEffect
. The regex is used to filter the columns of X
that are relevant for the effect, but can also be None
(or its alias prophetverse.utils.no_input_columns
) if no input in X
is needed
for the effect. For example, the seasonality effect already implemented in
prophetverse.effects
module does not need any input in X
, so we can use
prophetverse.utils.no_input_columns
as the regex.
from prophetverse.effects.fourier import LinearFourierSeasonality
from prophetverse.utils import no_input_columns
model = Prophetverse(
trend=PiecewiseLinearTrend(
changepoint_interval=500,
changepoint_prior_scale=0.00001,
changepoint_range=-500,
),
exogenous_effects=[
(
"seasonality",
LinearFourierSeasonality(
freq="D",
sp_list=[7, 365.25],
fourier_terms_list=[3, 10],
prior_scale=0.1,
effect_mode="multiplicative",
),
no_input_columns,
),
],
inference_engine=MAPInferenceEngine(),
)
model.fit(y=y)
Output: [7]
Prophetverse(exogenous_effects=[('seasonality', LinearFourierSeasonality(effect_mode='multiplicative', fourier_terms_list=[3, 10], freq='D', prior_scale=0.1, sp_list=[7, 365.25]), '^$')], inference_engine=MAPInferenceEngine(), trend=PiecewiseLinearTrend(changepoint_interval=500, changepoint_prior_scale=1e-05, changepoint_range=-500))Please rerun this cell to show the HTML repr or trust the notebook.
Prophetverse(exogenous_effects=[('seasonality', LinearFourierSeasonality(effect_mode='multiplicative', fourier_terms_list=[3, 10], freq='D', prior_scale=0.1, sp_list=[7, 365.25]), '^$')], inference_engine=MAPInferenceEngine(), trend=PiecewiseLinearTrend(changepoint_interval=500, changepoint_prior_scale=1e-05, changepoint_range=-500))
PiecewiseLinearTrend(changepoint_interval=500, changepoint_prior_scale=1e-05, changepoint_range=-500)
PiecewiseLinearTrend(changepoint_interval=500, changepoint_prior_scale=1e-05, changepoint_range=-500)
forecast_horizon = pd.period_range("2007-01-01", "2018-01-01", freq="D")
fig, ax = plt.subplots(figsize=(10, 5))
preds = model.predict(fh=forecast_horizon)
preds.plot.line(ax=ax)
ax.scatter(y.index, y, marker="o", color="k", s=2, alpha=0.5)
Output: [8]
<matplotlib.collections.PathCollection at 0x30fd4d650>
Probabilistic forecasting
We can also make probabilistic forecasts with Prophetverse, in sktime fashion.
The predict_quantiles
method returns the quantiles of the predictive distribution
in a pd.DataFrame
Output: [9]
y | ||
---|---|---|
0.1 | 0.9 | |
2007-01-01 | 8.002126 | 9.309550 |
2007-01-02 | 7.807156 | 9.113895 |
2007-01-03 | 7.741747 | 9.018427 |
2007-01-04 | 7.639927 | 8.998752 |
2007-01-05 | 7.655394 | 9.010941 |
The plot below shows the (0.1, 0.9) quantiles of the predictive distribution
fig, ax = plt.subplots(figsize=(10, 5))
# Plot area between quantiles
ax.fill_between(
quantiles.index.to_timestamp(),
quantiles.iloc[:, 0],
quantiles.iloc[:, -1],
alpha=0.5,
)
ax.scatter(y.index, y, marker="o", color="k", s=2, alpha=1)
Output: [10]
<matplotlib.collections.PathCollection at 0x312f9e750>
Timeseries decomposition
We can easily extract the components of the time series with the predict_components
method. This method, in particular, is not implemented in sktime's BaseForecaster
,
but it is a method of prophetverse.Prophetverse
class.
Output: [11]
mean | obs | seasonality | trend | |
---|---|---|---|---|
2007-01-01 | 8.655674 | 8.669042 | 0.871136 | 7.784539 |
2007-01-02 | 8.457966 | 8.449063 | 0.673427 | 7.784539 |
2007-01-03 | 8.351393 | 8.369492 | 0.566856 | 7.784539 |
2007-01-04 | 8.314424 | 8.322533 | 0.529884 | 7.784539 |
2007-01-05 | 8.329665 | 8.320658 | 0.545126 | 7.784539 |
for column in sites.columns:
fig, ax = plt.subplots(figsize=(8, 2))
ax.plot(sites.index.to_timestamp(), sites[column], label=column)
ax.set_title(column)
fig.show()
Output: [12]
Fitting with MCMC
In the previous examples, we used MAP inference to fit the model. However, we can also
use Markov Chain Monte Carlo (MCMC) to fit the model. To do this, we just need to
change the inference_engine
parameter to MCMCInferenceEngine
. The rest of the code
remains the same.
The set_params
method is used to set the parameters of the model, in sklearn fashion.
from prophetverse.engine import MCMCInferenceEngine
model.set_params(
inference_engine=MCMCInferenceEngine()
)
model.fit(y=y)
Output: [13]
Prophetverse(exogenous_effects=[('seasonality', LinearFourierSeasonality(effect_mode='multiplicative', fourier_terms_list=[3, 10], freq='D', prior_scale=0.1, sp_list=[7, 365.25]), '^$')], inference_engine=MCMCInferenceEngine(), trend=PiecewiseLinearTrend(changepoint_interval=500, changepoint_prior_scale=1e-05, changepoint_range=-500))Please rerun this cell to show the HTML repr or trust the notebook.
Prophetverse(exogenous_effects=[('seasonality', LinearFourierSeasonality(effect_mode='multiplicative', fourier_terms_list=[3, 10], freq='D', prior_scale=0.1, sp_list=[7, 365.25]), '^$')], inference_engine=MCMCInferenceEngine(), trend=PiecewiseLinearTrend(changepoint_interval=500, changepoint_prior_scale=1e-05, changepoint_range=-500))
PiecewiseLinearTrend(changepoint_interval=500, changepoint_prior_scale=1e-05, changepoint_range=-500)
PiecewiseLinearTrend(changepoint_interval=500, changepoint_prior_scale=1e-05, changepoint_range=-500)
quantiles = model.predict_quantiles(fh=forecast_horizon, alpha=[0.75, 0.25])
fig, ax = plt.subplots(figsize=(10, 5))
# Plot area between quantiles
ax.fill_between(
quantiles.index.to_timestamp(),
quantiles.iloc[:, 0],
quantiles.iloc[:, -1],
alpha=0.5,
)
ax.scatter(y.index, y, marker="o", color="k", s=2, alpha=1)
Output: [14]
<matplotlib.collections.PathCollection at 0x31f0d7150>
One interesting feature of MCMC is that it allows us to obtain samples from the posterior distribution of the parameters. In other words, we can also obtain probabilistic forecasts for the TS components.
Output: [15]
mean | obs | seasonality | trend | ||
---|---|---|---|---|---|
sample | |||||
0 | 2007-01-01 | 0.653385 | 0.652849 | -0.077955 | 0.731340 |
2007-01-02 | 0.768552 | 0.754084 | 0.037212 | 0.731340 | |
2007-01-03 | 0.531737 | 0.547930 | -0.199603 | 0.731340 | |
2007-01-04 | 0.618754 | 0.626193 | -0.112586 | 0.731340 | |
2007-01-05 | 0.584927 | 0.510160 | -0.146413 | 0.731340 | |
... | ... | ... | ... | ... | ... |
999 | 2017-12-28 | 0.571078 | 0.543337 | 0.019529 | 0.551549 |
2017-12-29 | 0.573063 | 0.609064 | 0.021576 | 0.551487 | |
2017-12-30 | 0.558815 | 0.562072 | 0.007389 | 0.551426 | |
2017-12-31 | 0.586710 | 0.646573 | 0.035346 | 0.551364 | |
2018-01-01 | 0.608088 | 0.593138 | 0.056786 | 0.551302 |
4019000 rows × 4 columns
Extra: syntax sugar
In Prophetverse, we've implemented the >>
operator, which makes it easier to
set trend
, exogenous_effects
and inference_engine
parameters.
trend = PiecewiseLinearTrend(
changepoint_interval=300,
changepoint_prior_scale=0.0001,
changepoint_range=0.8,
)
exogenous_effects = [
(
"seasonality",
LinearFourierSeasonality(
freq="D",
sp_list=[7, 365.25],
fourier_terms_list=[3, 10],
prior_scale=0.1,
effect_mode="multiplicative",
),
no_input_columns,
),
]
inference_engine = MAPInferenceEngine()
model = Prophetverse() >> trend >> exogenous_effects >> inference_engine
model.fit(y=y)
Output: [16]
Prophetverse(exogenous_effects=[('seasonality', LinearFourierSeasonality(effect_mode='multiplicative', fourier_terms_list=[3, 10], freq='D', prior_scale=0.1, sp_list=[7, 365.25]), '^$')], inference_engine=MAPInferenceEngine(), trend=PiecewiseLinearTrend(changepoint_interval=300, changepoint_prior_scale=0.0001, changepoint_range=0.8))Please rerun this cell to show the HTML repr or trust the notebook.
Prophetverse(exogenous_effects=[('seasonality', LinearFourierSeasonality(effect_mode='multiplicative', fourier_terms_list=[3, 10], freq='D', prior_scale=0.1, sp_list=[7, 365.25]), '^$')], inference_engine=MAPInferenceEngine(), trend=PiecewiseLinearTrend(changepoint_interval=300, changepoint_prior_scale=0.0001, changepoint_range=0.8))
PiecewiseLinearTrend(changepoint_interval=300, changepoint_prior_scale=0.0001, changepoint_range=0.8)
PiecewiseLinearTrend(changepoint_interval=300, changepoint_prior_scale=0.0001, changepoint_range=0.8)
forecast_horizon = pd.period_range("2007-01-01", "2018-01-01", freq="D")
fig, ax = plt.subplots(figsize=(10, 5))
preds = model.predict(fh=forecast_horizon)
preds.plot.line(ax=ax)
ax.scatter(y.index, y, marker="o", color="k", s=2, alpha=0.5)
Output: [17]
<matplotlib.collections.PathCollection at 0x356a01650>