Estimate the HysTAR model using conditional least squares estimation
Source: R/hystar_fit.R
hystar_fit.Rd
Estimate the parameters of the hysteretic threshold autoregressive (HysTAR) model.
Arguments
- data
A vector, matrix or data.frame containing the outcome variable \(y\) in the first column and the threshold variable \(z\) in the second. Other columns are ignored. A vector is taken to be both the outcome and the threshold variable, so in that case a self-exciting HysTAR model is fitted.
- r
A vector or a matrix with search values for \(\hat{r}_0, \hat{r}_1\). Defaults to c(.1, .9).
A vector r must contain two values \(a\) and \(b\) in \([0, 1]\). The search space for the thresholds will be the observed values of z between its \(100a\%\) and \(100b\%\) percentiles.
A matrix r allows for a custom search. It must have two columns, such that each row represents a pair \(r_0 \le r_1\) to test. You can use a matrix with one row if you don't want to estimate the thresholds. Note that the values in this matrix should be on the scale of z.
- d
A numeric vector with one or more values for the search space of the delay parameter. Defaults to 1. Typically, d is not very large, so a reasonable search space might be 0, 1, 2, ..., 5.
- p0
A numeric vector with one or more values for the search space of the autoregressive order of Regime 0. Defaults to 1.
- p1
Same as p0, but for regime 1. Note that it does not need to be equal to p0.
- p_select
The information criterion that should be minimized to select the orders \(p_0\) and \(p_1\). Choices:
"bic" (default, Bayesian Information Criterion)
"aic" (Akaike Information Criterion)
"aicc" (Corrected Akaike Information Criterion)
"aiccp" (Change-point Akaike Information Criterion)
- thin
TRUE (default) or FALSE. Only relevant when r is a vector.
If TRUE (default), the search space for the thresholds consists of the \(100a\%, 100(a+0.01)\%, \dots, 100b\%\) percentiles of z. This drastically reduces computation costs while keeping a reasonably large search space for the thresholds. Note that this is a purely practical choice with no theoretical justification.
If FALSE, all observed unique values of z between the \(100a\%\) and \(100b\%\) percentiles of z will be considered.
- tar
TRUE or FALSE (default). Choose TRUE if you want to fit a traditional 2-regime threshold autoregressive (TAR) model. This model has only one threshold; equivalently, it is a HysTAR model with \(r_0 = r_1\).
- show_progress
TRUE or FALSE (default). Do you want to be updated on the progress of the estimation algorithm? This can be desirable when the number of time points, or the search space of d, p0 or p1, is large.
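The threshold search space described under r and thin can be sketched in base R. This is only an illustration of the description above, not the package's internal code: the helper name threshold_candidates, the use of quantile() with type = 1 (which returns observed values of z), and the way candidates are combined into pairs are all assumptions of this sketch.

```r
# Hypothetical helper: candidate threshold values for a vector r = c(a, b).
threshold_candidates <- function(z, r = c(.1, .9), thin = TRUE) {
  stopifnot(length(r) == 2, r[1] <= r[2], r >= 0, r <= 1)
  if (thin) {
    # Percentiles in steps of 1%: 100a%, 100(a + 0.01)%, ..., 100b%
    probs <- seq(r[1], r[2], by = 0.01)
    # type = 1 returns observed values of z (an assumption of this sketch)
    cand <- quantile(z, probs = probs, type = 1, names = FALSE)
  } else {
    # All observed values between the 100a% and 100b% percentiles
    bounds <- quantile(z, probs = r, type = 1, names = FALSE)
    cand <- z[z >= bounds[1] & z <= bounds[2]]
  }
  sort(unique(cand))
}

set.seed(1)
z <- rnorm(500)
thinned <- threshold_candidates(z, thin = TRUE)   # at most 81 values
full    <- threshold_candidates(z, thin = FALSE)  # many more values

# One possible way to turn candidates into pairs r0 <= r1 to test
pairs <- expand.grid(r0 = thinned, r1 = thinned)
pairs <- pairs[pairs$r0 <= pairs$r1, ]
```

Because every pair has to be combined with each value of d, p0 and p1, the number of fits grows quickly with the number of candidates, which is why the thinned grid is so much cheaper.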
Value
An object of S3 class hystar_fit, which is a list containing the following items:
- $data. A data.frame containing
  - y, the outcome variable
  - z, the threshold variable
  - H, a logical vector that indicates at which time points the hysteresis effect is happening. Note that this vector starts with NA(s), since not all values can be predicted in the HysTAR model. See Details.
  - R, the regime indicator vector. (Also starts with NA(s).)
- $residuals. Also accessible with the residuals() S3 method.
- $coefficients, a vector with the estimated coefficients. With the coef() S3 method, the coefficients are represented in a matrix. Use the confint() method to get the confidence intervals of the estimates.
- $delay, a scalar with the estimate for the delay parameter.
- $thresholds, a vector with the estimates of the thresholds.
- $orders, a vector with the estimates of the orders.
- $resvar, a vector with the estimates of the residual variances.
- $rss, the minimized residual sum of squares.
- $ic, a vector with the aic, the corrected aic and the bic.
- $n, a vector with the total effective observations and the effective observations in regime 0 and regime 1.
- $eff, a vector with the time indicators of the effective observations.
- $equiv, a matrix containing equivalent estimates for the delay and thresholds, i.e., estimates that imply exactly the same regime indicator vector, and as a result the same minimal residual sum of squares.
- $r_search, a vector with the \(r\)-values that were considered.
- $tar, logical: TRUE if a TAR model was fitted.
Implemented generics for the hystar_fit class:
- plot() plots the z variable and the y variable above one another. Shading of the background visualizes the regimes. Thresholds are drawn as horizontal lines in the z plot. You can provide regime_names (char vector of 2), main (char vector of 1), xlab (char vector of 1) and ylab (char vector of 2).
- summary() also provides the p-values and standard errors for the estimates of the coefficients.
- print() prints the estimates within the mathematical representation of the model. Note that the scalar multiplied with e[t] is the standard deviation of the residuals, not the variance. See also the model definition in the section "The HysTAR model".
Details
In regime 0, \(y_{t}\) is predicted by values up to \(y_{t - p_0}\).
This implies that the first \(p_0\) time points cannot be predicted. E.g.,
if \(p_0 = 2\), \(y_1\) would miss a value from \(y_{-1}\). Similarly, the
value of the delay parameter implies that the regime is unknown for the first
\(d\) time points. To ensure that the same data are used for all options of
d, p0 and p1, the first max(d, p0, p1) observations are discarded for
estimation of the parameters.
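As a small illustration of this rule, the following base R sketch computes how many initial observations are dropped for a given set of search spaces. The search values and the placeholder series are made up for the example, and the variable name eff is only borrowed by analogy from the $eff item in the return value. Note that R indexes from 1, whereas the model definition counts time from \(t = 0\).

```r
# Illustrative search spaces for the delay and the two orders
d  <- 0:2
p0 <- 1:3
p1 <- 1:2

# Number of initial observations that cannot be used for every combination
n_discard <- max(d, p0, p1)

# Placeholder series of 100 observations
y <- rnorm(100)

# 1-based positions of the observations that remain for estimation
eff <- seq(n_discard + 1, length(y))
```

Discarding the same observations for every combination keeps the residual sums of squares comparable across the whole search.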
The HysTAR model
The HysTAR model is defined as:
\( y_t = \begin{cases} \phi_{00} + \phi_{01} y_{t-1} + \cdots + \phi_{0 p_0} y_{t-p_0} + \sigma_{0} \epsilon_{t} \quad \mathrm{if}~R_{t} = 0 \\ \phi_{10} + \phi_{11} y_{t-1} + \cdots + \phi_{1 p_1} y_{t-p_1} + \sigma_{1} \epsilon_{t} \quad \mathrm{if}~R_{t} = 1, \\ \end{cases} \)
with \( R_t = \begin{cases} 0 \quad \quad \mathrm{if} \, z_{t-d} \in (-\infty, r_{0}] \\ R_{t-1} \quad \mathrm{if} \, z_{t-d} \in (r_0, r_1] \\ 1 \quad \quad \mathrm{if} \, z_{t-d} \in (r_1, \infty), \\ \end{cases} \)
where \(p_j\) denotes the order of regime \(j \in \{0,1\}\) with coefficients \(\phi_{j0}, \dots, \phi_{j p_j} \in (-1, 1)\), \(\sigma_{j}\) is the standard deviation of the residuals, and \(d \in \{0, 1, 2, \dots\}\) is a delay parameter. The parameters of primary interest are the thresholds \(r_0 \le r_1\). We let \(t = 0, 1, 2, \dots, T\), where \(T\) is the number of observations.
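For fixed values of \(d\), \(r_0\), \(r_1\), \(p_0\) and \(p_1\), the regime of every time point follows from the rule for \(R_t\), and the coefficients of each regime can then be estimated by ordinary least squares on that regime's time points. The base R sketch below illustrates both steps. It is a simplified illustration, not the package's implementation: the helpers regime_indicator and fit_regime are hypothetical, and the starting regime used while \(z\) begins inside the hysteresis zone (argument start) is an assumption of this sketch.

```r
# Hypothetical helper: regime indicator R_t for a fixed delay d and thresholds r0 <= r1.
# Assumption of this sketch: the regime preceding the first usable time point is `start`.
regime_indicator <- function(z, d, r0, r1, start = 0L) {
  R <- rep(NA_integer_, length(z))
  if (d >= length(z)) return(R)
  prev <- start
  for (t in (d + 1):length(z)) {   # R is 1-based, so z[t - d] is the delayed value
    zd <- z[t - d]
    # Below or at r0: regime 0; above r1: regime 1; in (r0, r1]: keep previous regime
    R[t] <- if (zd <= r0) 0L else if (zd > r1) 1L else prev
    prev <- R[t]
  }
  R
}

# Hypothetical helper: least squares AR(p) fit on the time points of one regime.
fit_regime <- function(y, R, regime, p, first) {
  idx <- which(seq_along(y) >= first & !is.na(R) & R == regime)
  # Design matrix: intercept plus the lags y[t - 1], ..., y[t - p]
  X <- cbind(1, outer(idx, seq_len(p), function(i, k) y[i - k]))
  fit <- lm.fit(X, y[idx])
  list(coefficients = unname(fit$coefficients),
       rss = sum(fit$residuals^2),
       n = length(idx))
}

# Toy check of the regime rule with d = 1, r0 = 0, r1 = 1
z_toy <- c(0, 2, 0.5, -1, 0.5, 3)
R_toy <- regime_indicator(z_toy, d = 1, r0 = 0, r1 = 1)
# R_toy is NA 0 1 1 0 0

# Simulated series with known regimes (all values are made up for illustration)
set.seed(42)
n <- 400
z <- sin(2 * pi * seq_len(n) / 50)
R <- regime_indicator(z, d = 1, r0 = -0.5, r1 = 0.5)
y <- numeric(n)
for (t in 2:n) {
  y[t] <- if (R[t] == 0L) 0.5 * y[t - 1] + rnorm(1) else 2 - 0.3 * y[t - 1] + rnorm(1)
}

# With d = p0 = p1 = 1, the first observation is discarded
first <- max(1, 1, 1) + 1
fit0 <- fit_regime(y, R, regime = 0L, p = 1, first = first)
fit1 <- fit_regime(y, R, regime = 1L, p = 1, first = first)
rss_total <- fit0$rss + fit1$rss
```

A conditional least squares search would repeat this for every candidate combination of delay, thresholds and orders, and keep the combination with the smallest total residual sum of squares (with the orders chosen by the information criterion).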
References
Li, Guodong, Bo Guan, Wai Keung Li, and Philip L. H. Yu. ‘Hysteretic Autoregressive Time Series Models’. Biometrika 102, no. 3 (September 2015): 717–23.
Zhu, Ke, Philip L. H. Yu, and Wai Keung Li. ‘Testing for the Buffered Autoregressive Process’. Munich Personal RePEc Archive (November 2013).
Examples
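library(hystar)
# Simulate a threshold (control) variable, then a HysTAR series driven by it
simulated_control_variable <- z_sim()
simulated_hystar_model <- hystar_sim(simulated_control_variable)
# Fit the model to the simulated data using the default search settings
fitted_hystar_model <- hystar_fit(simulated_hystar_model$data)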