# Estimate the HysTAR model using conditional least squares estimation

Source:`R/hystar_fit.R`

`hystar_fit.Rd`

Estimate the parameters of the hysteretic threshold autoregressive (HysTAR) model.

## Arguments

- data
a vector, matrix or data.frame containing the outcome variable \(y\) in the first column and the threshold variable \(z\) in the second. Other columns are ignored. A vector, is taken to be both the outcome and control variable, so, in that case, a self-exciting HysTAR is fitted.

- r
A vector or a matrix with search values for \(\hat{r}_0, \hat{r}_1\). Defaults to

`c(.1, .9)`

.A vector

`r`

must contain two values \(a\) and \(b\) in \([0, 1]\). The search space for the thresholds will be observed values of`z`

between its \(100a\%\) and \(100b\%\) percentiles.A matrix

`r`

allows for a custom search. It must have two columns, such that each row represents a pair \(r_0 \le r_1\) to test. You can use a matrix with one row if you don't want to estimate the thresholds. Note that the values in these matrix should be on the scale of`z`

.

- d
A numeric vector with one or more values for the search space of the delay parameter. Defaults to 1. Typically, d is not very large, so a reasonable search space might be 0, 1, 2, ..., 5.

- p0
A numeric vector with one or more values for the search space of the autoregressive order of Regime 0. Defaults to 1.

- p1
Same as

`p0`

, but for regime 1. Note that it does not need to be equal to`p0`

.- p_select
The information criterion that should be minimized to select the orders \(p_0\) and \(p_1\). Choices:

`"bic"`

(default, Bayesian Information Criterion)`"aic"`

(Akaike Information Criterion)`"aicc"`

(Corrected Akaike Information Criterion)`"aiccp"`

(Change-point Akaike Information Criterion)

- thin
`TRUE`

(default) or`FALSE`

. Only relevant when`r`

is a vector.If

`TRUE`

(default), the search space for the thresholds are the \(100a\%, 100(a+0.01)\%, \dots, 100b\%\) percentiles of`z`

. This drastically reduces computation costs while keeping a reasonably large search space for the thresholds. Note that this is a purely practical choice with no theoretical justification.If

`FALSE`

, all observed unique values of`z`

between the \(100a\%\) and \(100b\%\) percentiles of`z`

will be considered.

- tar
`TRUE`

or`FALSE`

(default). Choose`TRUE`

if you want to fit a traditional 2-regime threshold autoregressive (TAR) model. In this model, there is only one threshold (or equivalently, a HysTAR model with \(r_0 = r_1\)).- show_progress
`TRUE`

or`FALSE`

(default). Do you want to be updated on the progress of the estimation algorithm? This can be desirable when the number of time points, or the search space of`d`

,`p0`

or`p1`

, are large.

## Value

An object of S3 class `hystar_fit`

, which is a `list`

containing the following
items:

`$data`

. A`data.frame`

containing`y`

, the outcome variable`z`

, the threshold variable`H`

, a logical vector that indicates at which time points the hysteresis effect is happening. Note that this vector starts with`NA`

(s), since not all values can be predicted in the HysTAR model. See Details.`R`

, the regime indicator vector. (Also starts with`NA`

(s).)

`$residuals`

. Also accessible with the`residuals()`

S3 method.`$coefficients`

, a vector with the estimated coefficients. With the`coef()`

S3 method, the coefficients are represented in a matrix. Use the`confint()`

method to get the confidence intervals of the estimates.`$delay`

, a scalar with the estimate for the delay parameter.`$thresholds`

, a vector with the estimates of the thresholds.`$orders`

, a vector with the estimates of the orders.`$resvar`

, a vector with the estimates of the residual variances.`$rss`

, the minimized residual sum of squares.`$ic`

, a vector with the aic, the corrected aic and the bic.`$n`

, a vector with the total effective observations and the effective obeservations in regime 0 and regime 1.`$eff`

, a vector with the time indicators of the effective observations.`$equiv`

, a matrix containing equivalent estimates for the delay and thresholds, i.e., estimates that imply exactly the same regime indicator vector, and as a result the same minimal residual sum of squares.`$r_search`

, a vector with the \(r\)-values that were considered.`$tar`

, Logical:`TRUE`

if a TAR model was fitted.

Implemented generics for the `hystar_fit`

class:

`plot()`

plots the`z`

variable and the`y`

variable above one another. Shading of the background visualizes the regimes. Thresholds are drawn as horizontal lines in the`z`

plot. You can provide regime_names (char vector of 2), main (char vector of 1), xlab (char vector of 1) and ylab (char vector of 2).`summary()`

, this also provides the p-values and standard errors for the estimates of the coefficients.`print()`

prints the estimates within the mathematical representation of the model. Note that the scalar multiplied with`e[t]`

is the standard deviation of the residuals,*not*the variance. See also the model definition above.

## Details

In regime 0, \(y_{t}\) is predicted by values up to \(y_{t - p_0}\).
This implies that the first \(p_0\) time points can not be predicted. E.g.,
if \(p_0 = 2\), \(y_1\) would miss a value from \(y_{-1}\). Similarly, the
value of the delay parameter implies that the regime is unknown for the first
\(d\) time points. To ensure that the same data are used on all options for
`d`

, `p0`

and `p1`

, the first `max(d, p0, p1)`

observations are discarded for
estimation of the parameters.

## The HysTAR model

The HysTAR model is defined as:

\( y_t = \begin{cases} \phi_{00} + \phi_{01} y_{t-1} + \cdots + \phi_{0 p_0} y_{t-p_0} + \sigma_{0} \epsilon_{t} \quad \mathrm{if}~R_{t} = 0 \\ \phi_{10} + \phi_{11} y_{t-1} + \cdots + \phi_{1 p_1} y_{t-p_1} + \sigma_{1} \epsilon_{t} \quad \mathrm{if}~R_{t} = 1, \\ \end{cases} \)

with \( R_t = \begin{cases} 0 \quad \quad \mathrm{if} \, z_{t-d} \in (-\infty, r_{0}] \\ R_{t-1} \quad \mathrm{if} \, z_{t-d} \in (r_0, r_1] \\ 1 \quad \quad \mathrm{if} \, z_{t-d} \in (r_1, \infty), \\ \end{cases} \)

where \(p_j\) denotes the order of regime \(j \in \{0,1\}\) with coefficients \(\phi_{j0}, \dots, \phi_{j p_j \in (-1, 1)}\), \(\sigma_{j}\) is the standard deviation of the residuals, and \(d \in \{0, 1, 2, \dots\}\) is a delay parameter. The parameters of primary interest are the thresholds \(r_0 \le r_1\). We let \(t = 0, 1, 2, ..., T\), where \(T\) is the number of observations.

## References

Li, Guodong, Bo Guan, Wai Keung Li, en Philip L. H. Yu. ‘Hysteretic Autoregressive Time Series Models’. Biometrika 102, nr. 3 (september 2015): 717–23.

Zhu, Ke, Philip L H Yu, en Wai Keung Li. ‘Testing for the Buffered Autoregressive Process’. Munich Personal RePEc Archive, (november 2013).

## Examples

```
simulated_control_variable <- z_sim()
simulated_hystar_model <- hystar_sim(simulated_control_variable)
fitted_hystar_model <- hystar_fit(simulated_hystar_model$data)
```