Outline of today’s lab

Packages we will use in this lab

# Important packages described above
install.packages("dplyr", repos = "https://cran.rstudio.com/", 
    dependencies = TRUE)
install.packages("MSwM", repos = "https://cran.rstudio.com/", 
    dependencies = TRUE)

The next step is to make sure that you can access the routines in these packages by making use of the library command, which needs to be run in every new session, regardless of the machine that you are using.

# Important packages described above
library(dplyr)
library(MSwM)

Part 0: Import data into R

In this lab, we will use a monthly data set for the S&P 500 index from January 1959 to August 2016.

The first thing that we do is clear all variables from the current environment and close all the plots. This is performed with the following commands:

rm(list = ls())
graphics.off()
options(scipen = 9) # Avoid scientific notation

To import your downloaded data into R, you can:

# Allocate the variables to the object "sp500"
if("Lab2data1.csv" %in% list.files()){
  sp500 <- read.table("Lab2data1.csv", sep=",",header=T)
}else{ # file.choose() allows you to choose the file interactively
  sp500 <- read.table(file.choose(), sep=",",header=T)
}
# Specify "sp500" as time series object
sp500 <- ts(sp500[,"SP500"], end = c(2016, 8), frequency = 12)

To make sure that the data has been imported correctly, we inspect a plot of the data:

ts.plot(sp500,
        main = "S&P500 Index",     # Title of plot
        ylab = "Index",            # Label of y-axis
        xlab = "Month",            # Label of x-axis
        col  = "blue",             # Define colours
        lty  = "solid",            # Define line types
        lwd  = 2)                  # Define line width

You will see that the index goes up over time (it is non-stationary), so you might want to work with the growth rate of the index instead. The growth rate can be interpreted as the stock return and can be approximated by the log difference:

spreturn <- diff(log(sp500))*100   # Taking the difference (we will lose the first observation)
# Specify "spreturn" as a new time series object
spreturn <- ts(spreturn, end = c(2016, 8), frequency = 12)
ts.plot(spreturn,
        main = "S&P500 Returns",   # Title of plot
        ylab = "%",                # Label of y-axis
        xlab = "Month",            # Label of x-axis
        col  = "blue",             # Define colours
        lty  = "solid",            # Define line types
        lwd  = 2)                  # Define line width
abline(h = 0) # Add horizontal line

Part I: The Threshold autoregressive (TAR) model

The AR(1) model uses only the first lag of the dependent variable (spreturn) as explanatory variable:

\[\begin{aligned} y_t = c + \rho y_{t-1} + \varepsilon_t, \quad \varepsilon_t \sim \mathcal{N}(0, \sigma^{2}). \end{aligned}\]

The following code estimates the AR(1) model. It prints the estimates of the intercept, the AR coefficient, and the error variance. Can you match these estimates to the parameters in the above equation?

ar1_reg <- ar.ols(spreturn, order.max = 1,
                  demean = F, 
                  intercept = T) 
ar1_reg
## 
## Call:
## ar.ols(x = spreturn, order.max = 1, demean = F, intercept = T)
## 
## Coefficients:
##      1  
## 0.2435  
## 
## Intercept: 0.4043 (0.133) 
## 
## Order selected 1  sigma^2 estimated as  11.94

To compute the information criteria:

T <- length(spreturn)               # Number of observations
residuals <- ar1_reg$resid[3:T]     # Get the residuals
sigma_sq <- var(residuals)          # Compute the estimated variance of the residuals
# Now compute the log likelihood
loglik <- -T / 2 * log(2 * pi) - T / 2 * log(sigma_sq) - sum(residuals^2) / (2 * sigma_sq)
n.para <- 2
r.aic <- (-2 * (loglik)) + 2 * (sum(n.para))
r.bic <- (-2 * (loglik)) + (log(length(spreturn))) * (n.para)
cat("LL", loglik)
## LL -1836.4
cat("AIC", r.aic)
## AIC 3676.801
cat("BIC", r.bic)
## BIC 3685.877

A homoskedastic TAR model is defined as

\[\begin{aligned} y_t=\left\{\begin{array}{l} \alpha_1 + \rho_1 y_{t-1}+\varepsilon_{t}, \quad \varepsilon_{t} \sim \mathcal{N}(0, \sigma^{2}), \quad \text { if } z_t \leq \tau \\ \alpha_2 + \rho_2 y_{t-1}+\varepsilon_{t}, \quad \varepsilon_{t} \sim \mathcal{N}(0, \sigma^{2}), \quad \text { if } z_t>\tau \end{array}\right. \end{aligned}\]

Coefficients are now allowed to change across two regimes, but there are no regime switches in the error variance. A dummy variable can be used to classify the time series observations into two mutually exclusive regimes. To allow for different effects across regimes in our model, we define the dummy variable \(D_t\) and set \(D_t = 1\) if \(z_t > \tau\) and \(D_t = 0\) otherwise. We will use \(z_t = y_{t-1}\) as our threshold variable, which implies that \(D_t\) is observed and deterministic. With \(\tau = 0\), \(D_t\) indicates a “positive return state” (i.e., \(D_t = 1\) if \(y_{t-1} > 0\)). Later, we will relax this assumption on \(D_t\) and instead assume that it is unobserved and follows a Markov process of order one.

With the dummy variable \(D_t\), this setup can be written as a standard linear regression model:

\[\begin{equation*} y_t= \alpha_1 + \rho_1 y_{t-1} + D_t(\gamma_2 + \phi_2 y_{t-1}) +\varepsilon_{t}. \end{equation*}\]

In this regression, how can you recover \(\alpha_2\) and \(\rho_2\) with \(\alpha_1\), \(\rho_1\), \(\gamma_2\), and \(\phi_2\)?

First, let’s create the dummy variable \(D_t\) based on a threshold \(\tau = 0\).

tau <- 0              # Define threshold
Dt  <- spreturn # Initialize Dt with spreturn
Dt[spreturn > tau]   <- 1 # D(t) = 1, if y(t) > 0
Dt[spreturn <= tau]  <- 0 # D(t) = 0, if y(t) <= 0

In the following, we create the data matrices for the lm() function and run an OLS regression:

tar.data   <- as.data.frame(cbind(y   = spreturn[2:T],      # Dependent variable
                                  yl1 = spreturn[1:(T-1)],  # First lag of dependent variable
                                  Dl1 = Dt[1:(T-1)]))       # First lag of dummy variable
homotar <- lm(y ~ yl1*Dl1,  data = tar.data)
summary(homotar)
## 
## Call:
## lm(formula = y ~ yl1 * Dl1, data = tar.data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -22.0708  -1.6553   0.2569   2.0740  12.3770 
## 
## Coefficients:
##             Estimate Std. Error t value  Pr(>|t|)    
## (Intercept)  0.78028    0.28215   2.765   0.00584 ** 
## yl1          0.29306    0.07037   4.164 0.0000352 ***
## Dl1         -0.92593    0.40063  -2.311   0.02112 *  
## yl1:Dl1      0.09739    0.11114   0.876   0.38118    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.451 on 686 degrees of freedom
## Multiple R-squared:  0.06701,    Adjusted R-squared:  0.06293 
## F-statistic: 16.42 on 3 and 686 DF,  p-value: 0.0000000002566

If I do not give you the information criteria, can you decide whether there is evidence of threshold nonlinearity? If so, which parameters are changing across regimes? The key parameters are those related to the dummy variable \(D_t\). The model output suggests that there is a significant difference in the intercept across regimes. However, the interaction with the AR(1) coefficient is not significant, which implies that there is no significant break in that coefficient.
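One way to check this more formally (a sketch, not part of the original lab; ar1_lm is a name introduced here) is to fit the restricted single-regime AR(1) with lm() on the same sample and compare the two models with an F-test of the joint null that the dummy-related coefficients are zero:

# Restricted model: a single-regime AR(1) on the same estimation sample
ar1_lm <- lm(y ~ yl1, data = tar.data)
# F-test of the joint null gamma_2 = phi_2 = 0 (no threshold effect)
anova(ar1_lm, homotar)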

To compute the information criteria, you can do the following:

T <- length(spreturn)               # Number of observations
residuals <- homotar$residuals    # Get the residuals
sigma_sq <- var(residuals)          # Compute the estimated variance of the residuals
# Now compute the log likelihood
loglik <- -T / 2 * log(2 * pi) - T / 2 * log(sigma_sq) - sum(residuals^2) / (2 * sigma_sq)
n.para <- 2
r.aic <- (-2 * (loglik)) + 2 * (sum(n.para))
r.bic <- (-2 * (loglik)) + (log(length(spreturn))) * (n.para)
cat("LL", loglik)
## LL -1833.822
cat("AIC", r.aic)
## AIC 3671.644
cat("BIC", r.bic)
## BIC 3680.72

What is your conclusion when you compare the information criteria of the TAR model with those of the AR(1) model?

A heteroskedastic TAR model is defined as

\[\begin{aligned} y_t=\left\{\begin{array}{l} \alpha_1 + \rho_1 y_{t-1}+\varepsilon_{1t}, \quad \varepsilon_{1t} \sim \mathcal{N}(0, \sigma_{1}^{2}), \quad \text { if } z_t \leq \tau \\ \alpha_2 + \rho_2 y_{t-1}+\varepsilon_{2t}, \quad \varepsilon_{2t} \sim \mathcal{N}(0, \sigma_{2}^{2}), \quad \text { if } z_t>\tau \end{array}\right. \end{aligned}\]

We can estimate this model by simply dividing the data into two parts.

tar1.data <- tar.data %>% subset(Dl1 == 0) # The first subset of the data
tar2.data <- tar.data %>% subset(Dl1 == 1) # The second subset of the data

# Estimate first model (baseline)
heterotar1 <- lm(y ~ yl1,  data = tar1.data)
summary(heterotar1)
## 
## Call:
## lm(formula = y ~ yl1, data = tar1.data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -22.0708  -2.2148   0.2602   2.4317  12.3770 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.78028    0.33435   2.334 0.020339 *  
## yl1          0.29306    0.08339   3.514 0.000516 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.089 on 272 degrees of freedom
## Multiple R-squared:  0.04343,    Adjusted R-squared:  0.03991 
## F-statistic: 12.35 on 1 and 272 DF,  p-value: 0.0005165
# Estimate second model
heterotar2 <- lm(y ~ yl1,  data = tar2.data)
summary(heterotar2)
## 
## Call:
## lm(formula = y ~ yl1, data = tar2.data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -12.3175  -1.3858   0.2551   1.8188  10.9932 
## 
## Coefficients:
##             Estimate Std. Error t value    Pr(>|t|)    
## (Intercept) -0.14566    0.24375  -0.598        0.55    
## yl1          0.39045    0.07372   5.297 0.000000192 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.957 on 414 degrees of freedom
## Multiple R-squared:  0.06346,    Adjusted R-squared:  0.0612 
## F-statistic: 28.05 on 1 and 414 DF,  p-value: 0.000000192

To get the overall information criteria, you need to sum the contributions of the two regimes:

# The first subset
residuals <- heterotar1$residuals   # Get the residuals
sigma_sq  <- var(residuals)         # Compute the estimated variance of the residuals
T1        <- length(residuals)      # No. of observations in the first regime
loglik1   <- -T1 / 2 * log(2 * pi) - T1 / 2 * log(sigma_sq) - sum(residuals^2) / (2 * sigma_sq)
n.para    <- 2
r.aic1    <- (-2 * (loglik1)) + 2 * (sum(n.para))
r.bic1    <- (-2 * (loglik1)) + (log(T1)) * (n.para)
# The second subset
residuals <- heterotar2$residuals   # Get the residuals
sigma_sq  <- var(residuals)          # Compute the estimated variance of the residuals
T2        <- length(residuals)       # No. of observations in the second regime
loglik2   <- -T2 / 2 * log(2 * pi) - T2 / 2 * log(sigma_sq) - sum(residuals^2) / (2 * sigma_sq)
n.para <- 2
r.aic2 <- (-2 * (loglik2)) + 2 * (sum(n.para))
r.bic2 <- (-2 * (loglik2)) + (log(T2)) * (n.para)
cat("LL", loglik1+loglik2)
## LL -1547.326
cat("AIC", r.aic1+r.aic2)
## AIC 3102.652
cat("BIC", r.bic1+r.bic2)
## BIC 3117.104

Can you tell me which model is best: the AR(1), the homoskedastic TAR, or the heteroskedastic TAR?

Part II: The Markov switching model (MS model)

First, import the data into R. The series we will use for the MS model is the quarterly US real GDP growth rate:

if("Lab2data2.csv" %in% list.files()){
  usrgdp <- read.table("Lab2data2.csv", sep=",",header=T)
}else{ # file.choose() allows you to choose the file interactively
  usrgdp <- read.table(file.choose(), sep=",",header=T)
}
usrgdp <- ts(usrgdp[,"RGDP_CH"], end = c(2016, 2), frequency = 4)
ts.plot(usrgdp,
        main = "US GDP growth",    # Title of plot
        ylab = "%",            # Label of y-axis
        xlab = "Month",            # Label of x-axis
        col  = "blue",             # Define colours
        lty  = "solid",            # Define line types
        lwd  = 2)                  # Define line width)
abline(h = 0) # Add horizontal line

First, estimate an AR(2) model for US GDP growth:

usrgdp.lags <- embed(usrgdp, 3)             # Current value plus two lags
usrgdp.lags <- as.data.frame(usrgdp.lags)
colnames(usrgdp.lags) <- c("RGDP", "RGDPl1", "RGDPl2")

# Estimate the AR(2) model 
reg.ar2  <- lm(RGDP ~ RGDPl1 + RGDPl2, data = usrgdp.lags)
summary(reg.ar2)
## 
## Call:
## lm(formula = RGDP ~ RGDPl1 + RGDPl2, data = usrgdp.lags)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -10.9011  -1.9416  -0.0848   2.0494  15.7837 
## 
## Coefficients:
##             Estimate Std. Error t value      Pr(>|t|)    
## (Intercept)  1.85529    0.31020   5.981 0.00000000696 ***
## RGDPl1       0.33443    0.06032   5.544 0.00000006982 ***
## RGDPl2       0.09589    0.06027   1.591         0.113    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.639 on 272 degrees of freedom
## Multiple R-squared:  0.1452, Adjusted R-squared:  0.139 
## F-statistic: 23.11 on 2 and 272 DF,  p-value: 0.0000000005383
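For later comparison with the Markov-switching model, the information criteria of the linear AR(2) can also be obtained directly with AIC() and BIC() (a quick sketch; note that these are based on lm()'s own Gaussian likelihood, so the values can differ slightly from manual computations like the ones used in Part I):

AIC(reg.ar2)   # Akaike information criterion of the linear AR(2)
BIC(reg.ar2)   # Bayes information criterion of the linear AR(2)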

Next, we estimate a Markov-switching AR(2) model:

reg.mswm <- msmFit(RGDP ~ RGDPl1 + RGDPl2,
                  data = usrgdp.lags,
                  k=2,                         # Number of regimes
                  sw=c(TRUE,TRUE,TRUE,TRUE),   # Intercept, 2 AR coeff, error variance
                  control=list(parallel=FALSE))
summary(reg.mswm)
## Markov Switching Model
## 
## Call: msmFit(object = model, k = k, sw = sw, p = p, control = control)
## 
##        AIC     BIC    logLik
##   1430.389 1485.79 -709.1944
## 
## Coefficients:
## 
## Regime 1 
## ---------
##                Estimate Std. Error t value   Pr(>|t|)    
## (Intercept)(S)   2.0385     0.4913  4.1492 0.00003336 ***
## RGDPl1(S)        0.3582     0.0822  4.3577 0.00001314 ***
## RGDPl2(S)        0.0620     0.0846  0.7329     0.4636    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.546805
## Multiple R-squared: 0.1462
## 
## Standardized Residuals:
##          Min           Q1          Med           Q3          Max 
## -10.85339729  -0.64562220  -0.01862763   0.60747036  15.83650327 
## 
## Regime 2 
## ---------
##                Estimate Std. Error t value    Pr(>|t|)    
## (Intercept)(S)   1.6375     0.3701  4.4245 0.000009667 ***
## RGDPl1(S)        0.1766     0.0957  1.8454    0.064979 .  
## RGDPl2(S)        0.2652     0.0837  3.1685    0.001532 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.877974
## Multiple R-squared: 0.1695
## 
## Standardized Residuals:
##             Min              Q1             Med              Q3             Max 
## -5.378119921826 -0.284832683165  0.000001464612  0.241296422278  4.311914007330 
## 
## Transition probabilities:
##             Regime 1   Regime 2
## Regime 1 0.991443769 0.01847509
## Regime 2 0.008556231 0.98152491

In the following, we plot the estimated (smoothed) probabilities for each period being in regime 1 and regime 2:

probr1 <- ts(reg.mswm@Fit@smoProb, end = c(2016, 1), frequency = 4)

plot.ts(probr1[,1],
        main = "Smoothed probabilities for both regimes",
        col = "blue",
        xlab = "",
        ylab = "",
        lwd = 1.5)
lines(probr1[,2], col = "red")
legend("left", legend = c("Regime 1", "Regime 2"),
       lwd = c(2, 1), col = c("blue", "red"), bty = "n", y.intersp = 1.5)

How would you interpret the two regimes? (low volatility and high volatility?)
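One quick way to inspect this (a sketch; it reuses the smoothed probabilities stored in reg.mswm@Fit@smoProb, as in the plot above) is to classify each quarter by its most likely regime and count the periods spent in each regime. Together with the regime-specific residual standard errors in the summary (roughly 4.5 versus 1.9), this points towards a high-volatility and a low-volatility regime.

# Most likely regime in each period, based on the smoothed probabilities
regime <- apply(reg.mswm@Fit@smoProb, 1, which.max)
table(regime)   # number of periods classified into each regime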

Part III - Digression: A time-varying intercept model/unobserved component model

The following example explains in detail the mechanism of the Forward Filtering Backward Sampling (FFBS) algorithm. An unobserved component model can be written in state-space form as follows:

1.) The observation equation: For \(t = 1, \dots, T\), let \(y_t\) denote a scalar observation, which is linked to a single unobserved/latent component \(\mu_t\):

\(y_t = Z_t\mu_t + R_t v_t\), with \(Z_t = 1\) and \(v_t \sim \mathcal{N}(0, 1)\).

\(Z_t\) relates the observed \(y_t\) to the latent \(\mu_t\); in factor models this quantity is often called a “loading”. \(R_t\) is a scalar that defines the measurement error variance (given by \(R_t^2\)).

2.) The state equation: For simplicity we assume that the states follow a random walk process:

\(\mu_t = \mu_{t-1} + Q_t w_t\), with \(w_t \sim \mathcal{N}(0, 1)\).

\(Q_t\) is a scalar and defines state innovation variance (given by \(Q_t^2\)).

In what follows, we apply this model to US inflation.

if("Lab2data3.csv" %in% list.files()){
  data <- read.table("Lab2data3.csv", sep=",",header=T)
}else{ # file.choose() allows you to choose the file interactively
  data <- read.table(file.choose(), sep=",",header=T)
}
data <- ts(data[,"CPIINFL"], end = c(2016, 1), frequency = 4)

The random walk law of motion implies that the states evolve gradually and introduces persistence (a reasonable assumption for something often labelled trend inflation). Moreover, we fix \(R_t = 1\) and assume two deterministic regimes for \(Q_t\). During tranquil periods we fix \(Q_t = 0.01\), while in recessionary episodes we set \(Q_t = 1\). The trend component is thus allowed to adapt more quickly during periods of turmoil. This structure of \(Q_t\) mainly serves to illustrate the Kalman gain.

For FFBS we also need to specify appropriate starting values for period \(t = 0\): \(\mu_0 \sim \mathcal{N}(m_0, V_0)\). We simply use the initial three years as a training sample.

# Starting values in period t = 0: 
mu0 <- mean(data[1:12]) # initial state mean
V0 <- var(data[1:12])   # initial state variance

# Set-up for the observation equation:
y <- matrix(data[13:length(data)]) # observations
N <- nrow(y)                       # no. of observations
m <- 1                             # no. of latent states
ZN <- matrix(1, N, m)              # relates observations to latent states (fixed to one)
RN <- matrix(1, N, 1)              # measurement error variance

# Set-up for the state equation:
p <- 1                             # no. of lags in the state equation (fixed to one)

# Regimes for the state innovation variance 
st <- ts(matrix(0,N,1), end =c(2016,1),frequency = 4) # To illustrate the Kalman gain
window(st[,1],start=c(1972,1),end=c(1980,4)) <- 1     # pre-Volcker period (high inflation state)
window(st[,1],start=c(2007,4),end=c(2009,2)) <- 1   # Great recession (deflationary pressure?)
st <- as.numeric(st)
QN <- matrix((1-st)*1e-2 + st*1, N, 1)          # state innovation variance

The FFBS algorithm consists of two steps: a filtering step and a smoothing step.

Step 1: Filtering step (Kalman filter):

Produces estimates of the mean and variance of the latent states \(\mu_t\) using information up to period \(t\).
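For reference, these are the recursions implemented in the loop below, written for our local-level model with \(R_t\) and \(Q_t\) entering directly as the measurement and state innovation variances (as they do in the code):

\[\begin{aligned} \text{Prediction:}\quad & \mu_{t|t-1} = \mu_{t-1|t-1}, \qquad V_{t|t-1} = V_{t-1|t-1} + Q_t, \\ \text{Forecast error:}\quad & e_t = y_t - Z_t \mu_{t|t-1}, \qquad F_t = Z_t V_{t|t-1} Z_t' + R_t, \\ \text{Kalman gain:}\quad & K_t = V_{t|t-1} Z_t' F_t^{-1}, \\ \text{Updating:}\quad & \mu_{t|t} = \mu_{t|t-1} + K_t e_t, \qquad V_{t|t} = V_{t|t-1} - K_t Z_t V_{t|t-1}. \end{aligned}\]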

# Preliminaries for filtering step: 
mu.p <- mu0 # the prediction at time t=0 is the initial state
V.p <- V0 # same for the variance
mu.t <- matrix(0,N,m) # store mu.t conditional on information up to time t
V.t <- matrix(0,N,m^2) # same for variances (contains m^2 elements as a vector)
KG.t <- matrix(0,N,1) # store Kalman Gain (only for illustration)

### Step 1: Filtering step (Kalman filter):
for (t in 1:N){
  # Step 1: Prediction steps
  Rt <- RN[t,] # Extract period-specific measurement error variance
  Qt <- QN[t,] # Extract period-specific state innovation variance
  Zt <- ZN[t,,drop = F] # Extract period-specific "loadings"
  
  cfe <- y[t] - Zt%*%mu.p   # Conditional forecast error (y_t - Z_t mu_t|t-1)
  vfe <- Zt%*%V.p%*%t(Zt) + Rt   # Variance of the conditional forecast error
  inv_f <- t(Zt)%*%solve(vfe)
  
  # Step 2: Updating steps
  KG.t[t] <- V.p%*%inv_f  # the Kalman gain
  mu.tt <- mu.p + V.p%*%inv_f%*%cfe  # update mean estimate for mu_t|t 
  V.tt <- V.p - V.p%*%inv_f%*%Zt%*%V.p # updated variance estimate for mu_t|t
  if (t < N){
    mu.p <- mu.tt
    V.p <- V.tt + Qt
  }
  mu.t[t,] <- as.numeric(mu.tt)
  V.t[t,] <- as.numeric(V.tt)
}

Contrast the Kalman gain with the filtered estimates:

KG.plot <- ts(cbind(KG.t, mu.t), end = c(2016, 1), frequency = 4)
colnames(KG.plot) <- c("KG", "Filtered estimate")
plot.ts(KG.plot, col = c("black"), xlab = "Period", main = "Illustration of the Kalman gain", lwd = 2)

Step 2: Smoothing step (Kalman smoother)

Produces estimates of the mean and variance of \(\mu_t\) using all information up to period \(T\). We initialize the final value \(\mu_T\) using the moments obtained from the Kalman filter's terminal state.
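For reference, a sketch of the backward-sampling recursion used below: for \(t = T-1, \dots, 1\), given the draw of \(\mu_{t+1}\), the state \(\mu_t\) is drawn from a normal distribution with moments

\[\begin{aligned} \mathbb{E}(\mu_t \mid \mu_{t+1}, y_{1:t}) &= \mu_{t|t} + V_{t|t}\,(V_{t|t} + Q_{t+1})^{-1}(\mu_{t+1} - \mu_{t|t}), \\ \operatorname{Var}(\mu_t \mid \mu_{t+1}, y_{1:t}) &= V_{t|t} - V_{t|t}\,(V_{t|t} + Q_{t+1})^{-1} V_{t|t}, \end{aligned}\]

where \(\mu_{t|t}\) and \(V_{t|t}\) are the filtered moments from Step 1.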

mu.est <- matrix(0,N,m) # draw of the time-varying intercept
mu.est[N,] <- mu.t[N,] + rnorm(1, 0, sqrt(V.t[N,]))
  
for (t in 1:(N-1)){
    Qt <- QN[N-t+1,] # Extract period-specific state innovation variance
    mu.f <- mu.est[N-t+1,]
    mu.tt <- t(mu.t[N-t,])
    V.tt <-  V.t[N-t,]
    vfe <- V.tt + Qt
    
    inv_f <- V.tt%*%solve(vfe)
    
    cfe <- mu.f - mu.tt
    mu.mean <- mu.tt + inv_f%*%cfe
    mu.var <- V.tt - inv_f%*%V.tt
    
    mu.est[N-t,] <- mu.mean + rnorm(1, 0, sqrt(mu.var))
}
FFBS.plot <- ts(cbind(mu.est,y), end = c(2016, 1), frequency = 4)
colnames(FFBS.plot) <- c("FFBS estimate", "y")
ts.plot(FFBS.plot, col = c("black", "darkred"), ylab = "Inflation", xlab = "Period", main = "",  lwd = 2)
legend("bottomleft", col = c("black", "darkred"), lty = c(1,1), legend = c("FFBS: a single estimate", expression(y["t"])), lwd =  c(2,2))

We can define a (relatively) general function to perform FFBS:

library(MASS) # provides ginv() and mvrnorm(), used as numerical fallbacks below

FFBS_R <- function(y, ZN, RN, QN, m, p, N, B0, V0){
  # Arguments: 
  # y   ... observations (T x K)-matrix
  # ZN   ... "loadings" that link latent states to observations (K x m)-matrix
  # RN  ... measurement error variance (m x m)-matrix
  # QN  ... state innovation variance (m x m)-matrix
  # B0  ... initial state mean (m x 1)-matrix
  # V0  ... initial state variance (m x m)-matrix
  # m   ... no. of latent states
  # p   ... no. of lags in the state equation (fixed to one)
  # N   ... no. of observations
  
  bp <- B0 # the prediction at time t=0 is the initial state
  Vp <- V0 # same for the variance
  bt <- matrix(0,N,m) #Create vector that stores bt conditional on information up to time t
  Vt <- matrix(0,N,m^2) #Same for variances
  bdraw <- matrix(0,N,m)
  
  # Step 1: Kalman filter 
  for (t in 1:N){
    Rt <- RN[t,] 
    if(m > 1) Qt <- diag(QN[t,]) else Qt <- QN[t,]
    Zt <- ZN[t,,drop = F]
    
    cfe <- y[t] - Zt%*%bp   # conditional forecast error
    f <- Zt%*%Vp%*%t(Zt) + Rt    # variance of the conditional forecast error
    inv_f <- try(t(Zt)%*%solve(f), silent = T)
    if(is(inv_f, "try-error")) inv_f <- t(Zt)%*%ginv(f)
    btt <- bp + Vp%*%inv_f%*%cfe     # updated mean estimate for b_t|t (Vp %*% inv_f is the Kalman gain)
    Vtt <- Vp - Vp%*%inv_f%*%Zt%*%Vp # updated variance estimate for b_t|t
    if (t < N){
      bp <- btt
      Vp <- Vtt + Qt
    }
    bt[t,] <- t(btt)
    Vt[t,] <- matrix(Vtt,m^2,1)
  }
  
  # Initialize the final value of the state 
  # using the moments obtained from the Kalman filters' terminal state
  
  if(m > 1){
    bdraw.temp <- try(btt+t(chol(Vtt))%*%rnorm(nrow(Vtt)), silent=T)
    if (is(bdraw.temp, "try-error")) bdraw.temp <- mvrnorm(1, btt, Vtt+diag(1e-6,m))
  }else bdraw.temp <- btt + rnorm(1, 0, sqrt(Vtt))
  bdraw[N,] <- bdraw.temp
  
  #Step 2: Kalman smoother
  for (t in 1:(N-1)){
    if(m > 1) Qt <- diag(QN[N-t+1,]) else Qt <- QN[N-t+1,]
    bf <- t(bdraw[N-t+1,])
    btt <- t(bt[N-t,])
    Vtt <- matrix(Vt[N-t,,drop=FALSE],m,m)
    f <- Vtt + Qt
    
    inv_f <- try(Vtt%*%solve(f), silent = T)
    if(is(inv_f, "try-error")) inv_f <- Vtt%*%ginv(f)
    
    cfe <- bf - btt
    bmean <- t(btt) + inv_f%*%t(cfe)
    bvar <- Vtt - inv_f%*%Vtt
    
    bdraw.temp <- try(bmean+t(chol(bvar))%*%rnorm(nrow(bvar)), silent=T)
    if (is(bdraw.temp, "try-error")) bdraw.temp <- mvrnorm(1, bmean, bvar+diag(1e-6,m))
    
    bdraw[N-t,] <- bdraw.temp
  }
  
  return(bdraw)
}
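As a quick check, here is a usage sketch that applies FFBS_R to the objects created above and plots a single draw against the data (set.seed is only used to make the draw reproducible):

set.seed(123)
mu.draw <- FFBS_R(y = y, ZN = ZN, RN = RN, QN = QN, m = m, p = p, N = N,
                  B0 = mu0, V0 = V0)
check.plot <- ts(cbind(mu.draw, y), end = c(2016, 1), frequency = 4)
colnames(check.plot) <- c("FFBS_R draw", "y")
ts.plot(check.plot, col = c("black", "darkred"), ylab = "Inflation", xlab = "Period", lwd = 2)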

Alternatively, use a package (for example, the FKF package):

?FKF::fkf