SGPE ECNM11049

Outline of today’s lab

The dynamic factor model (DFM)

Packages we will use in this lab

dfms: Efficient estimation of dynamic factor models (DFMs) using the expectation maximization (EM) algorithm or two-step (2S) estimation.
xts: Similar to the zoo package, it includes important functions for working with time series.
httr: To download data directly into R from webpages.

# Important packages described above
install.packages("dfms", repos = "https://cran.rstudio.com/", 
    dependencies = TRUE)
install.packages("xts", repos = "https://cran.rstudio.com/", 
    dependencies = TRUE)
install.packages("httr", repos = "https://cran.rstudio.com/", 
    dependencies = TRUE)

The next step is to make sure that you can access the routines in this package by making use of the library command, which would need to be run regardless of the machine that you are using.

# Important packages described above
library(dfms)
library(xts)
library(httr)
library(dplyr)

Part 0: Import data into R

In this computer tutorial we directly source the current data vintage from the FRED webpage. We use the GET() function from the httr package. To do so, we need the url of the data file, which we temporarily download/save and then write on an object named usmacro.

url_fred <- "https://www.stlouisfed.org/-/media/project/frbstl/stlouisfed/research/fred-md/monthly/current.csv?sc_lang=en&hash=80445D12401C59CF716410F3F7863B64" # Specify URL of data file. 
GET(url_fred, write_disk(tf <- tempfile(fileext = "usdat.csv")))  # Temporarily save the csv file

usmacro <- read.table(tf, sep=",",header=T) # 

# Select the following macroeconomic variables 
 var.slct <- c("INDPRO", "IPFPNSS", "IPFINAL", "IPCONGD", "IPDCONGD", "IPNCONGD", "IPBUSEQ","IPMAT", "IPDMAT", "IPNMAT", # 10 industrial production series
              "CPIAUCSL", "CPIAPPSL",  "CPITRNSL",  "CPIULFSL",  "PCEPI" # 5 price series
              )
var.plot <- var.slct  # Select the following macroeconomic variables for the plot


# var.slct <- c("RPI", "INDPRO", "PAYEMS", "S.P.500")
usmacro <- usmacro[,var.slct]
usmacro <- ts(usmacro[-1,], start = c(1959, 1), frequency = 12)
usmacro <- window(usmacro, start = c(1970, 1), end = c(2023, 12))

ts.plot(scale(usmacro[,var.plot]), lwd = 1, main = "US macro time series")

You will see the data is not stationary, so first take the log difference

usgrowth = usmacro %>%
  log() %>%   # take log
  diff() %>%  # take the difference
  scale()     # standardise data

ts.plot(usgrowth[,var.plot], lwd = 1, main = "Growth rates of US macro time series")
abline(h = 0)

Part I: Brief idea about the structure of the model

Before estimating a model, the ICr() function can be applied to determine the number of factors. It computes three information criteria proposed in Bai and NG (2002), whereby the second criteria generally suggests the most parsimonious model

ic = ICr(usgrowth)
print(ic)
plot(ic)

Another option is to use a scree plot (see previous lab) to gauge the number of factors by looking for a kink in the plot. A mathematical procedure for finding the kink was suggested by Onatski (2010), but this is currently not implemented in ICr().

screeplot(ic)

Based on the scree plot, I gauge that a model with 3 or 4 factors should be estimated. Next to the number of factors, the lag order of the factor-VAR of the transition equation should be estimated (the default is 1 lag). This can be done using the VARselect() function from the vars package, with PCA factor estimates reported by ICr()

vars::VARselect(ic$F_pca[, 1:4]) # Using 3 PCs to estimate the VAR lag order

## $selection
## AIC(n)  HQ(n)  SC(n) FPE(n) 
##      4      2      1      4 
## 
## $criteria
##                1         2         3         4         5         6         7
## AIC(n)  2.571221  2.473769  2.430330  2.405649  2.416996  2.429584  2.442200
## HQ(n)   2.625546  2.571555  2.571576  2.590355  2.645162  2.701210  2.757287
## SC(n)   2.711151  2.725643  2.794148  2.881411  3.004702  3.129234  3.253794
## FPE(n) 13.081803 11.867181 11.362891 11.086184 11.213204 11.355990 11.501212
##                8         9        10
## AIC(n)  2.459526  2.467761  2.414020
## HQ(n)   2.818072  2.869767  2.859486
## SC(n)   3.383064  3.503243  3.561445
## FPE(n) 11.703611 11.802181 11.186776

The selection thus suggests we should estimate a factor model with r = 4 factors and p = 4 lags. (In Gary’s slides, the number of factors is represented by m.)

Part II: DFM estimation

The DFM can be written as

\[\begin{aligned} y_t &= C f_t + \varepsilon_t,\\ f_t &= A_1 f_{t-1} + A_2 f_{t-2} + \cdots + A_p f_{t-p} + u_t, \quad \varepsilon_t \sim \mathcal{N}(0, W),\\ \varepsilon_t &= \rho_1 \varepsilon_{t-1} + \cdots + \rho_q \varepsilon_{t-q} +e_t, \quad e_t \sim \mathcal{N}(0, V). \end{aligned}\]

Estimation can be done using the DFM() function with parameters r and p

model1 = DFM(usgrowth, 
             r = 4,           # Number of factors 
             p = 4,           # Number of lags
             idio.ar1 = FALSE,# Model observation errors as AR(1) processes?
             max.iter = 500
             )

## Converged after 27 iterations.

summary(model1)

## Dynamic Factor Model: n = 15, T = 647, r = 4, p = 4, %NA = 0
## 
## Call:  DFM(X = usgrowth, r = 4, p = 4, idio.ar1 = FALSE, max.iter = 500)
## 
## Summary Statistics of Factors [F]
##       N  Mean   Median      SD       Min      Max
## f1  647     0   0.1589  2.7767   -41.792  18.2906
## f2  647     0  -0.2256  1.8624  -11.8058   6.7822
## f3  647     0   0.0017  1.0412   -5.7385   3.2917
## f4  647    -0   0.0218  0.9574   -5.6904   4.9577
## 
## Factor Transition Matrix [A]
##      L1.f1   L1.f2   L1.f3    L1.f4    L2.f1    L2.f2    L2.f3    L2.f4
## f1 0.31051 0.30261  0.1415 -0.28354 -0.12621 -0.24208 -0.01003  0.03743
## f2 0.08357 0.55209 -0.1091  0.13570  0.02156 -0.03037  0.13159  0.04575
## f3 0.09928 0.02507 -0.2109 -0.05436  0.03370 -0.06323 -0.04600 -0.06917
## f4 0.02839 0.01643  0.1152 -0.18811  0.00325 -0.05274 -0.02714 -0.22540
##        L3.f1    L3.f2    L3.f3     L3.f4     L4.f1      L4.f2    L4.f3    L4.f4
## f1 0.0617040  0.07189 -0.09753  0.252646 0.0316171 -0.1007523  0.10761  0.16607
## f2 0.0007484  0.05427 -0.05512  0.105254 0.0065051  0.1795951 -0.17631 -0.01718
## f3 0.0209634 -0.06858  0.03420 -0.004894 0.0360414  0.0130395  0.05148 -0.03445
## f4 0.0294371  0.03741  0.02433  0.006286 0.0002602 -0.0002518 -0.03791 -0.09391
## 
## Factor Covariance Matrix [cov(F)]
##           f1        f2        f3        f4
## f1   7.7101   -0.0039    0.0195   -0.0068 
## f2  -0.0039    3.4686    0.0121   -0.0345 
## f3   0.0195    0.0121    1.0842    0.0292 
## f4  -0.0068   -0.0345    0.0292    0.9166 
## 
## Factor Transition Error Covariance Matrix [Q]
##         u1      u2      u3      u4
## u1  6.6713 -0.3468 -0.2119 -0.1359
## u2 -0.3468  1.8341  0.0107 -0.0332
## u3 -0.2119  0.0107  0.9168 -0.0109
## u4 -0.1359 -0.0332 -0.0109  0.8298
## 
## Observation Matrix [C]
##              f1      f2      f3      f4
## INDPRO   0.3510 -0.0772  0.1371  0.0821
## IPFPNSS  0.3526 -0.0707 -0.1355 -0.0696
## IPFINAL  0.3449 -0.0762 -0.1836 -0.0824
## IPCONGD  0.3163 -0.0967 -0.3191  0.0639
## IPDCONGD 0.3026 -0.0721 -0.0304 -0.3473
## IPNCONGD 0.2089 -0.0722 -0.5666  0.5609
## IPBUSEQ  0.3174 -0.0493  0.0331 -0.2816
## IPMAT    0.3073 -0.0749  0.4174  0.2380
## IPDMAT   0.3093 -0.0588  0.2262 -0.1312
## IPNMAT   0.2060 -0.0308  0.3376  0.3093
## CPIAUCSL 0.0916  0.5046 -0.0366  0.0419
## CPIAPPSL 0.1141  0.1851  0.0642 -0.0105
## CPITRNSL 0.1158  0.3875  0.0214  0.0134
## CPIULFSL 0.0955  0.4936 -0.0367  0.0404
## PCEPI    0.0843  0.4775 -0.0635  0.0246
## 
## Observation Error Covariance Matrix [diag(R) - Restricted]
##   INDPRO  IPFPNSS  IPFINAL  IPCONGD IPDCONGD IPNCONGD  IPBUSEQ    IPMAT 
##   0.0002   0.0005   0.0206   0.0843   0.1614   0.0157   0.1385   0.0007 
##   IPDMAT   IPNMAT CPIAUCSL CPIAPPSL CPITRNSL CPIULFSL    PCEPI 
##   0.1768   0.4484   0.0337   0.7715   0.3638   0.0667   0.1354 
## 
## Observation Residual Covariance Matrix [cov(resid(DFM))]
##             INDPRO   IPFPNSS   IPFINAL   IPCONGD  IPDCONGD  IPNCONGD   IPBUSEQ
## INDPRO     0.0000   -0.0001*  -0.0002*  -0.0001    0.0000    0.0000   -0.0002 
## IPFPNSS   -0.0001*   0.0002   -0.0003*  -0.0009*  -0.0014*   0.0000    0.0000 
## IPFINAL   -0.0002*  -0.0003*   0.0202    0.0184*   0.0238*   0.0002    0.0166*
## IPCONGD   -0.0001   -0.0009*   0.0184*   0.0830    0.1044*  -0.0022*  -0.0609*
## IPDCONGD   0.0000   -0.0014*   0.0238*   0.1044*   0.1589   -0.0019*  -0.0695*
## IPNCONGD   0.0000    0.0000    0.0002   -0.0022*  -0.0019*   0.0008    0.0045*
## IPBUSEQ   -0.0002    0.0000    0.0166*  -0.0609*  -0.0695*   0.0045*   0.1366 
## IPMAT     -0.0001*   0.0002*   0.0007*   0.0005*   0.0004    0.0000    0.0004 
## IPDMAT    -0.0004*   0.0006*  -0.0093*   0.0103*   0.0074    0.0034*  -0.0319*
## IPNMAT     0.0000    0.0013*  -0.0274*  -0.0257*  -0.0325*  -0.0037*  -0.0092 
## CPIAUCSL   0.0000   -0.0001    0.0006   -0.0017   -0.0018    0.0001    0.0029 
## CPIAPPSL  -0.0005*   0.0006    0.0084    0.0135    0.0217    0.0005    0.0061 
## CPITRNSL   0.0002   -0.0002   -0.0107*   0.0210*   0.0291*   0.0007   -0.0467*
## CPIULFSL   0.0000   -0.0002    0.0006    0.0108*   0.0152*  -0.0001   -0.0107*
## PCEPI     -0.0001    0.0002    0.0010   -0.0026   -0.0066   -0.0004    0.0075 
##              IPMAT    IPDMAT    IPNMAT  CPIAUCSL  CPIAPPSL  CPITRNSL  CPIULFSL
## INDPRO    -0.0001*  -0.0004*   0.0000    0.0000   -0.0005*   0.0002    0.0000 
## IPFPNSS    0.0002*   0.0006*   0.0013*  -0.0001    0.0006   -0.0002   -0.0002 
## IPFINAL    0.0007*  -0.0093*  -0.0274*   0.0006    0.0084   -0.0107*   0.0006 
## IPCONGD    0.0005*   0.0103*  -0.0257*  -0.0017    0.0135    0.0210*   0.0108*
## IPDCONGD   0.0004    0.0074   -0.0325*  -0.0018    0.0217    0.0291*   0.0152*
## IPNCONGD   0.0000    0.0034*  -0.0037*   0.0001    0.0005    0.0007   -0.0001 
## IPBUSEQ    0.0004   -0.0319*  -0.0092    0.0029    0.0061   -0.0467*  -0.0107*
## IPMAT      0.0003    0.0007*  -0.0006   -0.0001    0.0012*  -0.0005   -0.0001 
## IPDMAT     0.0007*   0.1757   -0.0679*   0.0017   -0.0346*   0.0467*   0.0037 
## IPNMAT    -0.0006   -0.0679*   0.4483   -0.0046   -0.0080   -0.0049   -0.0023 
## CPIAUCSL  -0.0001    0.0017   -0.0046    0.0157   -0.0149*  -0.0214*  -0.0178*
## CPIAPPSL   0.0012*  -0.0346*  -0.0080   -0.0149*   0.7698   -0.1022*  -0.0046 
## CPITRNSL  -0.0005    0.0467*  -0.0049   -0.0214*  -0.1022*   0.3533    0.0439*
## CPIULFSL  -0.0001    0.0037   -0.0023   -0.0178*  -0.0046    0.0439*   0.0495 
## PCEPI      0.0003   -0.0151*   0.0193*  -0.0114*   0.0408*  -0.0577*  -0.0327*
##              PCEPI
## INDPRO    -0.0001 
## IPFPNSS    0.0002 
## IPFINAL    0.0010 
## IPCONGD   -0.0026 
## IPDCONGD  -0.0066 
## IPNCONGD  -0.0004 
## IPBUSEQ    0.0075 
## IPMAT      0.0003 
## IPDMAT    -0.0151*
## IPNMAT     0.0193*
## CPIAUCSL  -0.0114*
## CPIAPPSL   0.0408*
## CPITRNSL  -0.0577*
## CPIULFSL  -0.0327*
## PCEPI      0.1196 
## 
## Residual AR(1) Serial Correlation
##    INDPRO   IPFPNSS   IPFINAL   IPCONGD  IPDCONGD  IPNCONGD   IPBUSEQ     IPMAT 
## -0.004101  0.012939 -0.009475  0.113489  0.194212  0.103491  0.153206  0.011628 
##    IPDMAT    IPNMAT  CPIAUCSL  CPIAPPSL  CPITRNSL  CPIULFSL     PCEPI 
##  0.112309 -0.002949 -0.073317  0.090848  0.440820  0.102576  0.296435 
## 
## Summary of Residual AR(1) Serial Correlations
##    N    Mean  Median      SD      Min     Max
##   15  0.1028  0.1026  0.1325  -0.0733  0.4408
## 
## Goodness of Fit: R-Squared
##   INDPRO  IPFPNSS  IPFINAL  IPCONGD IPDCONGD IPNCONGD  IPBUSEQ    IPMAT 
##   1.0000   0.9998   0.9798   0.9170   0.8411   0.9992   0.8634   0.9997 
##   IPDMAT   IPNMAT CPIAUCSL CPIAPPSL CPITRNSL CPIULFSL    PCEPI 
##   0.8243   0.5517   0.9843   0.2302   0.6467   0.9505   0.8804 
## 
## Summary of Individual R-Squared's
##    N    Mean  Median      SD     Min  Max
##   15  0.8445   0.917  0.2163  0.2302    1

plot(model1)

It also gives you the log likelihood until convergence

ll <- model1$loglik
cat("LL", ll[length(ll)])

## LL -4616.225

If I provide the output table, could you write down the equation and the estimated parameters?

Can you experiment with different values for p, q, and r, and compare their information criteria?

SGPE ECNM11049 — ATSE

Dynamic factor models

Ping Wu (ping.wu@strath.ac.uk)

Outline of today’s lab

Packages we will use in this lab

Part 0: Import data into R

Part I: Brief idea about the structure of the model

Part II: DFM estimation