University of Toronto, Scarborough
STAD57 Time Series Analysis
Assignment 3 (total points: 100)
Due date: Dec 1, 2024 (by 11:59pm). This version: Nov 8, 2024

Instructions: Submit a single pdf file containing all work, code, figures, etc. R is the main language used in the course, but you are also allowed to use Python. Unless otherwise specified, all plots must be produced by R/Python (i.e., no sketching by hand) and suitably annotated (e.g., axes and labels).

1. (10 points) Problem 5.3 of [SS]. Do only the DF and ADF tests.

We first plot the data in Figure 1. (Note that the data set in the package has been updated, so it differs from the one shown in the book.) We observe an apparent increasing trend after about 1950.

Figure 1: Time series of the global temperature series.

We first run the Dickey-Fuller test, implemented by fUnitRoots::adfTest() with type = "nc" (no constant and no linear trend). The p-value is 0.31, so (at any reasonable significance level) we fail to reject the null hypothesis that there is a unit root.

Then we run the augmented Dickey-Fuller test. Here we use tseries::adf.test(), which includes a constant and a linear trend and chooses the order p (for the AR part) automatically. The function chooses p = 5 and the p-value is 0.96. Again, we fail to reject the null hypothesis that there is a unit root.
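For concreteness, the no-constant DF statistic can be coded directly: regress the first difference on the lagged level (no intercept) and form the t-ratio, which is then compared to Dickey-Fuller (not Student-t) critical values. The following is a minimal Python sketch (the course permits Python); the random-walk data is simulated for illustration, not the temperature series.

```python
import math, random

def df_stat(x):
    """Dickey-Fuller t-statistic for the no-constant regression
    diff(x)_t = phi * x_{t-1} + e_t; H0 (unit root) is phi = 0."""
    y = [x[t] - x[t - 1] for t in range(1, len(x))]   # first differences
    z = x[:-1]                                         # lagged levels
    szz = sum(v * v for v in z)
    phi = sum(a * b for a, b in zip(z, y)) / szz       # OLS slope, no intercept
    resid = [a - phi * b for a, b in zip(y, z)]
    n = len(y)
    s2 = sum(r * r for r in resid) / (n - 1)           # residual variance
    return phi / math.sqrt(s2 / szz)                   # t-ratio; compare to DF critical values

random.seed(1)
rw = [0.0]
for _ in range(500):                                    # random walk: unit root holds
    rw.append(rw[-1] + random.gauss(0, 1))
print(df_stat(rw))
```

For a random walk the statistic is typically above the 5% DF critical value (about -1.95), so we fail to reject the unit root; for a stationary AR(1) it is strongly negative.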
From both tests, it appears that the process is non-stationary and should be differenced.

Remark: The sample PACF of the first difference is significant at the first 5 lags. This is consistent with the lag p = 5 chosen in tseries::adf.test().

2. (10 points) Problem 5.6 of [SS].

We first plot in Figure 2 the raw price series and the log returns. We observe some volatility clustering (e.g., in 2009 and a few times before 2004).

Figure 2: Left: Oil price series. Right: Log returns.

Then we consider the ACF of the log return and of the squared log return in Figure 3. The returns themselves are only weakly serially correlated (a few marginally significant values). However, for the squared returns the ACF is positive and significant for many lags. This is clear evidence of a GARCH effect. (Note: the data is weekly, so lag = 0.5 corresponds to 26 weeks.)

Using fGarch::garchFit(), we first fit a GARCH(1,1) model whose conditional distribution is Student's t-distribution. We report the coefficients below:

           Estimate    Std. Error   t value   Pr(>|t|)
mu         4.388e-03   1.785e-03    2.458     0.0140 *
omega      1.183e-04   5.585e-05    2.118     0.0341 *
alpha1     7.218e-02   2.312e-02    3.121     0.0018 **
beta1      8.681e-01   4.024e-02    21.571    <2e-16 ***
shape      9.390e+00   2.976e+00    3.156     0.0016 **

In particular, the shape parameter (of the t-distribution) is 9.39. In Figure 4 we plot (i) the ACF of the standardized residuals, (ii) the ACF of the squared standardized residuals, and (iii) the QQ-plot (based on the estimated t-distribution). We observe that the ACF of the residuals at lag 1 is significant, but the ACF of the squared residuals looks good. Also, the tails of the QQ-plot do not lie on the line (this is not easy to fix completely even with non-normal conditional distributions).
Figure 3: Left: Sample ACF of log return. Right: Sample ACF of squared log return.

Figure 4: Diagnostic plots.

To improve this model we instead fit an ARMA(0,1)-GARCH model (this is better than ARMA(1,0)-GARCH in terms of BIC). Now the ACF of the standardized residuals is within the bounds (except at lag 22), while the other two plots look similar. (To save space we do not show the plots.)

3. (20 points) It is noted in Section 5.3 of [SS] that the likelihood of the ARCH(1) model
tends to be flat unless n is large. To investigate this empirically, we consider the data set r_daily obtained as follows:

    library(quantmod)
    start_date <- "2018-01-01"
    end_date <- "2019-12-31"
    price <- getSymbols(Symbols = "^FTSE", src = "yahoo",
                        return.class = "xts",
                        from = start_date, to = end_date,
                        periodicity = "daily", auto.assign = FALSE)
    r_daily <- dailyReturn(price, type = "log")

This is the daily log returns of the FTSE index in 2018-2019.

Fit an ARCH(1) model by implementing directly the conditional log-likelihood l(a0, a1) in (5.46). That is, code the log-likelihood directly and use an optimization function like optim() to find the MLE. Compare your estimates with the ones computed from a package (say fGarch). Produce a surface plot showing l(a0, a1) as a function of a0, a1 near the MLE, and comment on the flatness (if any) of the optimization landscape.

We implement the negative log-likelihood for the given data (but without the factor 1/2) and minimize it using optim(). Using the default options, the estimates are a0 ~ 4.6e-5 and a1 ~ 0.24. These values are very close to the output of fGarch::garchFit().

Figure 5: Optimization landscape (contour curves). In the left plot, a0 is on a log scale. The location of the maximum likelihood estimate is indicated by the black dot.

We plot in Figure 5 the landscape of the objective function. To do so, we create a grid for a0 and a1. Since the estimate of a0 is very close to zero, the grid for a0 is equally spaced in log(a0) (see the attached code for details). The optimal value of the objective function is about -4420. From the right plot, the landscape is indeed rather flat in the a1 direction (that is, moving vertically does not change the objective significantly).
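The direct likelihood coding that the problem asks for can be sketched in Python (permitted by the course). This is a minimal stdlib sketch: the crude grid-refinement minimizer stands in for optim(), and a simulated ARCH(1) series with illustrative parameters replaces r_daily.

```python
import math, random

def arch1_negloglik(a0, a1, r):
    """Conditional negative log-likelihood of ARCH(1), up to constants and
    without the factor 1/2, following the form of (5.46):
    sum over t >= 2 of log(s2_t) + r_t^2 / s2_t, s2_t = a0 + a1 * r_{t-1}^2."""
    total, prev2 = 0.0, r[0] * r[0]
    for t in range(1, len(r)):
        s2 = a0 + a1 * prev2
        rt = r[t]
        total += math.log(s2) + rt * rt / s2
        prev2 = rt * rt
    return total

def fit_arch1(r, grid=20, rounds=4):
    """Crude grid-refinement minimizer (a stand-in for optim());
    the a0 grid is equally spaced in log(a0), as in the write-up."""
    lo0, hi0, lo1, hi1 = 1e-7, 1e-2, 0.0, 0.9
    best = None
    for _ in range(rounds):
        pts = [(math.exp(math.log(lo0) + i * (math.log(hi0) - math.log(lo0)) / grid),
                lo1 + j * (hi1 - lo1) / grid)
               for i in range(grid + 1) for j in range(grid + 1)]
        best = min(pts, key=lambda p: arch1_negloglik(p[0], p[1], r))
        lo0, hi0 = best[0] / 2, best[0] * 2          # shrink box around current best
        lo1, hi1 = max(0.0, best[1] - 0.1), min(0.99, best[1] + 0.1)
    return best

# simulate ARCH(1) data with known parameters and recover them
random.seed(3)
a0_true, a1_true, r = 5e-5, 0.3, [0.0]
for _ in range(3000):
    r.append(math.sqrt(a0_true + a1_true * r[-1] ** 2) * random.gauss(0, 1))
a0_hat, a1_hat = fit_arch1(r)
print(a0_hat, a1_hat)
```

The recovered estimates land close to the true (a0, a1); evaluating the objective along a line in the a1 direction near the minimum shows how little it changes, which is the flatness discussed above.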
4. (20 points) Consider the FTSE data set as in the previous problem. Use the first two thirds of the data as training data; the rest is used for testing (round up if needed).

(a) (8 points) Using the training data, fit (i) an ARMA-GARCH model, and (ii) a plain ARMA model, both with Gaussian conditional distribution (i.e., the innovations form a Gaussian white noise). Also examine the residuals and comment on the fit of each model.

(b) (12 points) For each time point t in the testing period, use the data up to time t-1 to fit both the ARMA-GARCH and ARMA models, with the orders identified in (a). For each model, compute the 95% prediction interval for the return r_t at time t, then check whether the realized return lies in the interval. For each model, draw a time series plot which looks like the following one:

[Example plot omitted.]

Here, the realized return is shown in black, the prediction intervals are shown by vertical line segments, and the red segments correspond to the times when the realized return lies outside the prediction interval. If the model is adequate, we expect the interval to contain the realized return about 95% of the time. Comment on the results you get.

(a) We first fit a GARCH(1,1) model (this is essentially the default choice in GARCH modelling). The ACFs of the standardized residuals and their squares (shown in Figure 6) indicate a good fit. Also, the normal QQ-plot (not shown) lies very close to a straight line. Then we consider fitting an ARMA(p,q)-GARCH(1,1) model. In fact, the autocorrelations of the returns are very small: from a BIC analysis we would pick p = q = 0, i.e., from the ARMA perspective we simply model the return as white noise.

(b) We show the intervals in Figure 7. The width of the interval adjusts according to the one-step volatility forecast. The empirical coverage probability is 94.6%, which is very close to 95%.

5. (10 points) Problem 5.12 of [SS]. Restrict to VAR models.

We use vars::VARselect() to do model selection.
Since a linear trend has been removed, we use type = "none". Using BIC, we choose p = 2. (You can still fit the model with constants and trends, but those coefficients are insignificant.) The (cross) ACFs of the residuals are shown in Figure 8. Most values are not significant. If we use 15 lags, the multivariate Portmanteau test gives a p-value of 0.155. Nevertheless, this result depends on the lag used: with 5 lags, the p-value becomes 0.0079.
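The univariate Ljung-Box statistic underlying such portmanteau tests can be coded in a few lines; the multivariate test applied to the VAR residuals generalizes the same idea. A stdlib-Python sketch, with simulated white-noise data for illustration:

```python
import random

def ljung_box_Q(x, max_lag):
    """Ljung-Box portmanteau statistic Q = n(n+2) * sum_h rho_h^2 / (n-h);
    under H0 (white noise) Q is approximately chi-square with max_lag df."""
    n = len(x)
    m = sum(x) / n
    c0 = sum((v - m) ** 2 for v in x) / n
    q = 0.0
    for h in range(1, max_lag + 1):
        rho = sum((x[t] - m) * (x[t + h] - m) for t in range(n - h)) / n / c0
        q += rho * rho / (n - h)
    return n * (n + 2) * q

random.seed(5)
wn = [random.gauss(0, 1) for _ in range(1000)]
print(ljung_box_Q(wn, 15))   # compare to the chi-square(15) 5% critical value, about 25.0
```

For white noise Q stays near its expected value of 15 (the degrees of freedom); a serially correlated series gives a much larger Q, and the sensitivity of the p-value to max_lag mirrors the 15-lag vs 5-lag contrast noted above.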
Figure 6: Diagnostic plots for GARCH(1,1) fit.

Figure 7: 95% prediction intervals.

Remark: Whether we accept this model is a decision made by the modeller. In this case, we may still accept it as a reasonable model. (The more problematic choice is the linear trend assumed in the problem statement.)

6. (15 points) Problem 6.1 of [SS].

(a) Let the state be X_t = (x_t, x_{t-1})', which is two-dimensional. Then

    X_t = ( x_t     ) = ( 0  -0.9 ) ( x_{t-1} ) + ( w_t ) =: Phi X_{t-1} + W_t,
          ( x_{t-1} )   ( 1   0   ) ( x_{t-2} )   (  0  )

which follows a VAR(1) process. Then

    y_t = (1, 0) (x_t, x_{t-1})' + v_t,
Figure 8: (Cross) ACFs of the residuals from the VAR(2) fit (panels for unemp, gnp, cnsm and their cross-ACFs).

so A_t = (1, 0).

(b) Writing sigma_1^2 = Var(x_{-1}) and sigma_0^2 = Var(x_0), we have

    y_1 = x_1 + v_1 = -0.9 x_{-1} + w_1 + v_1 ~ N(0, (-0.9)^2 sigma_1^2 + sigma_w^2 + sigma_v^2)

and

    y_2 = x_2 + v_2 = -0.9 x_0 + w_2 + v_2 ~ N(0, (-0.9)^2 sigma_0^2 + sigma_w^2 + sigma_v^2).

First, we must have sigma_0 = sigma_1 for y_1 and y_2 to have the same variance.

Next, observe that by construction (x_1, x_3, x_5, ...) is independent of (x_2, x_4, x_6, ...). Consequently, (y_1, y_3, y_5, ...) is independent of (y_2, y_4, y_6, ...).

Also consider

    y_3 = x_3 + v_3 = -0.9 x_1 + w_3 + v_3 = -0.9(-0.9 x_{-1} + w_1) + w_3 + v_3
        = 0.81 x_{-1} - 0.9 w_1 + w_3 + v_3,
which has variance

    0.81^2 sigma_1^2 + 0.9^2 sigma_w^2 + sigma_w^2 + sigma_v^2.

But this must be equal to Var(y_1) = (-0.9)^2 sigma_1^2 + sigma_w^2 + sigma_v^2. Solving the equation gives

    0.9^4 sigma_1^2 + 0.9^2 sigma_w^2 = 0.9^2 sigma_1^2  =>  sigma_1^2 = sigma_w^2 / (1 - 0.9^2).

Thus we also have sigma_0^2 = sigma_w^2 / (1 - 0.9^2).

Note: This is also the value which makes (x_t) stationary.

(c) In Figure 9 we plot a simulated trajectory of (x_t, y_t) of length 100. Clearly x_t and y_t are strongly correlated.

Figure 9: Trajectory of (x_t, y_t) with length 100.

Since n = 100 is a small sample size (so the sampling error is large), in the following we compute the sample ACF and PACF with n = 10000 instead. We show the result in Figure 10. We observe that the ACFs look similar. However, the PACF of (x_t) is non-zero only at lag 2 (note that (x_t) is an AR(2) process with only the lag-2 coefficient non-zero). On the other hand, the PACF of (y_t) decays.

(d) With sigma_v = 10, we plot the results in Figures 11 and 12. Since the observation noise is larger, the visual relationship is much weaker. Also, the ACF and PACF of (y_t) are much smaller in magnitude. (Since (y_t) is dominated by (v_t), it behaves more like white noise.)

7. (15 points) Problem 6.3 of [SS].

We note that

    y_t^{t-1} = E[x_t + v_t | y_{1:(t-1)}] = x_t^{t-1},

and its mean squared error is P_t^{t-1} + sigma_v^2.
Figure 10: Sample ACF and PACF of x and y.

Figure 11: Trajectory of (x_t, y_t) with length 100.

We show the required plot in Figure 13. The dots show the observations y_t. The blue solid curve gives the one-step forecast, and the blue dashed curves show the 95% bounds, i.e., y_t^{t-1} +/- 2 sqrt(P_t^{t-1} + sigma_v^2).
Figure 12: Sample ACF and PACF of x and y.

Figure 13: Plot for Problem 7.
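The one-step forecasts and 95% bounds in Problem 7 come from the Kalman filter. Below is a minimal scalar sketch in Python (permitted by the course) for the generic model x_t = phi * x_{t-1} + w_t, y_t = x_t + v_t; the parameter values are illustrative assumptions, not necessarily those of Problem 6.3.

```python
import math, random

def kalman_forecast(y, phi, sw2, sv2):
    """Scalar Kalman filter for x_t = phi*x_{t-1} + w_t, y_t = x_t + v_t.
    Returns one-step forecasts y_t^{t-1} = x_t^{t-1} and their MSEs
    P_t^{t-1} + sv2, used for the bounds y_t^{t-1} +/- 2*sqrt(MSE)."""
    xf, P = 0.0, sw2 / (1 - phi ** 2)    # stationary initialization
    preds, mses = [], []
    for obs in y:
        xp = phi * xf                     # predict: x_t^{t-1}
        Pp = phi ** 2 * P + sw2           # predict: P_t^{t-1}
        preds.append(xp)
        mses.append(Pp + sv2)             # forecast MSE of y_t
        K = Pp / (Pp + sv2)               # Kalman gain
        xf = xp + K * (obs - xp)          # update: x_t^t
        P = (1 - K) * Pp                  # update: P_t^t
    return preds, mses

# simulate and check the coverage of the +/- 2 sd one-step intervals
random.seed(6)
phi, sw2, sv2 = 0.8, 1.0, 1.0
x, y = [0.0], []
for _ in range(2000):
    x.append(phi * x[-1] + random.gauss(0, 1))
    y.append(x[-1] + random.gauss(0, 1))
preds, mses = kalman_forecast(y, phi, sw2, sv2)
cover = sum(abs(obs - p) <= 2 * math.sqrt(m)
            for obs, p, m in zip(y, preds, mses)) / len(y)
print(cover)
```

The empirical coverage of the intervals comes out close to 95%, matching the behaviour of the bounds drawn in Figure 13.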