#### Overview of this book

Introduction to R for Quantitative Finance will show you how to solve real-world quantitative fi nance problems using the statistical computing language R. The book covers diverse topics ranging from time series analysis to fi nancial networks. Each chapter briefl y presents the theory behind specific concepts and deals with solving a diverse range of problems using R with the help of practical examples.This book will be your guide on how to use and master R in order to solve quantitative finance problems. This book covers the essentials of quantitative finance, taking you through a number of clear and practical examples in R that will not only help you to understand the theory, but how to effectively deal with your own real-life problems.Starting with time series analysis, you will also learn how to optimize portfolios and how asset pricing models work. The book then covers fixed income securities and derivatives such as credit risk management.
Introduction to R for Quantitative Finance
Credits
www.PacktPub.com
Preface
Free Chapter
Time Series Analysis
Portfolio Optimization
Asset Pricing Models
Fixed Income Securities
Estimating the Term Structure of Interest Rates
Derivatives Pricing
Credit Risk Management
Extreme Value Theory
References
Index

## Cointegration

The idea behind cointegration, a concept introduced by Granger (1981) and formalized by Engle and Granger (1987), is to find a linear combination between non-stationary time series that result in a stationary time series. It is hence possible to detect stable long-run relationships between non-stationary time series (for example, prices).

### Cross hedging jet fuel

Airlines are natural buyers of jet fuel. Since the price of jet fuel can be very volatile, most airlines hedge at least part of their exposure to jet fuel price changes. In the absence of liquid jet fuel OTC instruments, airlines use related exchange traded futures contracts (for example, heating oil) for hedging purposes. In the following section, we derive the optimal hedge ratio using first the classical approach of taking into account only the short-term fluctuations between the two prices; afterwards, we improve on the classical hedge ratio by taking into account the long-run stable relationship between the prices as well.

We first load the necessary libraries. The `urca` library has some useful methods for unit root tests and for estimating cointegration relationships.

```> library("zoo")
> install.packages("urca")
> library("urca")
```

We import the monthly price data for jet fuel and heating oil (in USD per gallon).

```> prices <- read.zoo("JetFuelHedging.csv", sep = ",",+   FUN = as.yearmon, format = "%Y-%m", header = TRUE)
```

Taking into account only the short-term behavior (monthly price changes) of the two commodities, one can derive the minimum variance hedge by fitting a linear model that explains changes in jet fuel prices by changes in heating oil prices. The beta coefficient of that regression is the optimal hedge ratio.

```> simple_mod <- lm(diff(prices\$JetFuel) ~ diff(prices\$HeatingOil)+0)
```

The function `lm` (for linear model) estimates the coefficients for a best fit of changes in jet fuel prices versus changes in heating oil prices. The `+0` term means that we set the intercept to zero; that is, no cash holdings.

```> summary(simple_mod)
Call:
lm(formula = diff(prices\$JetFuel) ~ diff(prices\$HeatingOil) +
0)
Residuals:
Min       1Q   Median       3Q      Max
-0.52503 -0.02968  0.00131  0.03237  0.39602

Coefficients:
Estimate Std. Error t value Pr(>|t|)
diff(prices\$HeatingOil)  0.89059    0.03983   22.36   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.0846 on 189 degrees of freedom
Multiple R-squared:  0.7257,   Adjusted R-squared:  0.7242
F-statistic: 499.9 on 1 and 189 DF,  p-value: < 2.2e-16
```

We obtain a hedge ratio of 0.89059 and a residual standard error of 0.0846. The cross hedge is not perfect; the resulting hedged portfolio is still risky.

We now try to improve on this hedge ratio by using an existing long-run relationship between the levels of jet fuel and heating oil futures prices. You can already guess the existence of such a relationship by plotting the two price series (heating oil prices will be in red) using the following command:

```> plot(prices\$JetFuel, main = "Jet Fuel and Heating Oil Prices",+   xlab = "Date", ylab = "USD")
> lines(prices\$HeatingOil, col = "red")
```

We use Engle and Granger's two-step estimation technique. Firstly, both time series are tested for a unit root (non-stationarity) using the augmented Dickey-Fuller test.

```> jf_adf <- ur.df(prices\$JetFuel, type = "drift")
###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
###############################################

Test regression drift

Call:
lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)

Residuals:
Min       1Q   Median       3Q      Max
-1.06212 -0.05015  0.00566  0.07922  0.38086

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.03050    0.02177   1.401  0.16283
z.lag.1     -0.01441    0.01271  -1.134  0.25845
z.diff.lag   0.19471    0.07250   2.686  0.00789 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.159 on 186 degrees of freedom
Multiple R-squared:  0.04099,   Adjusted R-squared:  0.03067
F-statistic: 3.975 on 2 and 186 DF,  p-value: 0.0204

Value of test-statistic is: -1.1335 0.9865

Critical values for test statistics:
1pct  5pct 10pct
tau2 -3.46 -2.88 -2.57
phi1  6.52  4.63  3.81
```

The null hypothesis of non-stationarity (jet fuel time series contains a unit root) cannot be rejected at the 1% significance level since the test statistic of -1.1335 is not more negative than the critical value of -3.46. The same holds true for heating oil prices (the test statistic is -1.041).

```> ho_adf <- ur.df(prices\$HeatingOil, type = "drift")
```

We can now proceed to estimate the static equilibrium model and test the residuals for a stationary time series using an augmented Dickey-Fuller test. Please note that different critical values [for example, from Engle and Yoo (1987)] must now be used since the series under investigation is an estimated one.

```> mod_static <- summary(lm(prices\$JetFuel ~ prices\$HeatingOil))
> error <- residuals(mod_static)
> error_cadf <- ur.df(error, type = "none")
```

The test statistic obtained is -8.912 and the critical value for a sample size of 200 at the 1% level is -4.00; hence we reject the null hypothesis of non-stationarity. We have thus discovered two cointegrated variables and can proceed with the second step; that is, the specification of an Error-Correction Model (ECM). The ECM represents a dynamic model of how (and how fast) the system moves back to the static equilibrium estimated earlier and is stored in the `mod_static` variable.

```> djf <- diff(prices\$JetFuel)
> dho <- diff(prices\$HeatingOil)
> error_lag <- lag(error, k = -1)
> mod_ecm <- lm(djf ~ dho + error_lag)
> summary(mod_ecm)

Call:
lm(formula = djf ~ dho + error_lag + 0)

Residuals:
Min       1Q   Median       3Q      Max
-0.19158 -0.03246  0.00047  0.02288  0.45117

Coefficients:
Estimate Std. Error t value Pr(>|t|)
dho        0.90020
0.03238  27.798   <2e-16 ***
error_lag -0.65540    0.06614  -9.909   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.06875 on 188 degrees of freedom
Multiple R-squared:  0.8198,   Adjusted R-squared:  0.8179
F-statistic: 427.6 on 2 and 188 DF,  p-value: < 2.2e-16
```

By taking into account the existence of a long-run relationship between jet fuel and heating oil prices (cointegration), the hedge ratio is now slightly higher (0.90020) and the residual standard error significantly lower (0.06875). The coefficient of the error term is negative (-0.65540): large deviations between the two prices are going to be corrected and prices move closer to their long-run stable relationship.