Python for Finance - Second Edition

By: Yuxing Yan

Overview of this book

This book uses Python as its computational tool. Since Python is free, any school or organization can download and use it. The book is organized by finance subject: the first edition focused more on Python, while the second edition applies Python to finance. It starts with topics related exclusively to Python, then turns to core finance concepts such as the time value of money, stock and bond valuation, the capital asset pricing model, multi-factor models, time-series analysis, portfolio theory, and options and futures. The book will help us learn or review the basics of quantitative finance and apply Python to solve various problems, such as estimating IBM's market risk; running a Fama-French 3-factor, Fama-French-Carhart 4-factor, or Fama-French 5-factor model; estimating the VaR of a 5-stock portfolio; estimating the optimal portfolio; and constructing the efficient frontier for a 20-stock portfolio with real-world stock data and with Monte Carlo simulation. Later, we will also learn how to replicate the famous Black-Scholes-Merton option model and how to price exotic options such as the average price call option.

Data manipulation

There are many different types of data, such as integer, real number, or string. The following table offers a list of those data types:

Data types	Description
`bool`	Boolean (`True` or `False`) stored as a byte
`int`	Platform integer (normally either `int32` or `int64`)
`int8`	Byte (`-128` to `127`)
`int16`	Integer (`-32768` to `32767`)
`int32`	Integer (`-2147483648` to `2147483647`)
`int64`	Integer (`-9223372036854775808` to `9223372036854775807`)
`uint8`	Unsigned integer (`0` to `255`)
`uint16`	Unsigned integer (`0` to `65535`)
`uint32`	Unsigned integer (`0` to `4294967295`)
`uint64`	Unsigned integer (`0` to `18446744073709551615`)
`float`	Shorthand for `float64`
`float32`	Single precision float: sign bit, 8 bits exponent, 23 bits mantissa
`float64`	Double precision float: sign bit, 11 bits exponent, 52 bits mantissa
`complex`	Shorthand for `complex128`
`complex64`	Complex number; represented by two 32-bit floats (real and imaginary components)
`complex128`	Complex number; represented by two 64-bit floats (real and imaginary components)

Table 1.1 List of different data types
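
The ranges in Table 1.1 can be checked against NumPy itself. The following is a minimal sketch, assuming only that NumPy is installed; the names int_ranges and float_layout are introduced here for illustration and do not come from the book:

```python
import numpy as np

# Integer types listed in Table 1.1
int_types = (np.int8, np.int16, np.int32, np.int64,
             np.uint8, np.uint16, np.uint32, np.uint64)

# np.iinfo reports the smallest and largest value each integer type can hold
int_ranges = {np.dtype(t).name: (int(np.iinfo(t).min), int(np.iinfo(t).max))
              for t in int_types}
for name, (low, high) in int_ranges.items():
    print(f"{name:>6}: {low} to {high}")

# np.finfo reports the floating-point layout:
# nmant = number of mantissa bits, nexp = number of exponent bits
float_layout = {np.dtype(t).name: (np.finfo(t).nmant, np.finfo(t).nexp)
                for t in (np.float32, np.float64)}
for name, (nmant, nexp) in float_layout.items():
    print(f"{name}: {nmant} mantissa bits, {nexp} exponent bits")
```

Choosing a smaller type such as int8 saves memory, but values outside its range cannot be stored, so the limits above are worth checking before converting data.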

In the following examples, we assign a value to r, which is a scalar, and several values to pv, which is an array (vector). The type() function is used to show their types:

>>> import numpy as np
>>> r=0.023
>>> pv=np.array([100,300,500])
>>> type(r)
<class 'float'>
>>> type(pv)
<class 'numpy.ndarray'>

To choose the appropriate precision, we use the round() function; see the following example:

>>> 7/3
2.3333333333333335
>>> round(7/3,5)
2.33333
>>>

For data manipulation, let's look at some simple operations:

>>> import numpy as np
>>> a=np.zeros(10)                      # array with 10 zeros
>>> b=np.zeros((3,2),dtype=float)       # 3 by 2 array of zeros
>>> c=np.ones((4,3),float)              # 4 by 3 array of ones
>>> d=np.array(range(10),float)         # 0, 1, 2, ..., 9
>>> e1=np.identity(4)                   # 4 by 4 identity matrix
>>> e2=np.eye(4)                        # same as above
>>> e3=np.eye(4,k=1)                    # ones on the k-th diagonal (k=1: just above the main diagonal)
>>> f=np.arange(1,20,3,float)           # from 1 to 19 in steps of 3
>>> g=np.array([[2,2,2],[3,3,3]])       # 2 by 3
>>> h=np.zeros_like(g)                  # zeros, same shape and type as g
>>> i=np.ones_like(g)                   # ones, same shape and type as g
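
To see what a few of these constructors actually produce, the results can be printed together with their shapes and types. This is a small sketch that reuses the variable names from the session above:

```python
import numpy as np

e3 = np.eye(4, k=1)                    # ones on the diagonal just above the main one
f = np.arange(1, 20, 3, float)         # start at 1, stop before 20, step 3
g = np.array([[2, 2, 2], [3, 3, 3]])   # 2 by 3 integer array
h = np.zeros_like(g)                   # zeros with the shape and dtype of g

print(e3)
print(f)
print(g.shape, g.dtype)
print(h.shape, h.dtype)
```

Note that zeros_like() and ones_like() inherit the data type of their argument, so h holds integers here, not floats.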

Some so-called dot functions (methods called with a dot after the variable name) are quite handy and useful:

>>> import numpy as np
>>> x=np.array([10,20,30])
>>> print(x.sum())
60

Anything after the number sign, #, is a comment. Arrays are another important data type; they can be flattened and reshaped:

>>> import numpy as np
>>> x=np.array([[1,2],[5,6],[7,9]])      # a 3 by 2 array
>>> y=x.flatten()                        # a 1-dimensional array with 6 elements
>>> x2=np.reshape(y,[2,3])               # a 2 by 3 array
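
The following sketch prints the intermediate results of the same three statements. It illustrates that flatten() and reshape() work row by row, so reshaping a 3 by 2 array to 2 by 3 is not the same as transposing it:

```python
import numpy as np

x = np.array([[1, 2], [5, 6], [7, 9]])   # a 3 by 2 array
y = x.flatten()                          # row by row: 1, 2, 5, 6, 7, 9
x2 = np.reshape(y, [2, 3])               # refilled row by row into 2 rows of 3

print(y)
print(x2)
print(x.T)                               # the transpose, for comparison
```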

We could assign a string to a variable:

>>> t="This is great"
>>> t.upper()
'THIS IS GREAT'
>>>

To find out all string-related functions, we use dir(''); see the following code:

>>> dir('')
['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
>>>

For example, from the preceding list we see a function called split. After typing help(''.split), we will have related help information:

>>> help(''.split)
Help on built-in function split:

split(...) method of builtins.str instance
    S.split(sep=None, maxsplit=-1) -> list of strings

    Return a list of the words in S, using sep as the
    delimiter string. If maxsplit is given, at most maxsplit
    splits are done. If sep is not specified or is None, any
    whitespace string is a separator and empty strings are
    removed from the result.
>>>

We could try the following example:

>>> x="this is great"
>>> x.split()
['this', 'is', 'great']
>>>
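
The sep and maxsplit arguments described in the help text change how a string is cut. The sketch below shows the three cases; the data line is the first row of the Fama-French series used later in this section, and the variable names are chosen here for illustration:

```python
# Default: split on any run of whitespace, empty strings are dropped
line = "192607    2.96   -2.30   -2.87    0.22"
fields = line.split()
date = int(fields[0])                       # first field is the YYYYMM date
values = [float(v) for v in fields[1:]]     # remaining fields are numbers

# Explicit separator: empty strings between separators are kept
parts = "a,b,,c".split(",")

# maxsplit limits the number of cuts
first_rest = "this is great".split(" ", 1)

print(fields)       # ['192607', '2.96', '-2.30', '-2.87', '0.22']
print(parts)        # ['a', 'b', '', 'c']
print(first_rest)   # ['this', 'is great']
```

The default behavior is usually the right choice for fixed-width text files, where columns are separated by a varying number of spaces.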

Matrix manipulation is important when we deal with various matrices. The sum of two matrices is defined as follows:

$$C = A + B \qquad (3)$$

The condition for equation (3) is that matrices A and B should have the same dimensions. For the product of two matrices, we have the following equation:

$$C = A \times B$$

Here, A is an n by k matrix (n rows and k columns), while B is a k by m matrix, so C is an n by m matrix. Remember that the second dimension of the first matrix should be the same as the first dimension of the second matrix. In this case, it is k. If we assume that the individual data items in C, A, and B are C_{i,j} (the ith row and the jth column), A_{i,j}, and B_{i,j}, we have the following relationship between them:

$$C_{i,j} = \sum_{t=1}^{k} A_{i,t} \times B_{t,j}$$

The dot() function from the NumPy module can be used to carry out the preceding matrix multiplication:

>>> a=np.array([[1,2,3],[4,5,6]],float)    # 2 by 3
>>> b=np.array([[1,2],[3,3],[4,5]],float)  # 3 by 2
>>> np.dot(a,b)                            # 2 by 2
array([[19., 23.],
       [43., 53.]])
>>>

We could manually calculate c(1,1): 1*1 + 2*3 + 3*4=19.
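
The same manual calculation can be repeated for every element. The sketch below spells out the row-by-column rule with explicit loops and compares the result with NumPy's own product; it is meant only as an illustration, since np.dot() (or the @ operator) is far faster in practice:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]], float)      # n by k = 2 by 3
b = np.array([[1, 2], [3, 3], [4, 5]], float)    # k by m = 3 by 2

n, k = a.shape
k2, m = b.shape
if k != k2:
    raise ValueError("inner dimensions must match")

c = np.zeros((n, m))                 # the product is n by m
for i in range(n):                   # row of a
    for j in range(m):               # column of b
        for t in range(k):           # sum over the shared dimension
            c[i, j] += a[i, t] * b[t, j]

print(c)
print(np.allclose(c, np.dot(a, b)))  # compare with NumPy's result
```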

After retrieving or downloading data from the internet, we need to process it. The ability to process various types of raw data is vital to finance students and to professionals working in the finance industry. Here we will see how to download price data and then estimate returns.

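Assume that we have n values: x1, x2, …, xn. There exist two types of means, the arithmetic mean and the geometric mean; see their generic definitions here:

$$\text{arithmetic mean} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

$$\text{geometric mean} = \left(\prod_{i=1}^{n} x_i\right)^{1/n}$$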

Assume that we have three values: 2, 3, and 4. Their arithmetic and geometric means are calculated here:

>>> (2+3+4)/3.
3.0
>>> geo_mean=(2*3*4)**(1./3)
>>> round(geo_mean,4)
2.8845
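
The two calculations generalize to any list of positive values. The following is a minimal sketch; the function names arithmetic_mean and geometric_mean are hypothetical helpers written for this illustration, and math.prod requires Python 3.8 or later:

```python
import math

def arithmetic_mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)

def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1.0 / len(values))

x = [2, 3, 4]
print(arithmetic_mean(x))             # 3.0
print(round(geometric_mean(x), 4))    # 2.8845
```

For positive values that are not all equal, the geometric mean is always smaller than the arithmetic mean, as it is here.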

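For returns, the arithmetic mean's definition remains the same, while the geometric mean of returns is defined differently, because returns can be negative and must first be converted to growth factors; see the following equations, where R_i is the return in period i:

$$\text{arithmetic mean} = \frac{1}{n}\sum_{i=1}^{n} R_i$$

$$\text{geometric mean} = \left[\prod_{i=1}^{n} (1+R_i)\right]^{1/n} - 1$$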

In Chapter 3, Time Value of Money, we will discuss both means again.

We could say that NumPy is a basic module while SciPy is a more advanced one. NumPy tries to retain all features supported by either of its predecessors, while most new features belong in SciPy rather than NumPy. On the other hand, NumPy and SciPy have many overlapping functions that are useful in finance. Older SciPy releases also exposed NumPy functions such as array(), mean(), and prod() under the scipy namespace; those aliases were deprecated and are no longer available in recent SciPy releases, so the following example of the two mean definitions for returns calls NumPy directly:

>>> import numpy as np
>>> ret=np.array([0.1,0.05,-0.02])
>>> print(round(np.mean(ret),6))                       # arithmetic mean
0.043333
>>> print(round(pow(np.prod(ret+1),1./len(ret))-1,6))  # geometric mean
0.042164
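
One way to see why the geometric mean matters for returns is to compare the growth of one dollar. The sketch below uses the same three returns; the variable names are introduced here for illustration:

```python
import numpy as np

ret = np.array([0.1, 0.05, -0.02])
n = len(ret)

arith = ret.mean()                          # arithmetic mean return
geo = np.prod(1 + ret) ** (1.0 / n) - 1     # geometric mean return

growth_actual = np.prod(1 + ret)            # value of $1 after the three periods
growth_geo = (1 + geo) ** n                 # compounding at the geometric mean
growth_arith = (1 + arith) ** n             # compounding at the arithmetic mean

print(round(float(growth_actual), 6))
print(round(float(growth_geo), 6))
print(round(float(growth_arith), 6))
```

Compounding at the geometric mean reproduces the actual terminal value, while compounding at the arithmetic mean overstates it.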

Our second example is related to processing the Fama-French 3-factor time series. Since this example is more complex than the previous one, readers who find it difficult to understand may simply skip it. First, a ZIP file called F-F_Research_Data_Factor_TXT.zip can be downloaded from Prof. French's Data Library. After unzipping it and removing the first few lines and the annual dataset, we have a monthly Fama-French factor time series. The first few lines and last few lines are shown here:

DATE    MKT_RF    SMB    HML    RF
192607    2.96   -2.30   -2.87    0.22
192608    2.64   -1.40    4.19    0.25
192609    0.36   -1.32    0.01    0.23

201607    3.95    2.90   -0.98    0.02
201608    0.49    0.94    3.18    0.02
201609    0.25    2.00   -1.34    0.02

Assume that the final file is called ffMonthly.txt under c:/temp/. The following program is used to retrieve and process the data:

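import numpy as np
import pandas as pd
# read all lines; the with statement closes the file automatically
with open("c:/temp/ffMonthly.txt","r") as file:
    data=file.readlines()
f=[]                                  # factor values, stored row by row
index=[]                              # dates in YYYYMM format
for i in range(1,len(data)):          # start at 1 to skip the header line
    t=data[i].split()
    if not t:                         # skip blank lines
        continue
    index.append(int(t[0]))
    for j in range(1,5):
        k=float(t[j])
        f.append(k/100)               # convert from percent to decimal
n=len(f)
f1=np.reshape(f,[n//4,4])             # integer division: n/4 is a float in Python 3
ff=pd.DataFrame(f1,index=index,columns=['Mkt_Rf','SMB','HML','Rf'])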

To view the first and last few observations of the dataset called ff, the .head() and .tail() functions can be used:
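
As an alternative sketch, pandas can parse whitespace-separated text directly. The example below is not the book's program: it assumes the six sample rows shown earlier in place of the full ffMonthly.txt file, and reads them from an in-memory string so that it runs without the file being present:

```python
import io
import pandas as pd

# Sample rows copied from the listing above; the real file has many more
sample = """DATE    MKT_RF    SMB    HML    RF
192607    2.96   -2.30   -2.87    0.22
192608    2.64   -1.40    4.19    0.25
192609    0.36   -1.32    0.01    0.23
201607    3.95    2.90   -0.98    0.02
201608    0.49    0.94    3.18    0.02
201609    0.25    2.00   -1.34    0.02
"""

# sep=r"\s+" splits on any run of whitespace; the first column becomes the index
ff = pd.read_csv(io.StringIO(sample), sep=r"\s+", index_col=0) / 100
ff.columns = ['Mkt_Rf', 'SMB', 'HML', 'Rf']

print(ff.head(3))   # first three observations
print(ff.tail(3))   # last three observations
```

With the real file, io.StringIO(sample) would be replaced by the file path. Both .head() and .tail() return five rows by default when called without an argument.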
