Python for Finance Cookbook

By : Eryk Lewinson
Overview of this book

Python is one of the most popular programming languages used in the financial industry, with a huge set of accompanying libraries. In this book, you'll cover different ways of downloading financial data and preparing it for modeling. You'll calculate popular indicators used in technical analysis, such as Bollinger Bands, MACD, and RSI, and backtest automatic trading strategies. Next, you'll cover time series analysis and models, such as exponential smoothing, ARIMA, and GARCH (including multivariate specifications), before exploring the popular CAPM and the Fama-French three-factor model. You'll then discover how to optimize asset allocation and use Monte Carlo simulations for tasks such as calculating the price of American options and estimating the Value at Risk (VaR). In later chapters, you'll work through an entire data science project in the financial domain. You'll also learn how to solve the credit card fraud and default problems using advanced classifiers such as random forest, XGBoost, LightGBM, and stacked models. You'll then be able to tune the hyperparameters of the models and handle class imbalance. Finally, you'll focus on learning how to use deep learning (PyTorch) for approaching financial tasks. By the end of this book, you’ll have learned how to effectively analyze financial data using a recipe-based approach.

Backtesting a strategy based on simple moving average

The general idea behind backtesting is to evaluate the performance of a trading strategy—built using some heuristics or technical indicators—by applying it to historical data.

In this recipe, we introduce one of the available frameworks for backtesting in Python: backtrader. Key features of this framework include:

  • A large number of available technical indicators (backtrader also provides a wrapper around the popular TA-Lib library) and performance measures
  • Ease of building and applying new indicators
  • Multiple data sources available (including Yahoo Finance, Quandl)
  • Simulating many aspects of real brokers, such as different types of orders (market, limit, stop), slippage (the difference between the intended and actual execution prices of an order), commission, going long/short, and so on
  • A one-line call for a plot, with all results
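To make the slippage and commission terms from the list above concrete, here is a minimal arithmetic sketch in plain Python. It does not use backtrader: the percentage-based models and the function names (`fill_price`, `commission_paid`) are illustrative assumptions, not the framework's API.

```python
def fill_price(intended_price, slippage_pct, is_buy):
    """Price actually paid/received under a simple percentage slippage model.

    Assumption: slippage always works against the trader, so buys fill
    higher and sells fill lower than the intended price.
    """
    direction = 1 if is_buy else -1
    return intended_price * (1 + direction * slippage_pct)


def commission_paid(price, size, commission_pct):
    """Commission charged as a fraction of the traded value (assumed model)."""
    return abs(size) * price * commission_pct


intended = 100.0
executed = fill_price(intended, slippage_pct=0.001, is_buy=True)
slippage = executed - intended  # difference between intended and actual price
fee = commission_paid(executed, size=1, commission_pct=0.001)

print(f'Executed at {executed:.2f}, slippage {slippage:.2f}, commission {fee:.4f}')
```

Both effects reduce the profit of a round trip, which is why a backtest that ignores them tends to look better than live trading would.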

For this recipe, we consider a basic strategy based on the SMA. The key points of the strategy are as follows:

  • When the close price becomes higher than the 20-day SMA, buy one share.
  • When the close price becomes lower than the 20-day SMA and we have a share, sell it.
  • We can only have a maximum of one share at any given time.
  • No short selling is allowed.

We backtest this strategy using Apple's stock prices from 2018.
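The rules of the strategy can be sketched in plain Python, without backtrader, to make the logic explicit. This is a simplified illustration under stated assumptions: trades fill at the same bar's close (backtrader fills market orders on the next bar's open), there are no commissions, and the prices and the 3-day window are made-up toy values rather than Apple's 2018 data. The function name `backtest_sma` is hypothetical.

```python
def backtest_sma(closes, period=20, cash=1000.0):
    """Long-only SMA strategy holding at most one share.

    Returns the final portfolio value and the list of trades.
    """
    shares = 0
    trades = []
    for i, close in enumerate(closes):
        if i + 1 < period:
            continue  # not enough data points for the SMA yet
        sma = sum(closes[i + 1 - period:i + 1]) / period
        if shares == 0 and close > sma:
            shares, cash = 1, cash - close  # buy one share
            trades.append(('buy', i, close))
        elif shares == 1 and close < sma:
            shares, cash = 0, cash + close  # sell the share we hold
            trades.append(('sell', i, close))
    # Value any share still held at the last close
    return cash + shares * closes[-1], trades


# Toy prices (not real market data)
closes = [10.0, 10.0, 10.0, 12.0, 13.0, 11.0, 9.0, 8.0, 10.0, 12.0]
final_value, trades = backtest_sma(closes, period=3)
print(final_value)  # 1001.0
print(trades)
```

The `shares == 0` and `shares == 1` checks enforce the "at most one share" and "no short selling" rules.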

How to do it...

In this example, we present two possible approaches to building a trading strategy: using a signal (bt.Signal) or defining a full strategy (bt.Strategy). Both yield the same results; however, the lengthier one, using bt.Strategy, provides more logging of what is actually happening in the background. This makes it easier to debug and keep track of all operations (the level of detail included in the logging depends on our needs).

Signal

Execute the following steps to create a backtest, using the bt.Signal class.

  1. Import the libraries:
from datetime import datetime
import backtrader as bt
  2. Define a class representing the trading strategy:
class SmaSignal(bt.Signal):
    params = (('period', 20), )

    def __init__(self):
        self.lines.signal = self.data - bt.ind.SMA(period=self.p.period)
  3. Download data from Yahoo Finance:
data = bt.feeds.YahooFinanceData(dataname='AAPL',
                                 fromdate=datetime(2018, 1, 1),
                                 todate=datetime(2018, 12, 31))
  4. Set up the backtest:
cerebro = bt.Cerebro(stdstats=False)

cerebro.adddata(data)
cerebro.broker.setcash(1000.0)
cerebro.add_signal(bt.SIGNAL_LONG, SmaSignal)
cerebro.addobserver(bt.observers.BuySell)
cerebro.addobserver(bt.observers.Value)
  5. Run the backtest:
print(f'Starting Portfolio Value: {cerebro.broker.getvalue():.2f}')
cerebro.run()
print(f'Final Portfolio Value: {cerebro.broker.getvalue():.2f}')
  6. Plot the results:
cerebro.plot(iplot=True, volume=False)

The plot is divided into three parts: the evolution of the portfolio's value, the price of the asset (together with the buy/sell signals), and—lastly—the technical indicator of our choosing, as shown in the following plot:

From the preceding plot, we can see that, in the end, the trading strategy made money: the terminal value of the portfolio is $1011.56.

Strategy

To make the code more readable, we first present the general outline of the class (trading strategy) and then define separate pieces in the following code blocks.

  1. The template of the strategy is presented below:
class SmaStrategy(bt.Strategy):
    params = (('ma_period', 20), )

    def __init__(self):
        pass  # some code

    def log(self, txt):
        pass  # some code

    def notify_order(self, order):
        pass  # some code

    def notify_trade(self, trade):
        pass  # some code

    def next(self):
        pass  # some code

The __init__ block is defined as:

def __init__(self):
    self.data_close = self.datas[0].close

    self.order = None
    self.price = None
    self.comm = None

    self.sma = bt.ind.SMA(self.datas[0],
                          period=self.params.ma_period)

The log block is defined as:

def log(self, txt):
    dt = self.datas[0].datetime.date(0).isoformat()
    print(f'{dt}, {txt}')

The notify_order block is defined as:

def notify_order(self, order):
    if order.status in [order.Submitted, order.Accepted]:
        return

    if order.status in [order.Completed]:
        if order.isbuy():
            self.log(f'BUY EXECUTED --- Price: {order.executed.price:.2f}, Cost: {order.executed.value:.2f}, Commission: {order.executed.comm:.2f}')
            self.price = order.executed.price
            self.comm = order.executed.comm
        else:
            self.log(f'SELL EXECUTED --- Price: {order.executed.price:.2f}, Cost: {order.executed.value:.2f}, Commission: {order.executed.comm:.2f}')

        self.bar_executed = len(self)

    elif order.status in [order.Canceled, order.Margin,
                          order.Rejected]:
        self.log('Order Failed')

    self.order = None

The notify_trade block is defined as:

def notify_trade(self, trade):
    if not trade.isclosed:
        return

    self.log(f'OPERATION RESULT --- Gross: {trade.pnl:.2f}, Net: {trade.pnlcomm:.2f}')

The next block is defined as:

def next(self):
    if self.order:
        return

    if not self.position:
        if self.data_close[0] > self.sma[0]:
            self.log(f'BUY CREATED --- Price: {self.data_close[0]:.2f}')
            self.order = self.buy()
    else:
        if self.data_close[0] < self.sma[0]:
            self.log(f'SELL CREATED --- Price: {self.data_close[0]:.2f}')
            self.order = self.sell()

The code for data is the same as in the signal strategy, so it is not included here, to avoid repetition.
  2. Set up the backtest:
cerebro = bt.Cerebro(stdstats=False)

cerebro.adddata(data)
cerebro.broker.setcash(1000.0)
cerebro.addstrategy(SmaStrategy)
cerebro.addobserver(bt.observers.BuySell)
cerebro.addobserver(bt.observers.Value)
  3. Run the backtest:
print(f'Starting Portfolio Value: {cerebro.broker.getvalue():.2f}')
cerebro.run()
print(f'Final Portfolio Value: {cerebro.broker.getvalue():.2f}')
  4. Plot the results:
cerebro.plot(iplot=True, volume=False)

The resulting graph is presented below:

From the preceding graph, we see that the strategy managed to make $11.56 over the year. Additionally, we present a piece of the log:

The log contains information about all the created and executed orders, as well as the operation result whenever a sell closed the position.

How it works...

The key idea of working with backtrader is that there is a main brain, Cerebro, to which we passed (using different methods) the historical data, the designed trading strategy, additional metrics we wanted to calculate (for example, the portfolio value over the investment horizon, or the overall Sharpe ratio), information about commissions/slippage, and so on. These were the common elements between the two approaches. The part that differed was the definition of the strategy. We start by describing the common elements of the backtrader framework while assuming a trading strategy already exists, and we then explain the details of the particular strategies.

Common elements

We started by downloading price data from Yahoo Finance, with the help of the bt.feeds.YahooFinanceData class. What followed was a series of operations connected to Cerebro, as described here:

  1. Creating an instance of bt.Cerebro with stdstats=False, which suppresses many default elements of the plot. This avoided cluttering the output; we then manually picked the elements of interest (observers and indicators).
  2. Adding data, using the adddata method.
  3. Setting up the amount of available money, using the broker.setcash method.
  4. Adding the signal/strategy, using the add_signal/addstrategy methods.
  5. Adding Observers, using addobserver. We selected two Observers: BuySell, to display the buy/sell decisions on the plot (denoted by blue and red triangles), and Value, for tracking how the portfolio value changed over time.

You can also add data from a CSV file, a pandas DataFrame, Quandl, and other sources. For a list of available options, please refer to bt.feeds.

The last step involved running the backtest with cerebro.run() and displaying the resulting plot with cerebro.plot(). In the latter step, we disabled displaying the volume bar charts, to avoid cluttering the graph.

Signal

The signal was built as a class, inheriting from bt.Signal. The signal was represented as a number: in this case, the difference between the current data point (self.data) and the moving average (bt.ind.SMA). A positive signal is an indication to go long (buy), and a negative one is an indication to go short (sell). A value of 0 means there is no signal.
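As a rough illustration of how such a signal is read, the sketch below computes close minus SMA on toy prices and maps each value to an indication. It is plain Python rather than backtrader code, and the helper names (`sma_signal`, `interpret`) and the 3-bar window are assumptions made for this example.

```python
def sma_signal(closes, period):
    """close - SMA for each bar; None while the SMA is still warming up."""
    signal = []
    for i, close in enumerate(closes):
        if i + 1 < period:
            signal.append(None)
        else:
            sma = sum(closes[i + 1 - period:i + 1]) / period
            signal.append(close - sma)
    return signal


def interpret(value):
    """Map a signal value to the indication described in the text."""
    if value is None:
        return 'warm-up'
    if value > 0:
        return 'long'
    if value < 0:
        return 'short'
    return 'none'


closes = [10.0, 11.0, 12.0, 11.0, 9.0, 10.0]  # toy prices
signal = sma_signal(closes, period=3)
labels = [interpret(v) for v in signal]
print(labels)
# ['warm-up', 'warm-up', 'long', 'short', 'short', 'none']
```

How a 'short' indication is acted upon depends on the signal type passed to add_signal; with a long-only type it merely closes an open long position.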

The next step was to add the signal to Cerebro, using the add_signal method. When doing so, we also had to specify what kind of signal we were adding.

The following is a description of the available signal types:

  • LONGSHORT: This takes into account both long and short indications from the signal.
  • LONG: Positive signals indicate going long; negative ones are used to close the long position.
  • SHORT: Negative signals indicate shorting; positive ones are used to close the short position.
  • LONGEXIT: A negative signal is used to exit a long position.
  • SHORTEXIT: A positive signal is used to exit a short position.

However, exiting positions can be more complex (enabling users to build more sophisticated strategies), as described here:

  • LONG: If there is a LONGEXIT signal, it is used to exit the long position, instead of the default behavior mentioned previously. If there is a SHORT signal and no LONGEXIT signal, the SHORT signal is used to close the long position before opening a short one.
  • SHORT: If there is a SHORTEXIT signal, it is used to exit the short position, instead of the default behavior mentioned previously. If there is a LONG signal and no SHORTEXIT signal, the LONG signal is used to close the short position before opening a long one.

As you might have already realized, the signal is calculated for every time point (as visualized at the bottom of the plot), which effectively creates a continuous stream of indications to open/close positions (a signal value of exactly 0 is not very likely to happen). That is why backtrader, by default, disables accumulation (the constant opening of new positions, even when we have one already opened) and concurrency (generating new orders without hearing back from the broker whether the previously submitted ones were executed successfully).
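The effect of disabling accumulation can be sketched with a toy model of a LONG-type signal. This is a simplified illustration, not backtrader's implementation: it ignores concurrency and order execution, and the function name `long_signal_actions` and the `accumulate` flag are hypothetical.

```python
def long_signal_actions(signal, accumulate=False):
    """Translate a stream of signal values into actions for a LONG-type signal.

    Positive values open a long position, negative values close it.
    Without accumulation, a new position is opened only when we are flat.
    """
    position = 0
    actions = []
    for value in signal:
        if value > 0 and (position == 0 or accumulate):
            position += 1
            actions.append('buy')
        elif value < 0 and position > 0:
            position = 0
            actions.append('close')
        else:
            actions.append('hold')
    return actions


signal = [1.0, 0.5, 2.0, -0.3, -1.0, 0.8]  # toy signal values
print(long_signal_actions(signal))
# ['buy', 'hold', 'hold', 'close', 'hold', 'buy']
print(long_signal_actions(signal, accumulate=True))
# ['buy', 'buy', 'buy', 'close', 'hold', 'buy']
```

With accumulation enabled, every positive value adds to the position, which is rarely what a simple crossover strategy intends.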

Strategy

The strategy was built as a class, inheriting from bt.Strategy. Inside the class, we defined the following methods (we were actually overriding them to tailor them to our needs):

  • __init__: Here, we defined the objects that we would like to keep track of, for example, close price, order, buy price, commission, indicators such as SMA, and so on.
  • log: This is defined for logging purposes.
  • notify_order: This is defined for reporting the status of the order (position). In general, on day t, the indicator can suggest opening/closing a position based on the close price (assuming we are working with daily data). Then, the (market) order will be carried out on the next day (using day t + 1's open price). However, there is no guarantee that the order will be executed, as it can be canceled, or we might have insufficient cash. This behavior is also true for strategies built with signals. Once the order has been processed, the method clears the pending-order marker by setting self.order = None.
  • notify_trade: This is defined for reporting the results of trades (after the positions are closed).
  • next: This is the place containing the trading strategy's logic. First, we check whether there is an order already pending, and do nothing if there is. The second check is to see whether we already have a position (enforced by our strategy; not a must). If we do not, we check whether the close price is higher than the moving average; a positive outcome results in an entry in the log and the placing of a buy order with self.order = self.buy(). If we do have a position, we check whether the close price is lower than the moving average and, if so, log the event and place a sell order with self.order = self.sell(). This is also the place where we can choose the stake (the number of assets we want to buy). The default results in self.buy(size=1).
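The timing described above (an order created at day t's close and filled at day t + 1's open) can be sketched in plain Python. This is an illustrative model only: it assumes every order is filled, ignores cash and commissions, and uses toy (open, close) bars; the function name `simulate_next_open` is hypothetical.

```python
def simulate_next_open(bars, period):
    """bars: list of (open, close) tuples.

    Returns a log of (bar index, event, price) tuples.
    """
    closes = [close for _, close in bars]
    position = 0
    pending = None  # order created on the previous bar, if any
    log = []
    for i, (open_price, close) in enumerate(bars):
        # An order created at the previous close is filled at this bar's open.
        if pending is not None:
            position += 1 if pending == 'BUY' else -1
            log.append((i, f'{pending} EXECUTED', open_price))
            pending = None
        if i + 1 < period:
            continue  # warm-up: SMA not available yet
        sma = sum(closes[i + 1 - period:i + 1]) / period
        if position == 0 and close > sma:
            pending = 'BUY'
            log.append((i, 'BUY CREATED', close))
        elif position > 0 and close < sma:
            pending = 'SELL'
            log.append((i, 'SELL CREATED', close))
    return log


bars = [(10.0, 10.0), (10.0, 10.0), (10.0, 12.0), (12.5, 13.0),
        (13.0, 11.0), (10.5, 9.0), (9.0, 8.0)]
for entry in simulate_next_open(bars, period=3):
    print(entry)
```

Note that the created price (the close that triggered the decision) and the executed price (the next open) generally differ, which is why the log reports both.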

Here are some general notes:

  • Cerebro should only be used once. If we want to run another backtest, we should create a new instance, not add something to it after prior calculations.
  • The strategy built with bt.Signal uses only one signal. However, we can combine multiple signals, based on different conditions, when we use bt.SignalStrategy instead.
  • When we do not specify otherwise, all trades are carried out on one unit of the asset.
  • backtrader automatically handles the warm-up period. In this case, no trade can be carried out until there are enough data points to calculate the 20-day SMA. When considering multiple indicators at once, backtrader automatically selects the longest necessary period.
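The warm-up behavior can be illustrated with a small plain-Python sketch (not backtrader code; the `sma_series` helper, the toy prices, and the 2- and 4-bar windows are assumptions for this example). Each indicator has no value until it has seen enough bars, so the first bar on which all of them are available is determined by the longest period.

```python
def sma_series(values, period):
    """SMA per bar; None while fewer than `period` values are available."""
    return [
        sum(values[i + 1 - period:i + 1]) / period if i + 1 >= period else None
        for i in range(len(values))
    ]


closes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # toy prices
fast = sma_series(closes, period=2)
slow = sma_series(closes, period=4)

# First bar on which every indicator has a value
first_ready = next(
    i for i, (f, s) in enumerate(zip(fast, slow))
    if f is not None and s is not None
)
print(first_ready)  # 3, i.e. index max(2, 4) - 1
```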

There's more...

It is worth mentioning that backtrader has parameter optimization capabilities, which we present in the code that follows. The code is a modified version of the strategy from this recipe, in which we optimize the number of days in the SMA.

The following list provides details of modifications to the code (we only show the relevant ones, as the bulk of the code is identical to that using bt.Strategy):

  • We add an extra method called stop to the class definition. It logs the terminal portfolio value for each parameter value:
def stop(self):
    self.log(f'(ma_period = {self.params.ma_period:2d}) --- Terminal Value: {self.broker.getvalue():.2f}')
  • Instead of using cerebro.addstrategy(), we use cerebro.optstrategy(), and provide the strategy name and parameter values:
cerebro.optstrategy(SmaStrategy, ma_period=range(10, 31))
  • We increase the number of CPU cores used when running the backtest: cerebro.run(maxcpus=4)

We present the results in the following summary (the order of parameters is not preserved, as the testing was carried out on four cores):

We see that the strategy performed best for ma_period = 22.
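Conceptually, the optimization is a grid search: the same backtest is run once per parameter value and the terminal values are compared. The sketch below shows that idea in plain Python on toy prices. It is not backtrader's implementation and does not reproduce the result above; it assumes same-bar execution at the close and no commissions, and the names `terminal_value` and `results` are hypothetical.

```python
def terminal_value(closes, period, cash=1000.0):
    """Terminal portfolio value of the long-only, one-share SMA strategy."""
    shares = 0
    for i, close in enumerate(closes):
        if i + 1 < period:
            continue  # SMA not available yet
        sma = sum(closes[i + 1 - period:i + 1]) / period
        if shares == 0 and close > sma:
            shares, cash = 1, cash - close
        elif shares == 1 and close < sma:
            shares, cash = 0, cash + close
    return cash + shares * closes[-1]


# Toy prices (not real market data)
closes = [10.0, 11.0, 12.0, 11.0, 10.0, 11.0, 13.0, 14.0, 12.0, 15.0]

# Run the same backtest for every candidate period
results = {period: terminal_value(closes, period) for period in range(2, 5)}
best_period = max(results, key=results.get)

for period, value in results.items():
    print(f'(ma_period = {period:2d}) --- Terminal Value: {value:.2f}')
print(f'Best period: {best_period}')
```

Picking the best parameter on the same data used for evaluation risks overfitting, so the selected value should ideally be validated on a separate period.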

See also

Additional resources are available here:

  • https://www.zipline.io/: An alternative framework for backtesting, originally developed by Quantopian.