## Finances 101

The exercises presented throughout this book are related to historical financial data and require the reader to have some basic understanding of financial markets and reports.

### Fundamental analysis

Fundamental analysis is a set of techniques used to evaluate a security (stock, bond, currency, or commodity) by attempting to measure its intrinsic value through the examination of both macro and micro financial and economic reports. Fundamental analysis is usually applied to estimate the optimal price of a stock using a variety of financial ratios.

Numerous financial metrics are used throughout this book. Here are the definitions of the most commonly used metrics [A:16]:

• Earnings per share (EPS): This is the ratio of net earnings to the number of outstanding shares.

• Price/earnings ratio (PE): This is the ratio of the market price per share to earnings per share.

• Price/sales ratio (PS): This is the ratio of the market price per share to gross sales (or revenue) per share.

• Price/book value ratio (PB): This is the ratio of the market price per share to the total balance sheet value per share.

• Price to earnings/growth (PEG): This is the ratio of price/earnings per share (PE) to the annual growth of earnings per share.

• Operating income: This is the difference between the operating revenue and operating expenses.

• Net sales: This is the difference between the revenue or gross sales and cost of goods or cost of sales.

• Operating profit margin: This is the ratio of the operating income to the net sales.

• Net profit margin: This is the ratio of the net profit to the net sales (or the net revenue).

• Short interest: This is the quantity of shares sold short and not yet covered.

• Short interest ratio: This is the ratio of the short interest to the total number of shares floated.

• Cash per share: This is the ratio of the value of cash per share to the market price per share.

• Pay-out ratio: This is the percentage of the primary/basic earnings per share, excluding extraordinary items, that is paid to common stockholders in the form of cash dividends.

• Annual dividend yield: This is the ratio of the sum of dividends paid during the previous 12-month rolling period over the current stock price. Regular and extra dividends are included.

• Dividend coverage ratio: This is the ratio of the income available to common stockholders, excluding extraordinary items, for the most recent trailing 12 months to gross dividends paid to common shareholders, expressed as a percentage.

• Gross Domestic Product (GDP): This is the aggregate measure of the economic output of a country. It actually measures the sum of values added by the production of goods and delivery of services.

• Consumer Price Index (CPI): This is an indicator that measures the change in the price of an arbitrary basket of goods and services used by the Bureau of Labor Statistics to evaluate the inflationary trend.

• Federal funds rate: This is the interest rate at which banks trade balances held at the Federal Reserve. The balances are called Federal Funds.
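Several of the ratios above follow directly from one another, which a few lines of Scala can make explicit. This is a minimal sketch, not code from the book: the `Fundamentals` class and its field names are hypothetical, and expressing the annual EPS growth in percent (15.0 for 15%) when computing PEG is an assumed convention.

```scala
// Hypothetical container for a few reported figures; names are illustrative only.
final case class Fundamentals(
  netEarnings: Double,        // net earnings over the reporting period
  sharesOutstanding: Double,  // number of outstanding shares
  pricePerShare: Double,      // market price per share
  annualEpsGrowthPct: Double, // annual EPS growth in percent (assumed convention)
  operatingRevenue: Double,
  operatingExpenses: Double,
  netSales: Double
) {
  // Earnings per share: net earnings / outstanding shares
  def eps: Double = netEarnings / sharesOutstanding

  // Price/earnings ratio: market price per share / EPS
  def pe: Double = pricePerShare / eps

  // Price to earnings/growth: PE / annual growth of EPS
  def peg: Double = pe / annualEpsGrowthPct

  // Operating income: operating revenue - operating expenses
  def operatingIncome: Double = operatingRevenue - operatingExpenses

  // Operating profit margin: operating income / net sales
  def operatingProfitMargin: Double = operatingIncome / netSales
}
```

For instance, `Fundamentals(2.0e6, 1.0e6, 30.0, 10.0, 1.0e7, 7.0e6, 1.0e7)` describes a company earning \$2 per share and trading at 15 times earnings.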

### Technical analysis

Technical analysis is a methodology used to forecast the direction of the price of any given security through the study of the past market information derived from price and volume. In simpler terms, it is the study of price activity and price patterns in order to identify trade opportunities [A:17]. The price of a stock, commodity, bond, or financial future reflects all the information publicly known about that asset as processed by the market participants.

#### Terminology

• Bearish or bearish position: This attempts to profit by betting that the price of the security will fall.

• Bullish or bullish position: This attempts to profit by betting that the price of the security will rise.

• Long position: This is the same as Bullish.

• Neutral position: This attempts to profit by betting that the price of the security will not change significantly.

• Oscillator: This is a technical indicator that measures the price momentum of a security using some statistical formula.

• Overbought: A security is overbought when its price rises too fast, as measured by one or several trading signals or indicators.

• Oversold: A security is oversold when its price drops too fast, as measured by one or several trading signals or indicators.

• Relative strength index (RSI): This is an oscillator that computes the ratio of the average number of trading sessions for which the closing price is higher than the opening price to the average number of trading sessions for which the closing price is lower than the opening price. The value is normalized over [0, 1] or [0, 100%].

• Resistance: This is the upper limit of the price range of a security. The price falls back as soon as it reaches the resistance level.

• Short position: This is the same as Bearish.

• Support: This is the lower limit of the price range of a security over a period of time. The price bounces back as soon as it reaches the support level.

• Technical indicator: This is a variable derived from the price of a security and possibly its trading volume.

• Trading range: The trading range for a security over a period of time is the difference between the highest and lowest price for this period of time.

• Trading signal: This is a signal that is triggered when a technical indicator reaches a predefined value, upward or downward.

• Volatility: This is the variance or standard deviation of the price of a security over a period of time.

The raw trading data extracted from Google or Yahoo financials pages consists of the following:

• open: This is the price of the security at the opening of the trading session

• high: This is the highest price of the security during the trading session

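• low: This is the lowest price of the security during the trading session

• close: This is the price of the security at the close of the trading session

• volume: This is the volume of trades during the trading session

• adjClose: This is the closing price of the security, adjusted for dividends and splits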

Let's take a look at the following graph:

We can derive the following metrics from the raw trading data:

• Price volatility: volatility = 1.0 – high/low

• Price variation: vPrice = adjClose – open

• Price difference (or change) between two consecutive sessions: dPrice = adjClose – prevClose = adjClose(t) – adjClose(t-1)

• Volume difference between two consecutive sessions: dVolume = volume(t)/volume(t-1) – 1.0

• Volatility difference between two consecutive sessions: dVolatility = volatility(t)/volatility(t-1) – 1.0

• Relative price variation over the last T trading days: rPrice = price(t)/average(price over T) – 1.0

• Relative volume variation over the last T trading days: rVolume = volume(t)/average(volume over T) – 1.0

• Relative volatility variation over the last T trading days: rVolatility = volatility(t)/average(volatility over T) – 1.0
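The derived metrics listed above can be sketched in Scala as follows. The `Session` record and the `DerivedMetrics` object are hypothetical names introduced for illustration. Two assumptions are made: `price` is taken to be `adjClose`, and the window of T sessions ends at (and includes) session t.

```scala
// Hypothetical record for one trading session.
final case class Session(open: Double, high: Double, low: Double,
                         adjClose: Double, volume: Double)

object DerivedMetrics {
  // volatility = 1.0 - high/low, exactly as written above
  // (the value is non-positive whenever high >= low)
  def volatility(s: Session): Double = 1.0 - s.high / s.low

  // vPrice = adjClose - open
  def vPrice(s: Session): Double = s.adjClose - s.open

  // dPrice = adjClose(t) - adjClose(t-1)
  def dPrice(prev: Session, cur: Session): Double = cur.adjClose - prev.adjClose

  // dVolume = volume(t)/volume(t-1) - 1.0
  def dVolume(prev: Session, cur: Session): Double = cur.volume / prev.volume - 1.0

  // dVolatility = volatility(t)/volatility(t-1) - 1.0
  // (undefined if the previous session had high == low)
  def dVolatility(prev: Session, cur: Session): Double =
    volatility(cur) / volatility(prev) - 1.0

  // Latest value relative to the average over the window, minus 1.0
  private def relative(window: Seq[Double]): Double = {
    require(window.nonEmpty, "window must not be empty")
    window.last / (window.sum / window.size) - 1.0
  }

  // Relative variations over the last `period` sessions (assumed to include t)
  def rPrice(sessions: Seq[Session], period: Int): Double =
    relative(sessions.takeRight(period).map(_.adjClose))

  def rVolume(sessions: Seq[Session], period: Int): Double =
    relative(sessions.takeRight(period).map(_.volume))

  def rVolatility(sessions: Seq[Session], period: Int): Double =
    relative(sessions.takeRight(period).map(volatility))
}
```

Keeping each metric as a small pure function makes it straightforward to map them over a time series of sessions.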

The purpose is to create a set of variables x derived from price and volume, x = f(price, volume), and then generate predicates x op c, for which op is a comparison operator, such as > or =, that compares the value of x to a predetermined threshold c.

Let's consider one of the most common technical indicators derived from price: the relative strength index RSI or the normalized RSI nRSI, whose formulation is provided here as a reference:

### Note

The relative strength index

The RSI for a period of T sessions with po opening price and pc closing price is defined as:

A trading signal is a predicate on a technical indicator, for example nRSIT(t) < 0.2. In trading terminology, a signal is emitted for any time period t for which the predicate is true:

The visualization of oversold and overbought positions using the relative strength index
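A rough Scala sketch of the normalized RSI and of the signal nRSIT(t) < 0.2 is given here. It assumes the count-based description from the terminology list: with `up` the number of sessions in the window that close above their open and `down` the number that close below, nRSI = 1 - 1/(1 + up/down) = up/(up + down). The `Rsi` object, its method names, and the value 0.5 returned for a window without any up or down session are illustrative assumptions. Other formulations of the RSI, based on average gains and losses, are also in common use.

```scala
object Rsi {
  // Normalized RSI over the `period` sessions ending at index t (inclusive).
  // Count-based variant: up / (up + down), which lies in [0, 1].
  def nRsi(open: IndexedSeq[Double], close: IndexedSeq[Double],
           t: Int, period: Int): Double = {
    require(open.size == close.size, "open and close must have the same length")
    require(period > 0 && t >= period - 1 && t < close.size, "window out of range")
    val window = (t - period + 1) to t
    val up = window.count(i => close(i) > open(i))
    val down = window.count(i => close(i) < open(i))
    // Assumption: a window with only flat sessions is treated as neutral
    if (up + down == 0) 0.5 else up.toDouble / (up + down)
  }

  // Indices t for which the predicate nRSI(t) < threshold is true
  def signals(open: IndexedSeq[Double], close: IndexedSeq[Double],
              period: Int, threshold: Double = 0.2): Seq[Int] =
    (period - 1 until close.size).filter(t => nRsi(open, close, t, period) < threshold)
}
```

With a low threshold such as 0.2, the emitted indices correspond to sessions that this indicator flags as oversold.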

Traders do not usually rely on a single trading signal to make a rational decision.

For example, if G is the price of gold, I10 is the current rate of the 10-year Treasury bond, and RSIsp500 is the relative strength index of the S&P 500 index, then an analysis of historical data might conclude that the increase in the exchange rate of US\$ to the Japanese Yen is maximized for the following trading strategy: {G < \$1170 and I10 > 3.9% and RSIsp500 > 0.6 and RSIsp500 < 0.8}.
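Such a strategy is a conjunction of simple predicates, which can be sketched in Scala as a composition of functions. The `Market` record, the `Strategy` object, and the representation of the 3.9% rate as the fraction 0.039 are assumptions made for this illustration.

```scala
// Hypothetical snapshot of the three indicators named above.
final case class Market(gold: Double, rate10y: Double, rsiSp500: Double)

object Strategy {
  // A trading signal is a predicate on the market state
  type Signal = Market => Boolean

  // Each signal has the form "x op c" on a single indicator
  val goldLow: Signal   = _.gold < 1170.0
  val rateHigh: Signal  = _.rate10y > 0.039 // 3.9% expressed as a fraction
  val rsiInBand: Signal = m => m.rsiSp500 > 0.6 && m.rsiSp500 < 0.8

  // Combines signals so that the result is true only when all of them are true
  def all(signals: Signal*): Signal = m => signals.forall(_(m))

  val strategy: Signal = all(goldLow, rateHigh, rsiInBand)
}
```

Calling `Strategy.strategy(Market(1150.0, 0.041, 0.7))` evaluates the three signals against one market snapshot.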

#### Price patterns

Technical analysis assumes that historical prices contain some recurring, albeit noisy, patterns that can be discovered using statistical methods. The most common patterns used in this book are the trend, support, and resistance levels [A:18], as illustrated in the following chart:

An illustration of trend, support, and resistance levels in technical analysis

### Options trading

An option is a contract that gives the buyer the right, but not the obligation, to buy or sell a security at a specific price on or before a certain date [A:19].

The two types of options are calls and puts, as described here:

• A call gives the holder the right to buy a security at a certain price within a specific period of time. Buyers of calls expect that the price of the security will increase substantially over the strike price before the option expires.

• A put option gives the holder the right to sell a security at a certain price within a specific period of time. Buyers of puts expect that the price of the stock will fall below the strike price before the option expires.

Let's consider a call option contract of 100 shares at a strike price of \$23 for a total cost of \$270 (\$2.70 per share). The maximum loss the holder of the call can incur is the premium, or \$270, if the option expires worthless. However, the profit is potentially almost unlimited. If the price of the security reaches \$36 when the call option expires, the owner will have a profit of (\$36 - \$23)*100 - \$270 = \$1030. The return on the investment is 1030/270 ≈ 381%. Buying the stock at \$24 and then selling it at \$36 would have generated a return on the investment of 36/24 - 1 = 50%. This example is simple and does not take into account transaction fees or margin costs [A:20].

Let's take a look at the following chart:

An illustration of the pricing of a call option
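The arithmetic of the call option example can be reproduced with a short Scala sketch. The `CallOption` object and its parameter names are hypothetical; as in the example, transaction fees and margin costs are ignored.

```scala
object CallOption {
  // Profit at expiry for the holder of a call option.
  // The holder exercises only if the price exceeds the strike,
  // so the loss never exceeds the premium paid.
  def profit(strike: Double, premium: Double, shares: Int,
             priceAtExpiry: Double): Double =
    math.max(priceAtExpiry - strike, 0.0) * shares - premium

  // Return on investment relative to the premium paid
  def returnOnInvestment(strike: Double, premium: Double, shares: Int,
                         priceAtExpiry: Double): Double =
    profit(strike, premium, shares, priceAtExpiry) / premium
}

// Contract from the example: 100 shares, strike $23, premium $270, expiry price $36
val gain = CallOption.profit(23.0, 270.0, 100, 36.0) // 1030.0
```

If the price at expiry stays at or below the strike, `profit` returns the negative of the premium, which is the maximum loss described above.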

### Financial data sources

There are numerous sources of financial data available to experiment with machine learning and to validate models [A:21]:

• Yahoo finances (stocks, ETFs, and indices): http://finance.yahoo.com