## Working with vectors

There are subtle yet powerful differences between Breeze vectors and Scala's own `scala.collection.immutable.Vector`. As we'll see in this recipe, Breeze vectors offer many functions that are specific to linear algebra. More importantly, Breeze's vector is a Scala wrapper over `netlib-java`, and most calls to the vector's API delegate to it.

Vectors are one of the core components in Breeze. They are containers of homogeneous data. We'll first see how to create vectors and then move on to various data manipulation functions that operate on them.

This recipe is organized into the following sub-recipes:

- Creating vectors:
    - Creating a vector from values
    - Creating a zero vector
    - Creating a vector out of a function
    - Creating a vector of linearly spaced values
    - Creating a vector with values in a specific range
    - Creating an entire vector with a single value
    - Slicing a sub-vector from a bigger vector
    - Creating a Breeze vector from a Scala vector
- Vector arithmetic:
    - Scalar operations
    - Calculating the dot product of a vector
    - Creating a new vector by adding two vectors together
- Appending vectors and converting a vector of one type to another:
    - Concatenating two vectors
    - Converting a vector of int to a vector of double
- Computing basic statistics:
    - Mean and variance
    - Standard deviation
    - Find the largest value
    - Finding the sum, square root and log of all the values in the vector

### Getting ready

To run the code, you could use either the Scala REPL or the Worksheet feature available in the Eclipse Scala plugin (or Scala IDE) or in IntelliJ IDEA. These options are suggested because of their quick turnaround time.

### How to do it...

Let's look at each of the preceding sub-recipes in detail. For easier reference, the output of each command is shown as well. All the classes used in this recipe are from the `breeze.linalg` package, so an `import breeze.linalg._` statement at the top of your file is all you need.

#### Creating vectors

Let's look at the various ways we could construct vectors. Most of these construction mechanisms go through the `apply` method of the vector. There are two flavors of vector, `breeze.linalg.DenseVector` and `breeze.linalg.SparseVector`, and the choice between them depends on the use case. The general rule of thumb is that if at least 20 percent of your data is zeroes, you are better off choosing `SparseVector`, although the 20 percent threshold itself varies with the use case.
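As a rough illustration of that rule of thumb, the following sketch measures the share of zeroes in a plain Scala array. The helper `zeroFraction` and the `preferSparse` flag are hypothetical names introduced here, not part of Breeze, and the 0.2 threshold is simply the figure quoted above.

```scala
// Hypothetical helper (not part of Breeze): fraction of entries that are exactly zero.
def zeroFraction(values: Array[Double]): Double =
  if (values.isEmpty) 0.0 else values.count(_ == 0.0).toDouble / values.length

val sample = Array(0.0, 1.0, 0.0, 2.0, 0.0)

// 3 of the 5 entries are zero, so the fraction is 0.6
val fraction = zeroFraction(sample)

// Apply the rule-of-thumb threshold of 20 percent from the text
val preferSparse = fraction >= 0.2
```

Treat the result as a starting point only; the access pattern of your algorithm matters as much as the share of zeroes.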

##### Constructing a vector from values

**Creating a dense vector from values**: Creating a `DenseVector` from values is just a matter of passing the values to the `apply` method:

    val dense = DenseVector(1, 2, 3, 4, 5)
    println(dense) // DenseVector(1, 2, 3, 4, 5)

**Creating a sparse vector from values**: Creating a `SparseVector` from values also works by passing the values to the `apply` method:

    val sparse = SparseVector(0.0, 1.0, 0.0, 2.0, 0.0)
    println(sparse) // SparseVector((0,0.0), (1,1.0), (2,0.0), (3,2.0), (4,0.0))

Notice how the `SparseVector` stores values against the index.

Of course, there are simpler ways to create a vector than passing all the data into its `apply` method.

##### Creating a zero vector

Calling the vector's `zeros` function creates a zero vector. For numeric types the elements are `0`, for object types they are `null`, and for Boolean types they are `false`:

    val denseZeros = DenseVector.zeros[Double](5) // DenseVector(0.0, 0.0, 0.0, 0.0, 0.0)
    val sparseZeros = SparseVector.zeros[Double](5) // SparseVector()

Not surprisingly, the `SparseVector` does not allocate any memory for the contents of the vector. However, the `SparseVector` object itself still occupies memory.
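To make the idea of index-keyed storage concrete, here is a deliberately simplified model written with the standard library only. `TinySparse` is a hypothetical class invented for this sketch; it is not how Breeze implements `SparseVector`, it only shows why a zero vector needs no storage for its contents.

```scala
// Simplified, hypothetical model of index-keyed storage; not Breeze's actual class.
final case class TinySparse(size: Int, indices: Array[Int], values: Array[Double]) {
  // Stored positions return their value; every other position is an implicit zero.
  def apply(i: Int): Double = {
    require(i >= 0 && i < size, s"index $i out of bounds for size $size")
    val pos = indices.indexOf(i)
    if (pos >= 0) values(pos) else 0.0
  }

  // Number of entries that actually occupy storage
  def storedCount: Int = indices.length
}

// A zero vector of length 5 stores nothing at all
val zeros = TinySparse(5, Array.empty[Int], Array.empty[Double])

// Only positions 1 and 3 carry values; the other three are implicit zeroes
val twoEntries = TinySparse(5, Array(1, 3), Array(1.0, 2.0))
```

The object itself (its size and two empty arrays) still exists, which mirrors the point above about the vector object being accounted for in memory.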

##### Creating a vector out of a function

The `tabulate` function in vector is an interesting and useful function. It accepts a size argument just like the `zeros` function, but it also accepts a function that populates the values of the vector. The function could be anything from a random number generator to a naïve index-based generator, which is what we have implemented here. Notice how the return value of the function (`Int`) is converted into a vector of `Double` by specifying the type parameter:

    val denseTabulate = DenseVector.tabulate[Double](5)(index => index * index) // DenseVector(0.0, 1.0, 4.0, 9.0, 16.0)
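If the shape of this call looks familiar, that is because the Scala standard library offers the same pattern on arrays. The sketch below uses `Array.tabulate` only as an analogy; it does not touch Breeze, and the value names are chosen for illustration.

```scala
// Standard-library analogue: a size plus an index => value function
val squares: Array[Double] = Array.tabulate[Double](5)(index => index * index)
// squares holds 0.0, 1.0, 4.0, 9.0, 16.0

// The same result with the Int-to-Double conversion written out explicitly
val squaresExplicit: Array[Double] = Array.tabulate(5)(index => (index * index).toDouble)
```

In the first form, the `Int` result of `index * index` is widened to `Double` because the type parameter fixes the expected element type; the second form spells the conversion out.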

##### Creating a vector of linearly spaced values

The `linspace` function in `breeze.linalg` creates a new `DenseVector[Double]` of linearly spaced values between two arbitrary numbers. Not surprisingly, it accepts three arguments: the start, the end, and the total number of values that we would like to generate. Note that both the start and the end values are included in the result:

    val spaceVector = breeze.linalg.linspace(2, 10, 5) // DenseVector(2.0, 4.0, 6.0, 8.0, 10.0)
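The spacing follows from the three arguments: with both endpoints included, `count` values leave `count - 1` gaps, so the step is `(end - start) / (count - 1)`. The function `linspaceSketch` below is a hypothetical re-implementation for illustration; Breeze's own `linspace` may differ in its details.

```scala
// Hypothetical re-implementation for illustration only
def linspaceSketch(start: Double, end: Double, count: Int): Array[Double] = {
  require(count >= 2, "need at least two points to include both endpoints")
  // count values leave (count - 1) equal gaps between start and end
  val step = (end - start) / (count - 1)
  Array.tabulate(count)(i => start + i * step)
}

// step = (10 - 2) / 4 = 2.0, giving 2.0, 4.0, 6.0, 8.0, 10.0
val spaced = linspaceSketch(2, 10, 5)
```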

##### Creating a vector with values in a specific range

The `range` function in a vector has two variants. The plain vanilla function accepts a start and an end value (start inclusive, end exclusive):

    val allNosTill10 = DenseVector.range(0, 10) // DenseVector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

The other variant is an overloaded function that accepts a "step" value:

    val evenNosTill20 = DenseVector.range(0, 20, 2) // DenseVector(0, 2, 4, 6, 8, 10, 12, 14, 16, 18)

Just like the `range` function, which takes all its arguments as integers, there is also a `rangeD` function that takes the start, stop, and step parameters as `Double`:

    val rangeD = DenseVector.rangeD(0.5, 20, 2.5) // DenseVector(0.5, 3.0, 5.5, 8.0, 10.5, 13.0, 15.5)

##### Creating an entire vector with a single value

Filling an entire vector with the same value is child's play. We just specify how big the vector is going to be and what value to fill it with. That's it.

    val denseJust2s = DenseVector.fill(10, 2) // DenseVector(2, 2, 2, 2, 2, 2, 2, 2, 2, 2)

##### Slicing a sub-vector from a bigger vector

Choosing a part of an existing vector is just a matter of calling the `slice` method on the bigger vector. The parameters to be passed are the start index, the end index, and an optional "step" parameter. The step value is added to the index on every iteration until the end index is reached. Note that the end index is excluded from the sub-vector:

    val allNosTill10 = DenseVector.range(0, 10) // DenseVector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
    val fourThroughSevenIndexVector = allNosTill10.slice(4, 7) // DenseVector(4, 5, 6)
    val twoThroughNineSkip2IndexVector = allNosTill10.slice(2, 9, 2) // DenseVector(2, 4, 6)

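##### Creating a Breeze Vector from a Scala Vector

A Breeze vector object's `apply` method can even accept a Scala `Vector` as a parameter. Note from the output that the Scala `Vector` is stored as a single element, so the result is a one-element `DenseVector` that holds the whole collection. To obtain a `DenseVector[Int]` with one element per value, pass an array instead, for example `DenseVector(scalaVector.toArray)`, where `scalaVector` stands for the Scala collection:

    val vectFromArray = DenseVector(collection.immutable.Vector(1, 2, 3, 4)) // DenseVector(Vector(1, 2, 3, 4))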

#### Vector arithmetic

Now let's look at the basic arithmetic that we could do on vectors with scalars and vectors.

##### Scalar operations

Operations with scalars work just as we would expect, propagating the value to each element in the vector.

Adding a scalar to each element of the vector is done using the `+` function (surprise!):

    val inPlaceValueAddition = evenNosTill20 + 2 // DenseVector(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)

Similarly, the other basic arithmetic operations (subtraction, multiplication, and division) involve calling the respective functions named after the universally accepted symbols (`-`, `*`, and `/`):

    // Scalar subtraction
    val inPlaceValueSubtraction = evenNosTill20 - 2 // DenseVector(-2, 0, 2, 4, 6, 8, 10, 12, 14, 16)
    // Scalar multiplication
    val inPlaceValueMultiplication = evenNosTill20 * 2 // DenseVector(0, 4, 8, 12, 16, 20, 24, 28, 32, 36)
    // Scalar division
    val inPlaceValueDivision = evenNosTill20 / 2 // DenseVector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
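One thing worth keeping in mind with the division above: the results are `Int` values, and the example divides only even numbers, so nothing is lost. The sketch below uses plain Scala arrays (not Breeze) to show what integer division does to odd values, on the assumption that an `Int` vector applies the element type's own division, which is consistent with the `Int` output shown above.

```scala
val ints = Array(1, 2, 3, 4, 5)

// Integer division truncates: 0, 1, 1, 2, 2
val intHalves = ints.map(_ / 2)

// Converting to Double first keeps the fractional part: 0.5, 1.0, 1.5, 2.0, 2.5
val doubleHalves = ints.map(_.toDouble / 2)
```

If fractional results matter, converting the vector to `Double` before dividing avoids the truncation.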

##### Calculating the dot product of two vectors

Each vector object has a function called `dot`, which accepts another vector of the same length as a parameter.

Let's fill a new vector of length `5` with just `2`s:

    val justFive2s = DenseVector.fill(5, 2) // DenseVector(2, 2, 2, 2, 2)

We'll create another vector from `0` to `5` with a step value of `1` (a fancy way of saying `0` through `4`):

    val zeroThrough4 = DenseVector.range(0, 5, 1) // DenseVector(0, 1, 2, 3, 4)

Here's the `dot` function:

    val dotVector = zeroThrough4.dot(justFive2s) // Int = 20

As expected, the function complains if we pass in a vector of a different length as a parameter to the dot product: Breeze throws an `IllegalArgumentException`. The full exception message is:

    java.lang.IllegalArgumentException: Vectors must be the same length!
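For readers who want to see the arithmetic behind the result of `20`, the dot product is the sum of the pairwise products. The function `dotSketch` below is a hypothetical stand-in written with plain arrays; it is not Breeze's implementation, and it only mimics the length check described above.

```scala
// Hypothetical stand-in for illustration; Breeze's dot is implemented differently.
def dotSketch(a: Array[Int], b: Array[Int]): Int = {
  // require throws an IllegalArgumentException when the lengths differ
  require(a.length == b.length, "Vectors must be the same length!")
  a.zip(b).map { case (x, y) => x * y }.sum
}

val zeroThrough4Arr = Array.range(0, 5) // 0, 1, 2, 3, 4
val justFive2sArr = Array.fill(5)(2)    // 2, 2, 2, 2, 2

// 0*2 + 1*2 + 2*2 + 3*2 + 4*2 = 20
val dotValue = dotSketch(zeroThrough4Arr, justFive2sArr)
```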

##### Creating a new vector by adding two vectors together

The `+` function is overloaded to accept a vector as well as the scalar we saw previously. The operation performs an element-by-element addition and creates a new vector:

    val evenNosTill20 = DenseVector.range(0, 20, 2) // DenseVector(0, 2, 4, 6, 8, 10, 12, 14, 16, 18)
    val denseJust2s = DenseVector.fill(10, 2) // DenseVector(2, 2, 2, 2, 2, 2, 2, 2, 2, 2)
    val additionVector = evenNosTill20 + denseJust2s // DenseVector(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)

There's an interesting behavior encapsulated in the addition, though. If you add two vectors of different lengths and the first vector is the smaller one, the resulting vector has the size of the first vector and the remaining elements of the second vector are ignored!

    val fiveLength = DenseVector(1, 2, 3, 4, 5) // DenseVector(1, 2, 3, 4, 5)
    val tenLength = DenseVector.fill(10, 20) // DenseVector(20, 20, 20, 20, 20, 20, 20, 20, 20, 20)
    fiveLength + tenLength // DenseVector(21, 22, 23, 24, 25)

On the other hand, if the first vector is larger and the second vector smaller, the addition results in an `ArrayIndexOutOfBoundsException`:

    tenLength + fiveLength // java.lang.ArrayIndexOutOfBoundsException: 5

#### Appending vectors and converting a vector of one type to another

Let's briefly see how to append two vectors and convert vectors of one numeric type to another.

##### Concatenating two vectors

There are two variants of concatenation. There is a `vertcat` function that vertically concatenates an arbitrary number of vectors; the size of the resulting vector is the sum of the sizes of all the vectors combined:

    val justFive2s = DenseVector.fill(5, 2) // DenseVector(2, 2, 2, 2, 2)
    val zeroThrough4 = DenseVector.range(0, 5, 1) // DenseVector(0, 1, 2, 3, 4)
    val concatVector = DenseVector.vertcat(zeroThrough4, justFive2s) // DenseVector(0, 1, 2, 3, 4, 2, 2, 2, 2, 2)

No surprise here. There is also the `horzcat` method that places the second vector horizontally next to the first vector, thus forming a matrix:

    val concatVector1 = DenseVector.horzcat(zeroThrough4, justFive2s)
    // breeze.linalg.DenseMatrix[Int] =
    // 0  2
    // 1  2
    // 2  2
    // 3  2
    // 4  2

### Note

While dealing with vectors of different lengths, the `vertcat` function happily arranges the second vector at the bottom of the first vector. Not surprisingly, the `horzcat` function throws a `java.lang.IllegalArgumentException`, since all vectors must be of the same size!

##### Converting a vector of Int to a vector of Double

Breeze does not convert one type of vector into another automatically. However, there is a simple way to achieve this:

    val evenNosTill20Double = breeze.linalg.convert(evenNosTill20, Double)

#### Computing basic statistics

Other than the creation and the arithmetic operations that we saw previously, there are some interesting summary statistics operations available in the library.

### Note

This section needs imports of `breeze.linalg._` and `breeze.numerics._`, plus `breeze.stats._` for `meanAndVariance` and `stddev`. The operations in the Other operations section aim to simulate NumPy's `UFunc`, or universal functions.

Now, let's briefly look at how to calculate some basic summary statistics for a vector.

##### Mean and variance

Calculating the mean and variance of a vector can be achieved by calling the `meanAndVariance` universal function in the `breeze.stats` package. Note that this needs a vector of `Double`:

    meanAndVariance(evenNosTill20Double) // MeanAndVariance(9.0,36.666666666666664,10)

### Note

As you may have guessed, converting an `Int` vector to a `Double` vector and calculating the mean and variance for that vector can be merged into a one-liner:

    meanAndVariance(convert(evenNosTill20, Double))

##### Standard deviation

Calling `stddev` on a `Double` vector gives the standard deviation:

    stddev(evenNosTill20Double) // Double = 6.0553007081949835
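The figures reported for the variance and the standard deviation are consistent with the sample statistics, which divide by `n - 1` rather than `n`. The following sketch recomputes them by hand with the standard library; the value names are chosen for illustration and nothing here calls Breeze.

```scala
// The same ten even numbers as evenNosTill20Double: 0.0, 2.0, ..., 18.0
val data = Array.tabulate(10)(i => (2 * i).toDouble)
val n = data.length

// Mean: 90 / 10 = 9.0
val mean = data.sum / n

// Sample variance: squared deviations sum to 330, divided by n - 1 = 9
val sampleVariance = data.map(x => (x - mean) * (x - mean)).sum / (n - 1)

// Sample standard deviation: square root of the sample variance
val sampleStdDev = math.sqrt(sampleVariance)
```

Dividing by `n` instead would give a variance of 33.0, which does not match the output above.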

##### Finding the largest value in a vector

The `max` universal function inside the `breeze.linalg` package helps us find the maximum value in a vector:

    val intMaxOfVectorVals = max(evenNosTill20) // 18

##### Finding the sum, square root and log of all the values in the vector

As with `max`, the `sum` universal function inside the `breeze.linalg` package calculates the sum of the vector:

    val intSumOfVectorVals = sum(evenNosTill20) // 90

The functions `sqrt`, `log`, and various other universal functions in the `breeze.numerics` package calculate the square root and log values of all the individual elements inside the vector:

###### The Sqrt function

    val sqrtOfVectorVals = sqrt(evenNosTill20)
    // DenseVector(0.0, 1.4142135623730951, 2.0, 2.449489742783178, 2.8284271247461903,
    //   3.1622776601683795, 3.4641016151377544, 3.7416573867739413, 4.0, 4.242640687119285)

###### The Log function

    val log2VectorVals = log(evenNosTill20)
    // DenseVector(-Infinity, 0.6931471805599453, 1.3862943611198906, 1.791759469228055, 2.0794415416798357,
    //   2.302585092994046, 2.4849066497880004, 2.6390573296152584, 2.772588722239781, 2.8903717578961645)
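Despite the name `log2VectorVals`, the values in the output are natural logarithms: the entry for `2` is about 0.6931, which is ln 2, not 1. The sketch below reproduces the first few entries with `math.log` from the standard library and shows one way to obtain base-2 values; it is an illustration with plain arrays, not a Breeze call.

```scala
val evens = Array(0, 2, 4)

// Natural logarithm of each element; log(0) is negative infinity
val naturalLogs = evens.map(x => math.log(x.toDouble))

// Base-2 logarithm via the change-of-base rule: log2(x) = ln(x) / ln(2)
val base2Logs = evens.map(x => math.log(x.toDouble) / math.log(2.0))
```

The leading `-Infinity` in the output above comes from the zero at the start of the vector.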