Book Image

Scala Programming Projects

By : Mikael Valot, Nicolas Jorand
Book Image

Scala Programming Projects

By: Mikael Valot, Nicolas Jorand

Overview of this book

Scala Programming Projects is a comprehensive project-based introduction for those who are new to Scala. Complete with step-by-step instructions and easy-to-follow tutorials that demonstrate best practices when building applications, this Scala book will have you building real-world projects in no time. Starting with the fundamentals of software development, you’ll begin with simple projects, such as developing a financial independence calculator, and then advance to more complex projects, such as a building a shopping application and a Bitcoin transaction analyzer. You’ll explore a variety of Scala features, including its OOP and FP capabilities, and learn how to write concise, reactive, and concurrent applications in a type-safe manner. You’ll also understand how to use libraries such as Akka and Play. Furthermore, you’ll be able to integrate your Scala apps with Kafka, Spark, and Zeppelin, along with deploying applications on a cloud platform. By the end of the book, you’ll have a firm foundation in Java programming that’ll enable you to solve a variety of real-world problems, and you’ll have built impressive projects to add to your professional portfolio.
Table of Contents (18 chapters)
Title Page
Copyright and Credits
Packt Upsell
Contributors
Preface
Index

Using the Scala Console and Worksheet


By now, all the necessary tools and libraries should be installed. Let's start to play with the basics of Scala by experimenting in different environments. The simplest way to try Scala is to use the Scala Console. Subsequently, we will introduce the Scala Worksheet, which allows you to keep all the instructions that are entered in a file.

Using the Scala Console

The Scala console, also called Scala REPL (short for Read-Eval-Print-Loop), allows you to execute bits of code without having to compile them beforehand. It is a very convenient tool to experiment with the language or when you want to explore the capabilities of a library.

In the console, type 1+1 after the scala> prompt and hit Ctrl + Enterorcmd + Enter:

scala> 1+1

The console displays the result of the evaluation, like so:

res0: Int = 2

What happened here? The REPL compiled, evaluated the expression 1+1, and automatically assigned it to a variable named res0. This variable is of type Int, and its value is 2.

Declaring variables

In Scala, a variable can be declared using val or var. A val is immutable, which means you can never change its value. A var is mutable. It is not mandatory to declare the type of the variable. If you do not declare it, Scala will infer it for you.

Let's define some immutable variables:

Note

In all the following code examples, you only need to type the code that is after the Scala Command Prompt, and hit Ctrl + Enter or cmd + return to evaluate. We show the result of the evaluation underneath the prompt, as it would appear on your screen.

scala> val x = 1 + 1
x: Int = 2

scala> val y: Int = 1 + 1
y: Int = 2

In both cases, the type of the variable is Int. The type of x was inferred by the compiler to be Int. The type of y was explicitly specified with : Int after the name of the variable.

We can define a mutable variable and modify it as follows:

scala> var x = 1
x: Int = 1

scala> x = 2
x: Int = 2

It is a good practice to use val in most situations. Whenever I see a val declared, I know that its content will never change subsequently. It helps to reason about a program, especially when multiple threads are running. You can share an immutable variable across multiple threads without fearing that one thread might see a different value at some point. Whenever you see a Scala program using var, it should make you raise an eyebrow: the programmer should have a good reason to use a mutable variable, and it should be documented.

If we attempt to modify a val, the compiler will raise an error message:

scala> val y = 1
y: Int = 1

scala> y = 2
<console>:12: error: reassignment to val
       y = 2
         ^

This is a good thing: the compiler helps us make sure that no piece of code can ever modify a val.

Types

We saw in the previous examples that Scala expressions have a type. For instance, the value1 is of type Int, and the expression 1+1 is also of type Int. A type is a classification of data and provides a finite or infinite set of values. An expression of a given type can take any of its provided values.

Here are a few examples of types available in Scala:

  • Int provides a finite set of values, which are all the integers between -231 and 231-1.
  • Boolean provides a finite set of two values: true and false.
  • Double provides a finite set of values: all the 64 bits and IEEE-754 floating point numbers.
  • String provides an infinite set of values: all the sequence of characters are of an arbitrary length. For instance, "Hello World" or "Scala is great !".

A type determines the operations that can be performed on the data. For instance, you can use the + operator with two expressions of type Int or String, but not with expressions of type Boolean:

scala> val str = "Hello" + "World"
str: String = HelloWorld

scala> val i = 1 + 1
i: Int = 2

scala> val b = true + false
<console>:11: error: type mismatch;
 found   : Boolean(false)

When we attempt to use an operation on a type that does not support it, the Scala compiler complains of a type mismatch error.

An important feature of Scala is that it is a statically typed language. This means that the type of a variable or expression is known at compile time. The compiler will also check that you do not call an operation or function that is not legal for this type. This helps tremendously to reduce the number of bugs that can occur at runtime (when running a program).

As we saw earlier, the type of an expression can be specified explicitly with : followed by the name of the type, or in many cases, it can be automatically inferred by the compiler.

If you are not used to working with statically typed languages, you might get frustrated to have to fight with the compiler to make it accept your code, but you will gradually get more accustomed to the kind of errors thrown at you and how to resolve them. You will soon find that the compiler is not an enemy that prevents you from running your code; it is acting more like a good friend that shows you what logical errors you have made and gives you some indication on how to resolve them.

People coming from dynamically typed languages such as Python, or people coming from not-as-strongly statically typed language such as Java or C++, are often astonished to see that a Scala program that compiles has a much higher probability of being correct on the first run.

Note

IntelliJ can automatically add the inferred type to your definitions. For instance, type val a = 3 in the Scala console, then move the cursor at the beginning of the a. You should see a light bulb icon. When you click on it, you will see a hint add type annotation to value definition. Click on it, and IntelliJ will add : Int after the a. Your definition will become val a: Int = 3.

Declaring and calling functions

A Scala function takes 0 to n parameters and returns a value. The type of each parameter must be declared. The type of the returned value is optional, as it is inferred by the Scala compiler when not specified. However, it is a good practice to always specify the return type, as it makes the code more readable:

scala> def presentation(name: String, age: Int): String = 
  "Hello, my name is " + name + ". I am " + age + " years old."
presentation: (name: String, age: Int)String

scala> presentation(name = "Bob", age = 25)
res1: String = Hello, my name is Bob. I am 25 years old.

scala> presentation(age = 25, name = "Bob")

res2: String = Hello, my name is Bob. I am 25 years old.

We can call a function by passing arguments in the right order, but we can also name the arguments and pass them in any order. It is a good practice to name the arguments when some of them have the same type, or when a function takes many arguments. It avoids passing the wrong argument and improves readability.

Side effects

A function or expression is said to have a side effect when it modifies some state or has some action in the outside world. For instance, printing a string to the console, writing to a file, and modifying a var, are all side effects.

In Scala, all expressions have a type. A statement which performs a side effect is of type Unit. The only value provided by the type Unit is ():

scala> val x = println("hello")
hello
x: Unit = ()

scala> def printName(name: String): Unit = println(name)
printName: (name: String)Unit

scala> val y = {
  var a = 1
  a = a+1
}
y: Unit = ()

scala> val z = ()
z: Unit = ()

A pure function is a function whose result depends only on its arguments, and that does not have any observable side effect. Scala allows you to mix side-effecting code with pure code, but it is a good practice to push side-effecting code to the boundaries of your application. We will talk about this later in more detail in the Ensuring referential transparency section in Chapter 3, Handling Errors.

Note

Good practice: When a function with no parameters has side effects, you should declare it and call it with empty brackets (). It informs users of your function that it has side effects. Conversely, a pure function with no parameters should not have empty brackets, and should not be called with empty brackets. IntelliJ helps you in keeping some consistency: it will display a warning if you call a parameterless function with (), or if you omit the () when you call a function declared with ().

Here is an example of a method call with a side effect where we have to use empty brackets, and an example of a pure function:

scala> def helloWorld(): Unit = println("Hello world")
helloWorld: ()Unit

scala> helloWorld()
Hello world

scala> def helloWorldPure: String = "Hello world"
helloWorldPure: String

scala> val x = helloWorldPure
x: String = Hello world

If...else expression

In Scala, if (condition) ifExpr else if ifExpr2 else elseExpr is an expression, and has a type. If all sub-expressions have a type A, the type of the if ... else expression will be A as well:

scala> def agePeriod(age: Int): String = {
  if (age >= 65)
    "elderly"
  else if (age >= 40 && age < 65)
    "middle aged"
  else if (age >= 18 && age < 40)
    "young adult"
  else
    "child"
}
agePeriod: (age: Int)String

 If sub-expressions have different types, the compiler will infer a common super-type, or widen the type if it is a numeric type:

scala> val ifElseWiden = if (true) 2: Int else 2.0: Double
ifElseWiden: Double = 2.0

scala> val ifElseSupertype = if (true) 2 else "2"
ifElseSupertype: Any = 2

In the first expression present in the preceding code, the first sub-expression is of type Int and the second is of type Double. The type of ifElseWiden is widened to be Double.

In the second expression, the type of ifElseSupertype is Any, which is the common super-type for Int and String.

An if without an else is equivalent to if (condition) ifExpr else (). It is better to always specify the else expression, otherwise, the type of the if/else expression might not be the one we expect:

scala> val ifWithoutElse = if (true) 2
ifWithoutElse: AnyVal = 2

scala> val ifWithoutElseExpanded = if (true) 2: Int else (): Unit
ifWithoutElseExpanded: AnyVal = 2

scala> def sideEffectingFunction(): Unit = if (true) println("hello world")
sideEffectingFunction: ()Unit

In the preceding code, the common super-type between Int and Unit is AnyVal. This can be a bit surprising. In most situations, you would want to avoid that. 

Class

We mentioned earlier that all Scala expressions have a type. A class is a sort of template that can create objects of a specific type. When we want to obtain a value of a certain type, we can instantiate a new object using new followed by the class name:

scala> class Robot
defined class Robot

scala> val nao = new Robot
nao: Robot = Robot@78318ac2

The instantiation of an object allocates a portion of heap memory in the JVM. In the preceding example, the value nao is actually a reference to the portion of heap memory that keeps the content of our new Robot object. You can observe that when the Scala console printed the variable nao, it outputted the name of the class, followed by @78318ac2. This hexadecimal number is, in fact, the memory address of where the object is stored in the heap.

The eq operator can be handy to check if two references are equal. If they are equal, this means that they point to the same portion of memory:

scala> val naoBis = nao
naoBis: Robot = Robot@78318ac2

scala> nao eq naoBis
res0: Boolean = true

scala> val johnny5 = new Robot
johnny5: Robot = Robot@6b64bf61

scala> nao eq johnny5
res1: Boolean = false

A class can have zero to many members. A member can be either:

  • An attribute, also called a field. It is a variable whose content is unique to each instance of the class.
  • A method. This is a function that can read and/or write the attributes of the instance. It can have additional parameters.

Here is a class that defines a few members:

scala> class Rectangle(width: Int, height: Int) {
  val area: Int = width * height
  def scale(factor: Int): Rectangle = new Rectangle(width * factor, height * factor)
}
defined class Rectangle

The attributes declared inside the brackets () are a bit special: they are constructor arguments, which means that their value must be specified when we instantiate a new object of the class. The other members must be defined inside the curly brackets {}. In our example, we defined four members:

  • Two attributes that are constructor arguments: width and height.
  • One attribute, area. Its value is defined when an instance is created by using the other attributes.
  • One method, scale, which uses the attributes to create a new instance of the class Rectangle.

You can call a member on an instance of a class by using the postfix notation myInstance.member. Let's create a few instances of our class and try to call the members:

scala> val square = new Rectangle(2, 2)
square: Rectangle = Rectangle@2af9a5ef

scala> square.area
res0: Int = 4

scala> val square2 = square.scale(2)
square2: Rectangle = Rectangle@8d29719

scala> square2.area
res1: Int = 16

scala> square.width
<console>:13: error: value width is not a member of Rectangle
       square.width

We can call the members area and scale, but not width. Why is that?

This is because, by default, constructor arguments are not accessible from the outside world. They are private to the instance and can only be accessed from the other members. If you want to make the constructor arguments accessible, you need to prefix them with val:

scala> class Rectangle(val width: Int, val height: Int) {
  val area: Int = width * height
  def scale(factor: Int): Rectangle = new Rectangle(width * factor, height * factor)
}
defined class Rectangle

scala> val rect = new Rectangle(3, 2)
rect: Rectangle = Rectangle@3dbb7bb

scala> rect.width
res3: Int = 3

scala> rect.height
res4: Int = 2

This time, we can get access to the constructor arguments. Note that you can declare attributes using var instead of val. This would make your attribute modifiable. However, in functional programming, we avoid mutating variables. A var attribute in a class is something that should be used cautiously in specific situations. An experienced Scala programmer would flag it immediately in a code review and its usage should be always justified in a code comment.

If you need to modify an attribute, it is better to return a new instance of the class with the modified attribute, as we did in the preceding Rectangle.scale method.

Note

You might worry that all these new objects will consume too much memory. Fortunately, the JVM has a mechanism known as the garbage collector. It automatically frees up the memory used by objects that are not referenced by any variable.

Using the worksheet

IntelliJ offers another handy tool to experiment with the language: the Scala worksheet.

Go to File | New | Scala Worksheet. Name it worksheet.sc. You can then enter some code on the left-hand side of the screen. A red/green indicator in the top right corner shows you if the code you are typing is valid or not. As soon as it compiles, the results appear on the right-hand side:

 

You will notice that nothing gets evaluated until your whole worksheet compiles.

Class inheritance

Scala classes are extensible. You can extend an existing class to inherit from all its members. If B extends A, we say that B is a subclass of A, a derivation of B, or a specialization of B. A is a superclass of B or a generalization of B.

Let's see how it works in an example. Type the following code in the worksheet:

class Shape(val x: Int, val y: Int) {
val isAtOrigin: Boolean = x == 0 && y == 0
}

class Rectangle(x: Int, y: Int, val width: Int, val height: Int)
extends Shape(x, y)

class Square(x: Int, y: Int, width: Int)
extends Rectangle(x, y, width, width)

class Circle(x: Int, y: Int, val radius: Int)
extends Shape(x, y)

val rect = new Rectangle(x = 0, y = 3, width = 3, height = 2)
rect.x
rect.y
rect.isAtOrigin
rect.width
rect.height

The classes Rectangle and Circle are subclasses of Shape. They inherit from all the members of Shape: x, y, and isAtOrigin. This means that when I instantiate a new Rectangle, I can call members declared in Rectangle, such as width and height, and I can also call members declared in Shape.

When declaring a subclass, you need to pass the constructor arguments of the superclass, as if you were instantiating it. As Shape declares two constructor parameters, x and y, we have to pass them in the declaration extends Shape(x, y). In this declaration, x and y are themselves the constructor arguments of Rectangle. We just passed these arguments up the chain.

Notice that in the subclasses, the constructor parameters x and y are declared without val. If we had declared them with val, they would have been promoted as publicly available attributes. The problem is that Shape also has x and y as public attributes. In this situation, the compiler would have raised a compilation error to highlight the conflict.

Subclass assignment

Consider two classes, A and B, with B extends A.

When you declare a variable of type A, you can assign it to an instance of B, with val a: A = new B.

On the other hand, if you declare a variable of type B, you cannot assign it to an instance of A.

Here is an example that uses the same Shape and Rectangle definitions that were described earlier:

val shape: Shape = new Rectangle(x = 0, y = 3, width = 3, height = 2)
val rectangle: Rectangle = new Shape(x = 0, y = 3)

The first line compiles because Rectangleis aShape.

The second line does not compile, because not all shapes are rectangles.

Overriding methods

When you derive a class, you can override the members of the superclass to provide a different implementation. Here is an example that you can retype in a new worksheet:

class Shape(val x: Int, val y: Int) {
def description: String = s"Shape at (" + x + "," + y + ")"
}

class Rectangle(x: Int, y: Int, val width: Int, val height: Int)
extends Shape(x, y) {
override def description: String = {
super.description + s" - Rectangle " + width + " * " + height
  }
}

val rect = new Rectangle(x = 0, y = 3, width = 3, height = 2)
rect.description

When you run the worksheet, it evaluates and prints the following description on the right-hand side:

res0: String = Shape at (0,3) - Rectangle 3 * 2

We defined a method description on the class Shape that returns a String. When we call rect.description, the method called is the one defined in the class Rectangle, because Rectangle overrides the method description with a different implementation.

The implementation of description in the class Rectangle refers to super.description. super is a keyword that lets you use the members of the superclass without taking into account any overriding. In our case, this was necessary so that we could use the super reference, otherwise, description would have called itself in an infinite loop!

On the other hand, the keyword this allows you to call the members of the same class. Change Rectangle to add the following methods:

class Rectangle(x: Int, y: Int, val width: Int, val height: Int)
extends Shape(x, y) {
override def description: String = {
super.description + s" - Rectangle " + width + " * " + height
  }

def descThis: String = this.description
def descSuper: String = super.description
}

val rect = new Rectangle(x = 0, y = 3, width = 3, height = 2)
rect.description
rect.descThis
rect.descSuper

When you evaluate the worksheet, it prints the following strings:

res0: String = Shape at (0,3) - Rectangle 3 * 2
res1: String = Shape at (0,3) - Rectangle 3 * 2
res2: String = Shape at (0,3)

The call to this.description used the definition of description, as declared in the class Rectangle, whereas the call to super.description used the definition of description, as declared in the class Shape.

Abstract class

An abstract class is a class that can have many abstract members. An abstract member defines only a signature for an attribute or a method, without providing any implementation. You cannot instantiate an abstract class: you must create a subclass that implements all the abstract members.

Replace the definition of Shape and Rectangle in the worksheet as follows:

abstract class Shape(val x: Int, val y: Int) {
val area: Double
def description: String
}

class Rectangle(x: Int, y: Int, val width: Int, val height: Int)
extends Shape(x, y) {

val area: Double = width * height

def description: String =
"Rectangle " + width + " * " + height
}

Our class Shape is now abstract. We cannot instantiate a Shape class directly anymore: we have to create an instance of Rectangle or any of the other subclasses of Shape. Shape defines two concrete members, x and y, and two abstract members, area and description. The subclass, Rectangle, implements the two abstract members.

Note

You can use the prefix override when implementing an abstract member, but it is not necessary. I recommend not adding it to keep the code less cluttered. Also, if you subsequently implement the abstract method in the superclass, the compiler will help you find all subclasses that had an implementation. It will not do this if they use override

Trait

A trait is similar to an abstract class: it can declare several abstract or concrete members and can be extended. It cannot be instantiated. The difference is that a given class can only extend one abstract class, however, it can mixin one to many traits. Also, a trait cannot have constructor arguments.

For instance, we can declare several traits, each declaring different abstract methods, and mixin them all in the Rectangle class:

trait Description {
def description: String
}

trait Coordinates extends Description {
def x: Int
def y: Int

def description: String =
"Coordinates (" + x + ", " + y + ")"
}

trait Area {
def area: Double
}

class Rectangle(val x: Int,
val y: Int,
val width: Int,
val height: Int)
extends Coordinates with Description with Area {

val area: Double = width * height

override def description: String =
super.description + " - Rectangle " + width + " * " + height
}

val rect = new Rectangle(x = 0, y = 3, width = 3, height = 2)
rect.description

The following string gets printed when evaluating rect.description:

res0: String = Coordinates (0, 3) - Rectangle 3 * 2

The class Rectangle mixes in the traits Coordinates, Description, and Area. We need to use the keyword extends before trait or class, and the keyword with for all subsequent traits.

Notice that the Coordinates trait also mixes the Description trait, and provides a default implementation. As we did when we had a Shape class, we override this implementation in Rectangle, and we can still call super.description to refer to the implementation of description in the trait Coordinates.

Another interesting point is that you can implement an abstract method with val – in trait Area, we defined def area: Double, and implemented it in Rectangle using val area: Double. It is a good practice to define abstract members with def. This way, the implementer of the trait can decide whether to define it by using a method or a variable.

Scala class hierarchy

All Scala types extend a built-in type called Any. This type is the root of the hierarchy of all Scala types. It has two direct subtypes:

  • AnyVal is the root class of all value types. These types are represented as primitive types in the JVM.
  • AnyRef is the root class of all object types. It is an alias for the class java.lang.Object.
  • A variable of type AnyVal directly contains the value, whereas a variable of type AnyRef contains the address of an object stored somewhere in memory.

The following diagram shows a partial view of this hierarchy:

When you define a new class, it indirectly extends AnyRef. This being an alias for java.lang.Object, your class inherits from all the default methods implemented in Object. Its most important methods are as follows:

  • def toString: String returns a string representation of an object. This method is called whenever you print an object using println. The default implementation returns the class's name followed by the address of the object in memory.
  • def equals(obj: Object): Boolean returns true if the object is equal to another object, and false otherwise. This method is called whenever you compare two objects using ==. The default implementation only compares the objects' references, and hence is equivalent to eq. Fortunately, most classes from the Java and Scala SDK override this method to provide a good comparison. For instance, the class java.lang.String overrides the equals method to compare the content of the strings, character by character. Therefore, when you compare two strings with ==, the result will be true if the strings are the same, even if they are stored in different places in memory.
  • def hashCode: Int is called whenever you put an object in Set or if you use it as a key in Map. The default implementation is based on the address of the object. You can override this method if you want to have a better distribution of the data in Set or Map, which can improve the performance of these collections. However, if you do so, you must make sure that hashCode is consistent with equals: if two objects are equal, their hashCodes must also be equal.

It would be very tedious to have to override these methods for all your classes. Fortunately, Scala offers a special construct called case class that will automatically override these methods for us.

Case class

In Scala, we define most data structures using case classes. case class has one to many immutable attributes and provides several built-in functions compared to a standard class.

Type the following into the worksheet:

case class Person(name: String, age: Int)
val mikaelNew = new Person("Mikael", 41)
// 'new' is optional
val mikael = Person("Mikael", 41)
// == compares values, not references
mikael == mikaelNew
// == is exactly the same as .equals
mikael.equals(mikaelNew)

val name = mikael.name

// a case class is immutable. The line below does not compile:
//mikael.name = "Nicolas"
// you need to create a new instance using copy
val nicolas = mikael.copy(name = "Nicolas")

In the preceding code, the text following // is a comment that explains the preceding statement.

When you declare a class as case class, the Scala compiler automatically generates a default constructor, an equals and hashCode method, a copy constructor, and an accessor for each attribute.

Here is a screenshot of the worksheet we have. You can see the results of the evaluations on the right-hand side:

Companion object

A class can have a companion object. It must be declared in the same file as the class, using the keyword object followed by the name of the class it is accompanying. A companion object is a singleton – there is only one instance of this object in the JVM. It has its own type and is not an instance of the accompanied class.

This object defines static functions or values that are closely related to the class it is accompanying. If you are familiar with Java, it replaces the keyword static: in Scala, all static members of a class are declared inside the companion object.

Some functions in the companion object have a special meaning. Functions named apply are constructors of the class. The name apply can be omitted when we call them:

case class City(name: String, urbanArea: Int)
object City {
  val London = City("London", 1738)
  val Lausanne = City("Lausanne", 41)
}

case class Person(firstName: String, lastName: String, city: City)
object Person {
  def apply(fullName: String, city: City): Person = {
    val splitted = fullName.split(" ")
    new Person(firstName = splitted(0), lastName = splitted(1), city = city)
  }
}

// Uses the default apply method
val m1 = Person("Mikael", "Valot", City.London)
// Call apply with fullName
val m2 = Person("Mikael Valot", City.London)
// We can omit 'apply'
val n = Person.apply("Nicolas Jorand", City.Lausanne)

In the preceding code, we defined a companion object for the class City, which defines some constants. The convention for constants is to have the first letter in uppercase.

The companion object for the class Person defines an additional apply function that acts as a constructor. Its implementation calls the method split(" "), which splits a string separated by spaces to produce an array of type string. It allows us to construct a Person instance using a single string where the first name and last name are separated by a space. We then demonstrated that we can either call the default apply function that comes with the case class, or the one we implemented.