Modern Scala Projects

By: Ilango Gurusamy

Overview of this book

Scala is both a functional and an object-oriented programming language, designed to express common programming patterns in a concise, readable, and type-safe way. Complete with step-by-step instructions, Modern Scala Projects will guide you in exploring Scala's capabilities and learning best practices. Along the way, you'll build applications for professional contexts while understanding the core tasks and components. You'll begin with a project that predicts the class of a flower by implementing a simple machine learning model. Next, you'll create a cancer diagnosis classification pipeline, followed by projects on stock price prediction, spam filtering, fraud detection, and a recommendation engine. The focus is on applying ML techniques that classify data and make predictions, with an emphasis on automating data workflows with the Spark ML pipeline API. The book also showcases the best of Scala's functional libraries and other constructs to help you roll out your own scalable data processing frameworks. By the end of this Scala book, you'll have a firm foundation in Scala programming and will have built some interesting real-world projects to add to your portfolio.

Getting started


In this section, we set up an implementation infrastructure or reuse the existing infrastructure from previous chapters. The following upgrades to your infrastructure are optional but recommended.

Starting in Chapter 3, Stock Price Predictions, we set up the Hortonworks Data Platform (HDP) Sandbox as a virtual machine. Three kinds of (isolated) HDP Sandbox deployments are possible; we will cover only two of them:

  • A virtual machine environment (with a hypervisor) for Sandbox deployment: The HDP Sandbox runs in an Oracle VirtualBox virtual machine.
  • A cloud-based environment for Sandbox deployment: This option is attractive for users whose host machine has limited memory. The Sandbox runs in the cloud rather than in a virtual machine on your host machine.

That said, you can always run the fraud detection system code in the Spark shell. You have two options here:

  • Use Simple Build Tool...