Scala for Machine Learning

By: R. Nicolas

Overview of this book

Are you curious about AI? All you need is a good understanding of the Scala programming language, a basic knowledge of statistics, a keen interest in Big Data processing, and this book!

CRF and text analytics

Most of the examples used to demonstrate the capabilities of conditional random fields relate to text mining, intrusion detection, or bioinformatics. Although these applications have great commercial merit, they are not suitable as an introductory test case because they usually require a lengthy description of the model and the training process.

The feature functions model

For our example, we will select a simple problem: how to collect and aggregate analysts' recommendations on any given stock from different sources that use different formats.

Analysts at brokerage firms and investment funds routinely publish lists of recommendations or ratings for stocks. These analysts use different rating schemes: buy/hold/sell, letter grades (A, B, C), star ratings, or market perform/neutral/market underperform. For this example, the ratings are normalized as follows:

  • 0 for strong sell (F or 1-star rating)
  • 1 for sell (D, 2 stars, market underperform)
  • 2 for neutral...
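
The normalization listed above can be sketched as a simple lookup in Scala. This is only an illustrative sketch, not the book's implementation: the object name RatingNormalizer, the method normalize, and the exact label spellings (such as "1 star") are assumptions. Only the three levels shown in the list are covered; any other label returns None.

```scala
// Hypothetical helper (not from the book): maps a raw analyst rating
// label to its normalized integer score.
object RatingNormalizer {
  // Assumed label spellings; only the levels listed in the text are included.
  private val scores: Map[String, Int] = Map(
    "strong sell" -> 0, "f" -> 0, "1 star" -> 0,
    "sell" -> 1, "d" -> 1, "2 stars" -> 1, "market underperform" -> 1,
    "neutral" -> 2
  )

  // Case-insensitive lookup; unknown labels yield None rather than a guess.
  def normalize(label: String): Option[Int] =
    scores.get(label.trim.toLowerCase)
}
```

For instance, RatingNormalizer.normalize("Market Underperform") returns Some(1), while a label outside the map returns None. Returning an Option keeps ratings from sources with unrecognized formats from being silently coerced into a score.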