Introduction to R for Business Intelligence

By: Jay Gendron

Overview of this book

Explore the world of Business Intelligence through the eyes of an analyst working in a successful and growing company. Learn R through use cases supporting different functions within that company. This book provides data-driven and analytically focused approaches to help you answer questions in operations, marketing, and finance. In Part 1, you will learn about extracting data from different sources, cleaning that data, and exploring its structure. In Part 2, you will explore predictive models and cluster analysis for Business Intelligence and analyze financial time series. Finally, in Part 3, you will learn to communicate results with sharp visualizations and interactive, web-based dashboards. After completing the use cases, you will be able to work with business data in the R programming environment and understand how data science helps you make informed decisions and develop business strategy. Along the way, you will find helpful tips about R and Business Intelligence.

Chapter 2 - Data Cleaning


The following table describes some other useful functions for Chapter 2, Data Cleaning:

| Function | Description |
| --- | --- |
| as.character(); as.factor(); as.logical() | These functions perform type conversion, like as.numeric() seen in Chapter 2, Data Cleaning. |
| class() | This is used on an R object to determine the class, or data type, that the object is stored as. |
| Encoding() | This reveals the encoding declared for the elements of a character vector. Note the capital E in the function name. |
| grepl(); grep() | These base R functions use regular expressions to find matches of a search pattern. grepl() returns a logical vector, while grep() returns the positions of the matching elements. |
| gsub() | This base R function performs global substitution, replacing every match of a pattern with another string. |
| impute(x, fun = mean) | This imputes the missing values in a vector using the average of the existing values. A function of this form is available in the Hmisc package. It is used with x[is.na(x)] to isolate missing values. |
| install.packages() | This downloads and installs packages that are not included in base R from CRAN-like repositories. |
| is.null() | This tests whether an object is NULL, which is R's representation of an empty object. NULL differs from NA, which marks a missing value within a vector. |
| nrow(); ncol() | These are used to determine the number of rows or columns in a data frame. |
| read.table() | This is the underlying function of the other base R read functions, such as read.csv() and read.delim(). It is very flexible and is useful in tailoring how you read in data. |
| readLines() | This is useful for reading raw text data that is not in a tabular format. Note the capital L in the function name. |
| strsplit() | This is useful when manipulating text data. It splits a string into substrings based on a separator value that you provide. |
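
The following is a minimal sketch of how several of the functions in this table can work together in a small cleaning task. It is an illustration rather than code from the book: the records, the sales data frame, and its column names are hypothetical, and a text connection stands in for a file on disk. The sketch uses base R only, so the mean imputation is written with the x[is.na(x)] idiom instead of a call to impute().

```r
# Hypothetical raw records, invented for illustration only.
# A text connection stands in for a file so the sketch is self-contained.
con <- textConnection(c("2016-01-04, web ,12",
                        "2016-01-05,store,NA",
                        "2016-01-06,web,7"))
raw <- readLines(con)   # read raw, non-tabular text one line at a time
close(con)

# strsplit() breaks each record into fields on the comma separator.
fields <- strsplit(raw, split = ",")

# Assemble the fields into a data frame of character columns.
sales <- data.frame(
  date    = sapply(fields, `[`, 1),
  channel = sapply(fields, `[`, 2),
  units   = sapply(fields, `[`, 3),
  stringsAsFactors = FALSE
)

nrow(sales)           # number of observations
ncol(sales)           # number of variables
class(sales$units)    # "character", because everything was read as text

# gsub() replaces every space with an empty string to tidy the labels.
sales$channel <- gsub(" ", "", sales$channel)

# grepl() gives a logical vector; grep() gives the matching positions.
is_web   <- grepl("^web$", sales$channel)
web_rows <- grep("^web$", sales$channel)

# Type conversion: text to numbers, and labels to a factor.
sales$units   <- as.numeric(sales$units)
channel_fctr  <- as.factor(sales$channel)

# Mean imputation: x[is.na(x)] isolates the missing values.
sales$units[is.na(sales$units)] <- mean(sales$units, na.rm = TRUE)

# is.null() tests for an object that does not exist at all,
# such as a column that was never created.
has_no_region <- is.null(sales$region)
```

In this sketch, the missing value is an NA inside an existing column, so is.na() finds it. The absent region column is NULL, which is why is.null() is the right test for it.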