Big Data Analytics with SAS

Overview of this book

SAS has been recognized by Money Magazine and Payscale as one of the top business skills to learn in order to advance one's career. Through innovative data management, analytics, and business intelligence software and services, SAS helps customers solve their business problems by allowing them to make better decisions faster. This book introduces you to SAS and shows how to use it to analyze data of any size efficiently, including Big Data. You will learn how to prepare data for analysis; perform predictive, forecasting, and optimization analyses; and then deploy or report on the results. While working through the coding examples in this book, you will learn how to use the browser-based SAS Studio and IPython/Jupyter Notebook interfaces for working with SAS. Finally, you will learn how SAS's architecture is designed to scale up and/or out, and how it can be combined with open source offerings such as Hadoop, Python, and R. By the end of this book, you will understand how to analyze Big Data efficiently using SAS.
Table of Contents (17 chapters)

Analytics


Analytics starts with simply understanding more about the data you are going to work with: how many variables are character versus numeric, and the attributes of each variable, such as its length, number of missing values, and number of unique values. Advanced analytics includes data mining, forecasting, and/or optimization. Data mining involves both descriptive and predictive analytics, which can be used to segment entities into like groups, describe the characteristics of those groups, and provide a likelihood score from 0 to 100% of an event or type of behavior for each entity being analyzed. Forecasting tends to use a different set of mathematical algorithms to help determine the number of entities that will be needed within some future time range and/or when an event may occur in the future, with a confidence rating between 0 and 100% on the results.
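
The first step described above, profiling each variable, can be sketched in a few lines. This is a minimal illustration in plain Python rather than SAS code from the book; the function name `profile_columns`, the sample `customers` table, and the rule that `None` or an empty string counts as missing are all assumptions made for the example.

```python
def profile_columns(rows):
    """Summarize each column of a table given as a list of dicts.

    Hypothetical helper for illustration: reports whether a column is
    numeric or character, its longest value, and its missing and
    unique value counts.
    """
    columns = list(rows[0].keys()) if rows else []
    profile = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Assumption: None and the empty string both count as missing.
        present = [v for v in values if v is not None and v != ""]
        is_numeric = bool(present) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in present
        )
        profile[col] = {
            "type": "numeric" if is_numeric else "character",
            "max_length": max((len(str(v)) for v in present), default=0),
            "missing": len(values) - len(present),
            "unique": len(set(present)),
        }
    return profile


# Made-up sample data, not taken from the book.
customers = [
    {"name": "Ann", "region": "East", "spend": 120.5},
    {"name": "Bob", "region": "", "spend": None},
    {"name": "Cy", "region": "East", "spend": 80},
]

summary = profile_columns(customers)
for column, stats in summary.items():
    print(column, stats)
```

Each entry in `summary` holds the attributes the paragraph lists for one variable, which is the kind of overview usually reviewed before moving on to data mining or forecasting.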

Optimization provides algorithms that help determine the maximum or minimum values of complex...