Mastering SAS Programming for Data Warehousing

By: Monika Wahi

Overview of this book

SAS is used for various functions in the development and maintenance of data warehouses, thanks to its reputation for being able to handle ’big data’. This book will help you learn the pros and cons of storing data in SAS. As you progress, you’ll understand how to document and design extract-transform-load (ETL) protocols for SAS processes. Later, you’ll focus on how SAS arrays and macros can help standardize ETL. The book will also help you examine approaches for serving up data using SAS and explore how connecting SAS to other systems can enhance the data warehouse user’s experience. By the end of this data management book, you will have a fundamental understanding of the roles SAS can play in a warehouse environment, and be able to choose wisely when designing data warehousing processes that involve SAS.

Table of Contents (18 chapters)

Section 1: Managing Data in a SAS Data Warehouse
Section 2: Using SAS for Extract-Transform-Load (ETL) Protocols in a Data Warehouse
Section 3: Using SAS When Serving Warehouse Data to Users

Summary

This chapter described strategies and considerations for developing code to read big data into SAS. The SAS native data file formats, *.SAS7bdat and XPT, contain metadata that makes reading them into SAS easy and accurate. The downside is that these datasets can take up a lot of storage space.

LIBNAME statements in SAS point to external locations for reading and storing data. Using LIBNAME statements, the SAS user can convert *.csv and *.txt files to *.SAS7bdat datasets for storage, and can also convert them to XPT. Storing data in *.csv and *.txt format can conserve space, but reading these files often requires specialized infile code so that the data is properly formatted in SAS. The automation in PROC IMPORT can help with this. Even so, when large raw data files are transferred regularly, long and detailed infile code usually has to be developed and maintained.
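
As a rough illustration of the workflow summarized above, the following sketch reads a comma-delimited file with explicit infile code, stores it as a native dataset through a LIBNAME, and copies it to an XPT transport file. Every path, libref, dataset name, and variable name here (`wh`, `xptout`, `claims`, `claim_id`, `svc_date`, `amount`) is a hypothetical placeholder rather than something taken from the book, and the raw file is assumed to have one header row and ISO-style dates.

```sas
/* All paths and names below are hypothetical; adjust to your environment. */
libname wh "/data/warehouse";                      /* folder holding native datasets */
libname xptout xport "/data/transfer/claims.xpt";  /* XPT transport file             */

/* Explicit infile code: column types and informats are spelled out by hand. */
data wh.claims;
    infile "/data/raw/claims.csv" dsd firstobs=2 truncover;  /* skip header row */
    input claim_id :$10.          /* character, up to 10 bytes         */
          svc_date :yymmdd10.     /* date written as YYYY-MM-DD        */
          amount;                 /* numeric                           */
    format svc_date date9.;
run;

/* Copy the stored dataset into the XPT transport file. */
proc copy in=wh out=xptout memtype=data;
    select claims;
run;

/* Alternative: let PROC IMPORT guess the layout instead of coding it by hand. */
proc import datafile="/data/raw/claims.csv"
            out=wh.claims_auto
            dbms=csv
            replace;
    getnames=yes;
    guessingrows=max;   /* scan all rows before choosing types and lengths */
run;
```

The dataset and variable names are kept to eight characters or fewer because the XPORT engine writes the older transport format, which restricts name lengths. The PROC IMPORT step trades control for convenience: it picks types and lengths from the data it scans, which is why hand-written infile code tends to be preferred for files that arrive on a regular schedule.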

Because of SAS's ability to handle...