Real Time Analytics with SAP HANA

By: Vinay Singh

Overview of this book

SAP HANA is an in-memory database created by SAP. SAP HANA breaks traditional database barriers to simplify IT landscapes, eliminating data preparation, pre-aggregation, and tuning. SAP HANA and in-memory computing allow you to instantly access huge volumes of structured and unstructured data, including text data, from different sources. Starting with data modeling, this fast-paced guide shows you how to add a system to SAP HANA Studio and create a schema, packages, and a delivery unit. Moving on, you'll get an understanding of real-time replication via SLT and learn how to use SAP HANA Studio to perform it. We'll also take a quick look at SAP BusinessObjects Data Services and SAP Direct Extractor Connection (DXC) for data loading. After that, you will learn to create HANA artifacts such as Analytical Privileges and Calculation Views. At the end of the book, we will explore the Smart Data Access option and the Application Function Library (AFL), and finally deliver pre-packaged functionality that can be used to build information models faster and more easily.
Table of Contents (16 chapters)
Real Time Analytics with SAP HANA
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Row and column storage in SAP HANA


Relational databases typically use row-based data storage. SAP HANA uses both row-based and column-based data storage:

  • The row storage: This stores records in a sequence of rows

  • The column storage: This stores the entries of a column in contiguous memory locations
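The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration of the idea, not how SAP HANA lays out memory internally; the three-column table and the function names are hypothetical.

```python
# Hypothetical three-column table (id, name, amount), used only for illustration.
records = [
    (1, "Alice", 300),
    (2, "Bob", 150),
    (3, "Carol", 450),
]


def to_row_layout(rows):
    """Row storage: all fields of one record sit next to each other."""
    flat = []
    for row in rows:
        flat.extend(row)
    return flat


def to_column_layout(rows):
    """Column storage: all entries of one column sit next to each other."""
    flat = []
    for column in zip(*rows):  # zip(*rows) transposes records into columns
        flat.extend(column)
    return flat


row_layout = to_row_layout(records)
column_layout = to_column_layout(records)

print(row_layout)
print(column_layout)
```

In the row layout, the values of one record are neighbors; in the column layout, the values of one attribute are neighbors, which is what makes scanning a single attribute cheap.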

Before getting into an SAP HANA-specific discussion, let's try to understand how column storage differs from row storage. Column-oriented database systems (in our case, SAP HANA) perform better than traditional row-oriented database systems on analytical tasks, in areas such as data warehouses, decision support, predictive analysis, and business intelligence applications.

The major reason behind this performance difference is that column stores are more I/O efficient for read-only queries: they only have to read the attributes accessed by a query from disk or memory, whereas a row store reads entire records.
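As a rough sketch of this I/O argument, the following Python snippet counts how many values each layout has to touch to sum a single attribute. The table, the column names, and the counting model (one "read" per value touched) are assumptions made for illustration; a real database reads pages or memory blocks, not individual values.

```python
# Hypothetical sales table with four attributes per record.
COLUMNS = ("order_id", "customer", "region", "amount")
rows = [
    (1, "Alice", "EMEA", 300),
    (2, "Bob", "APJ", 150),
    (3, "Carol", "EMEA", 450),
]


def sum_amount_row_store(rows):
    """Row store: every record is fetched whole, even for one attribute."""
    amount_idx = COLUMNS.index("amount")
    total, values_read = 0, 0
    for record in rows:
        values_read += len(record)  # the whole record is touched
        total += record[amount_idx]
    return total, values_read


def sum_amount_column_store(columns):
    """Column store: only the column named in the query is touched."""
    amount = columns["amount"]
    return sum(amount), len(amount)


# Build the column-oriented representation of the same table.
columns = {name: [r[i] for r in rows] for i, name in enumerate(COLUMNS)}

row_total, row_reads = sum_amount_row_store(rows)
col_total, col_reads = sum_amount_column_store(columns)

print(f"row store:    total={row_total}, values read={row_reads}")
print(f"column store: total={col_total}, values read={col_reads}")
```

Both approaches return the same total, but under this simplified model the row store touches every attribute of every record, while the column store touches only the values of the queried attribute.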

Let's look at a few factors that optimize performance in the column storage:

  • Compression: The data stored in columns is...