Book Image

SAP HANA Cookbook

Book Image

SAP HANA Cookbook

Overview of this book

SAP HANA is a real-time applications platform that provides a multi-purpose, in-memory appliance. Decision makers in the organization can gain instant insight into business operations. Thus all the data available can be analysed and you can react to the changing business conditions rapidly to make decisions. The real-time platform not only empowers business users and top management to make decisions but also provides the capability to make decisions in real-time.A practical and comprehensive guide that helps you understand the power of SAP HANA’s real-time and in-memory capabilities. It also provides step-by-step instructions to exploit all the possible features of the SAP HANA database, enabling users to harness the full potential of this technology and its features.You will gain an understanding of real-time replications, effective data loading from various sources, how to load data, and how to create re-usable objects such as models and reports.Use this practical guide to enable or transform your business landscape by implementing SAP HANA to meet your business requirements. The book shows you how to load data from different types of systems, create models in SAP HANA, and consume data for decision-making. The book covers various tools at different stages creating models using SAP HANA Studio, and consuming data using reporting tools such as SAP BusinessObjects, SAP Lumira, and so on . This book also explains the in-depth architecture of SAP HANA to help you understand SAP HANA as an appliance, that is, a combination of hardware and software.The book covers the best practices to leverage SAP HANA’s in-memory technology to transform data into insightful information. It also covers technology landscaping, solution architecture, connectivity, data loading, and setting up the environment for modeling purpose (including setup of SAP HANA Studio).If you have an intention to start your career as SAP HANA Modeler, this book is the perfect start.  
Table of Contents (16 chapters)
SAP HANA Cookbook
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Preface
Index

Explaining IMCE and its components


The SAP in-memory computing (IMCE) engine (formerly Business Analytic Engine (BAE)) is the core engine for SAP's next generation high-performance, in-memory solutions. This is because it leverages technologies such as in-memory computing, columnar databases, massively parallel processing (MPP), and data compression to allow organizations to instantly explore and analyze large volumes of transactional and analytical data from across the enterprise in real time.

In-memory computing allows the processing of massive quantities of real-time data in the main memory of the server, providing immediate results from analyses and transactions. The SAP in-memory computing database delivers the following capabilities:

  • In-memory computing functionality with native support for row and columnar datastores providing full ACID (atomicity, consistency, isolation, and durability) transactional capabilities

  • Integrated lifecycle management capabilities and data integration capabilities to access SAP and non-SAP data sources

  • SAP IMCE Studio that includes tools for data modeling, data and life cycle management, and data security

The SAP IMCE that resides at the heart of SAP HANA is an integrated database and calculation layer that allows the processing of massive quantities of real-time data in the main memory to provide immediate results from analysis and transactions. Like any standard database, the SAP IMCE not only supports industry standards such as SQL and MDX, but also incorporates a high-performance calculation engine that embeds procedural language support directly into the database kernel. This approach is designed to remove the need to read data from the database, process it, and then write data back to the database, that is, process the data near the database and return the results.

The IMCE is an in-memory, column-oriented database technology. It is a powerful calculation engine at the heart of SAP HANA. As data resides in the Random Access Memory (RAM), highly accelerated performance can be achieved compared to systems that read data from disks. The heart lies within the IMCE, which allows us to create and perform calculations on data. SAP IMCE Studio includes tools for data modeling activities, data and life cycle management, and also tools that are related to data security. The following diagram shows the complete architecture of SAP HANA, including IMCE and how it is connected to different components:

The following diagram shows the components of IMCE alone:

Further reading

SAP HANA database has the following two engines:

  • Column-based store: This engine stores the huge amounts of relational data in column-optimized tables, which are aggregated and used in analytical operations.

  • Row-based store: This engine stores the relational data in rows, similar to the storage mechanism of traditional database systems. The row store is more optimized for write operations and has a lower compression rate. Also, the query performance is lower when compared to the column-based store.

The engine that is used to store data can be selected on a per-table basis at the time of creating a table. Tables in the row-based store are loaded at start up time. In the case of column-based stores, tables can be either loaded at start up or on demand, that is, during normal operation of the SAP HANA database.

Both engines share a common persistence layer, which provides data persistency that is consistent across both engines. Like a traditional database, we have page management and logging in SAP HANA. The changes made to the in-memory database pages are persisted through savepoints. These savepoints are written to those data volumes on the persistent storage for which the storage medium is hard drives. All transactions committed in the SAP HANA database are stored/saved/referenced by the logger of the persistency layer in a log entry written to the log volumes on the persistent storage. To get high I/O performance and low latency, log volumes use the flash technology storage.

The relational engines can be accessed through a variety of interfaces. The SAP HANA database supports SQL (JDBC/ODBC), MDX (ODBO), and BICS (SQLDBC). The calculation engine performs all the calculations in the database. No data moves into the application layer until calculations are completed. It also contains the business functions library that is called by applications to perform calculations based on the business rules and logic. The SAP HANA-specific SQL script language is an extension of SQL that can be used to push down data-intensive application logic into the SAP HANA database for specific requirements.

Session management

This component creates and manages sessions and connections for the database clients. When a session is created, a set of parameters are maintained in the backend by the system. These parameters include auto-commit settings and the current transaction isolation level. After establishing a session, database clients communicate with the SAP HANA database using SQL statements. SAP HANA database treats all the statements as transactions while processing them. Each new session created will be assigned to a new transaction.

Transaction manager

The transaction manager is the component that coordinates database transactions, takes care of controlling transaction isolation, and keeps track of running and closed transactions. The transaction manager informs the involved storage engines about the running or closed transactions, so that they can execute necessary actions when a transaction is committed or rolled back. The transaction manager cooperates with the persistence layer to achieve atomic and durable transactions.

Atomicity is one of the ACID transaction properties. In an atomic transaction, when a series of database operations are present, either all occur or nothing occurs. A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright.

Durability guarantees the transactions that have committed will survive permanently. For example, if a flight booking software reports that a seat has successfully been booked, then the seat will remain booked even if the system crashes.

The client requests are analyzed and executed by a set of components summarized as request processing and execution control. The client requests are analyzed by a request parser, and then it is dispatched to the responsible component. The transaction control statements are forwarded to the transaction manager. The data definition statements are sent to the metadata manager. The object invocations are dispatched to the object store. The data manipulation statements are sent to the optimizer, which creates an optimized execution plan that is given to the execution layer.

The SAP HANA database also has built-in support for domain-specific models (such as for financial planning domain) and it offers scripting capabilities that allow application-specific calculations to run inside the database. It has its own scripting language named SQLScript that is designed to enable optimizations and parallelization. This SQLScript is based on side-effect free functions that operate on tables by using SQL queries for set processing.

The SAP HANA database also contains a component called the planning engine that allows financial planning applications to execute basic planning operations in the database layer. For example, while applying filters/transformations, a new version of a dataset will be created as a copy of an existing one. An example of planning operation is the disaggregation operation. In this operation, the target values from higher to lower aggregation levels are distributed based on a distribution function.

Metadata manager

Metadata manager helps to access metadata. SAP HANA database's metadata consists of a variety of objects, such as definitions of tables, views and indexes, SQLScript function definitions, and object store metadata. All these types of metadata are stored in one common catalog for all the SAP HANA database stores. Metadata is stored in tables in the row store. The SAP HANA features such as transaction support and multi-version concurrency control (MVCC) are also used for metadata management. Central metadata is shared across the servers in the case of a distributed database systems. The background mechanism of metadata storage and sharing is hidden from the components that use the metadata manager.

As row-based tables and columnar tables can be combined in one SQL statement, both the row and column engines must be able to consume the intermediate results. The main difference between the two engines is the way they process data: the row store operators process data in a row-at-a-time fashion, whereas column store operations (such as scan and aggregate) require the entire column to be available in contiguous memory locations. To exchange intermediate results created by each other, the row store provides results to the column store. The result materializes as complete rows in the memory, while the column store can expose results using the iterators interface needed by the row store.

Persistence layer

The persistence layer is responsible for durability and atomicity of transactions. The persistent layer ensures that the database is restored to the most recent committed state after a restart, and makes sure that transactions are either completely executed or completely rolled back. To achieve this in an efficient way, the persistence layer uses a combination of write-ahead logs, shadow paging, and savepoints. Moreover, the persistence layer also offers interfaces for writing and reading data. It also contains SAP HANA's logger that manages the transaction log.

Authorization manager

The authorization manager is invoked by other SAP HANA database components to check the required privileges for users to execute the requested operations. Privileges to other users or roles can be granted. A privilege grants the right to perform a specified operation (such as create, update, select, and execute data manipulation languages) on a specified object such as a table, view, and SQLScript function. Analytic privileges represent filters or hierarchy, and they drill down limitations for analytic queries. Analytic privileges such as granting access to values with a certain combination of dimension attributes are supported in SAP HANA. Users are authenticated either by the SAP HANA database itself (log in with username and password), or authentication can be delegated to external authentication providers third-party such as an LDAP directory.

For more information, you can refer to the SAP HANA at the following links:

In-Memory Computing

http://scn.sap.com/people/vitaliy.rudnytskiy/blog/2011/03/22/time-to-update-your-sap-hana-vocabulary