Many experienced developers are not aware that moving BW to HANA can in some cases result in slower transformations during data loads. Thankfully, SAP provides a tool to identify areas in your ABAP code that may benefit from optimization. In this blog we take a quick look at this valuable tool.
By Dr. Berg
During development of SAP BW Extract, Transform, and Load (ETL) processes, there are many options for creating transformations in DTPs in BW 7.x and in the older Update and Transfer rules in BW 3.x. Unfortunately, depending on your developers, some of the custom transformations may contain sub-optimal ABAP code that affects how SAP BW ETL performs after the migration to HANA.
Thankfully, SAP’s talented developer Marc Bernard provides an automated tool to help you identify potential coding issues. This program is called the “SAP NetWeaver BW ABAP Routine Analyzer”, or technically ZBW_ABAP_ANALYZER.
Figure 1: Input Screen for SAP NetWeaver BW ABAP Routine Analyzer
Overall, the tool looks for and suggests the following fixes (two of these are sketched in ABAP after the list):
a. SELECT FOR ALL ENTRIES (SFAE) statements without HANA DB hints --> add hints
b. SELECT * --> specify the fields to select
c. Database access in field routines --> move it to the start routine
d. Loops that do not assign field symbols --> use field symbols
e. Selects from master data tables --> use the read master data rule
f. Selects from DSOs --> use the read DSO rule
g. Direct updates to BW object tables --> do not update tables directly
h. and many more areas...
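To make findings (b) and (d) concrete, here is a minimal before/after sketch in ABAP. The structure, field, and table names are invented for illustration; the pattern is what the analyzer is steering you toward:

```abap
* Illustrative types and data - all names here are made up for the sketch
TYPES: BEGIN OF ty_sales,
         doc_number TYPE n LENGTH 10,
         amount     TYPE p LENGTH 15 DECIMALS 2,
       END OF ty_sales.
DATA: lt_sales TYPE STANDARD TABLE OF ty_sales,
      ls_sales TYPE ty_sales,
      lv_rate  TYPE p LENGTH 8 DECIMALS 5 VALUE '1.10000'.
FIELD-SYMBOLS: <fs_sales> TYPE ty_sales.

* Before (flagged by the analyzer): each row is copied into a work
* area and written back with MODIFY - two copies per row
LOOP AT lt_sales INTO ls_sales.
  ls_sales-amount = ls_sales-amount * lv_rate.
  MODIFY lt_sales FROM ls_sales.
ENDLOOP.

* After: assign a field symbol and change the row in place - no copies
LOOP AT lt_sales ASSIGNING <fs_sales>.
  <fs_sales>-amount = <fs_sales>-amount * lv_rate.
ENDLOOP.

* The same idea applies to finding (b): instead of
*   SELECT * FROM /bic/azsales00 INTO TABLE lt_sales.
* list only the fields the routine actually needs:
*   SELECT doc_number amount FROM /bic/azsales00
*     INTO CORRESPONDING FIELDS OF TABLE lt_sales.
* (/bic/azsales00 is a hypothetical active DSO table name)
```

The field-symbol version matters most in transformations that loop over large data packages, since the per-row copy and MODIFY overhead is paid millions of times.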
You can run this program either online or as a background job. If you run it online (recommended only for a small selection), you can click a link in the output to edit the transformation, transfer rule, or update rule.
Figure 2: On-Line Output for SAP NetWeaver BW ABAP Routine Analyzer
If you run it in the background, you can save a log to SLG1 and also create a spool list for further analysis.
Figure 3: SAP NetWeaver BW ABAP Routine Analyzer - Spool List
For organizations with slow data loads where most of the time is spent on transformations, this is a key program that should be run as part of BW-on-HANA migration planning. The tool should be available for general access in April this year.
Over the last six weeks in this HANA blog series we have looked at many of the HANA migration tools that Marc and SAP have provided, and I continue to be amazed at both the quality and the speed at which SAP is rolling out tools to make any HANA migration as easy as possible.
In the next blog we will take an in-depth look at some of the ways you can balance the load across the nodes of a HANA system when working with a scale-out environment.
Another installment in the series of BW performance tuning ideas. While not comprehensive, it is intended to increase the knowledge of BW developers with hints, suggestions, explanations, and references to SAP Notes that allow developers to improve BW performance. In this session we look at five different areas.
By Dr. Berg
NOTE: I strongly recommend the book "SAP NetWeaver Business Warehouse: Administration and Monitoring" (2010 edition) by Olaf Klostermann and Milco Österholm; it is 590 pages of really useful information.
If you want to see how your BW system is set up, go to SA38 and run the report RSPARAM; it will show you all the settings. Just place the cursor on a field in the report and press F1 to get the details.
Buffering and slow log-file writes
If you see very long log-file-sync times, you should buffer dimensional IDs (DIMIDs), system IDs (SIDs), and hierarchy IDs (HIEIDs) to decrease the time the system spends committing individual records to the NRIV table and syncing the log files. Go to SNRO, choose Edit -> Set Buffering -> Main memory, and try setting the buffer to a higher number (normal is up to 1,000, but for a very high number of IDs you can set it up to 10,000). With a buffer of 1,000, for example, the system needs only one NRIV round trip per 1,000 new IDs instead of 1,000 individual commits. Another idea is to split the log and data files onto different RAID sets. This may be hard to do on a Storage Area Network (SAN), but for systems with over 40-60 ms per sync, moving the log files can make a significant performance difference (PS: the log-file sync times are listed in a separate section of your EarlyWatch report).
When you flag a dimension as "high cardinality", SAP BI uses a b-tree index instead of a bitmap index. This can be substantially slower if high cardinality does not actually exist in the data. Consider flagging a dimension as high cardinality only if a substantial number of records exists (more than 8-10 million) or if the dimension is substantially unbalanced (i.e., 30% or more), not the 20% we taught in classes a few years ago...
I recommend that developers start looking at queries by using the RSRT transaction and clicking on the "Performance" tab for the list of queries they want to examine. Also use the RSRV checks to see whether database patches are current, aggregates are recommended, indexes and statistics are up to date, and much more. It is a great place to start the analysis. Also, familiarize yourself with all the other performance information in RSRT.
To avoid a memory overflow, parallel processing is cancelled as soon as the collected result contains 30,000 rows or more and at least one sub-process is still incomplete. The MultiProvider query is then restarted automatically and processed sequentially. What appears to be parallel processing is actually sequential processing plus the startup phase of parallel processing. You can change the default value of 30,000 rows, or raise the size of the result set (MPRO_MAX_RESULT) above 5 MB, for queries with large result sets that hit MultiProviders with more than one InfoCube and/or DSO (see SAP Notes 629541, 622841, 607164, and 630500).
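As a sketch of how such an RSADMIN parameter can be maintained: SAP delivers the report SAP_RSADMIN_MAINTAIN for this, normally run directly in SE38. The selection-screen parameter names and the example value below are assumptions on my part; check the SAP Notes above for the exact parameter name and unit in your release:

```abap
* Hedged sketch: maintain the RSADMIN entry from a small driver program.
* OBJECT and VALUE are the commonly documented selection parameters of
* SAP_RSADMIN_MAINTAIN, but verify them (and the insert/update option on
* the selection screen) in SE38 before relying on this.
SUBMIT sap_rsadmin_maintain
  WITH object = 'MPRO_MAX_RESULT'
  WITH value  = '10000000'   "example: ~10 MB, assuming the unit is bytes
  AND RETURN.
```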
A cool feature in SAP_BASIS Support Package 12 is the new DBA Cockpit (transaction DBACOCKPIT). It has more functions than before, and you no longer need to use the 'old' transaction codes: DB02 (space management), DB12 (backup), DB13/DB13C (planning calendar), DB24 (jobs), and ST04 (performance) are all available in the new DBA Cockpit. While I remember: make sure you also upgrade to the latest kernel patch before using the cockpit, or you may get a bunch of error messages (programs missing).
PS: you will probably have to change to the new central scheduling calendar (you can migrate using the DB13C transaction).
In this demo I show you step by step how to load non-SAP data from files into new SAP HANA tables using SAP DataServices as the ETL tool. We also look at how to browse the HANA system tables, display data, use the admin console, and explore HANA views and the Information Modeler.
By Dr. Berg (the co-author of the new SAP HANA book from SAP Press)
We start the demo with a quick tour of SAP HANA. Then we create three tables (customer, products, and sales) and load them using SAP DataServices.
Please click on the demo and select HD resolution in the settings at the bottom. The demo also has audio.
The core of the new optional SAP HANA-optimized InfoCube is that when you assign characteristics and/or key figures to dimensions, the system does not create any dimension tables except for the package dimension.
Instead, the master data identifiers (SIDs) are simply written in the fact table and the dimensional keys (DIM IDs) are no longer used, resulting in faster data read execution and data loads. In short, dimensions become logical units instead of physical data tables. The logical concept of 'dimensions' is used only to simplify the query development in BEx Query Designer.
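Conceptually, the change to the fact table looks like the sketch below. All names are invented for illustration; the real generated tables follow BW's own naming conventions:

```abap
* Classic InfoCube fact table: DIM IDs that join to dimension tables
TYPES: BEGIN OF ty_fact_classic,
         key_zsalesp TYPE i,  "package dimension DIMID
         key_zsales1 TYPE i,  "customer dimension DIMID -> dimension table
         amount      TYPE p LENGTH 15 DECIMALS 2,
       END OF ty_fact_classic.

* HANA-optimized fact table: master data SIDs are written directly,
* so the dimension-table join disappears
TYPES: BEGIN OF ty_fact_hana,
         key_zsalesp   TYPE i, "package dimension remains physical
         sid_0customer TYPE i, "SID stored straight in the fact table
         amount        TYPE p LENGTH 15 DECIMALS 2,
       END OF ty_fact_hana.
```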
It is important to note that since the physical star-schema tables change during the HANA optimization, any custom-developed program that accesses InfoCube tables directly, instead of going through the standard interfaces, will have to be rewritten. This applies, for example, to companies using ODBC directly at the database level.
However, since very few companies have ventured into this area, the optional HANA InfoCube conversion will have little impact on most organizations, apart from providing faster InfoCube performance.
To convert existing InfoCubes, simply run the program RSDRI_CONVERT_CUBE_TO_INMEMORY and select the InfoCubes you want to convert. The job is executed in the background as a stored procedure and is extremely fast: typically 10-20 minutes even for very large InfoCubes with hundreds of millions of rows. During the conversion, users can even query the InfoCubes; however, data loads must be suspended. Currently, traditional InfoCubes with a maximum of 233 key figures and 248 characteristics can be converted to HANA-optimized InfoCubes.
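If you want to script the conversion rather than start it from SA38, a hedged sketch follows. The selection-screen parameter name (p_cube) and the InfoCube name are assumptions for illustration only; check the report's actual selection screen in SE38 first:

```abap
* Hedged sketch: start the conversion report for one InfoCube.
* P_CUBE is an assumed parameter name; ZSALES01 is a hypothetical cube.
SUBMIT rsdri_convert_cube_to_inmemory
  WITH p_cube = 'ZSALES01'
  AND RETURN.
```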
After the conversion, HANA-optimized InfoCubes are maintained in the SAP HANA database's column store and are assigned a logical index (CalculationScenario). However, if an InfoCube was stored only in BWA before the conversion, it is set to inactive during the conversion, and you will need to re-activate it and reload the data if you want to use it.
Do I Need InfoCubes In HANA?
Many have asked whether InfoCubes are needed in a HANA system, and there is currently significant debate on blogs and forums on this topic. However, for the interim period there are several reasons why InfoCubes are needed:
First, transactional InfoCubes are needed for Integrated Planning and write-back scenarios. InfoCubes are also needed to store and manage non-cumulative key figures, and the RSDRI write interface works only for InfoCubes. In addition, the transition from SAP BW to HANA is simplified by allowing customers to move to the new platform without having to rewrite application logic, queries, MultiProviders, and the data transformations from DSOs to InfoCubes.
However, the continued use of InfoCubes has to be questioned. The purpose of introducing star schemas, snowflakes, and other Dimensional Data Modeling (DDM) techniques in the 1990s was to reduce costly table joins in relational databases, while avoiding the redundancy of data stored in first normal form (1NF) in Operational Data Stores (ODSs).
By removing the traditional relational database, HANA's in-memory processing makes most of the benefits of DDM moot, and the continued use of these structures is questionable. In the future, we may see multi-layered DSOs with different data retention and granularity instead. But for now, InfoCubes will serve a transitional data storage role for most companies.