Optimizing the SAP BW Solution Using SAP Data Services 4.0 and Preparing for an In-Memory Database Solution such as HANA

Expedien and Kennametal Present – Optimizing SAP BW Solution using SAP DS 4.0 and In Memory DB solution such as HANA


HANA – Next Wave of High Performance BI

HANA stands for High-Performance Analytic Appliance. HANA is what is called an in-memory appliance: it loads substantial amounts of data from traditional disk storage into main memory, which allows data retrieval and logical processing to happen in memory at very high speed. SAP’s In-Memory Appliance (SAP HANA) enables organizations to instantly explore and analyze all of their transactional and analytical data from virtually any data source in near real time. Delivered on optimized hardware, HANA enables the efficient processing and analysis of massive amounts of data by packaging SAP’s use of in-memory technology, columnar database design, data compression, and massive parallel processing together with essential tools and functionality (e.g., data replication and analytic modeling), business content, and the SAP BusinessObjects Business Intelligence (SAP BusinessObjects BI) solutions. In-memory computing holds data in RAM instead of reading it from disk, providing a performance boost. HANA, which SAP launched last year, can tap data from both SAP and non-SAP sources, and the company has also started rolling out a series of specialized applications aimed at specific business problems.

How can you leverage HANA?

  • For organizations that do not want to disrupt their existing Business Intelligence setup, HANA can be deployed as a high-performance “side-by-side” data mart to provide “real-time” reporting and analytics. Existing BW customers can deploy HANA (with its BAE component) in a “BWA for BW” mode and receive in-memory acceleration equivalent to a BWA appliance.
  • HANA can be leveraged as a high-performance “side-by-side” data mart to your existing data warehouse, or it can take the place of a data warehouse (DW).
  • HANA can be deployed as the in-memory acceleration engine for the accelerated version of BusinessObjects Explorer.
  • HANA can replace your existing back-end data warehouse database system.

The benefits of HANA to the SAP customer base are significant. HANA will allow real-time decision making in areas where it was never possible before. One example given is that the CEO of a big company like SAP can get information about any SAP sales pursuit, for any product in any part of the globe, almost in real time (in seconds) on his iPad.


Starting a data governance program…

Enterprise Data Management (EDM) refers to the ability of an organization to precisely define, easily integrate, and effectively retrieve data for both internal applications and external communication. EDM emphasizes data correctness, consistency, precision, granularity, and meaning, and is concerned with how content is integrated into business applications as well as how it is passed along from one business process to another.

Data governance is the crux of any enterprise data management strategy. Data governance refers to the overall management of the availability, usability, integrity, and security of the data employed in an enterprise. A sound data governance program includes a governing body or council, a defined set of procedures, and a plan to execute those procedures. In practical terms, that means putting personnel, policies, procedures, and organizational structures in place to make data accurate, consistent, secure, and available to accomplish business goals. Effective data governance makes meaningful and correct data available to the business, and hence makes business processes more efficient by saving money, allowing re-use of data, and supporting enterprise analytics. However, data governance requires more than just a few members of the IT staff with a project plan. It requires the participation and commitment of both IT and business management, as well as senior-level executive sponsorship and active consultation with the various business communities of interest.

In my last company, data governance was planned, managed, and implemented through a three-level structure:

  • The Executive Data Governance Council provides strategic direction, ensuring that data governance efforts address all relevant and mission-critical needs of the enterprise. It manages data governance as an integrated program rather than as a set of unconnected projects.
  • The Strategic Data Governance Steering Committee carries out plans and policies to implement guidance from the Executive Data Governance Council. It prioritizes data governance efforts and communicates with stakeholders, users, and other communities of interest.
  • The Tactical Data Governance Working Group implements the plans and policies developed by the governance bodies above, and analyzes and resolves any tactical problems that arise.

Communication is very important for successful data governance. For a data governance program to succeed, the governing bodies and implementation team(s) must tell stakeholders what steps are being taken and why, must inform all relevant communities of interest about how data governance will benefit them, and must listen to stakeholders and communities of interest so that their ideas and feedback are incorporated into the program. That input and feedback makes governance efforts more effective in achieving mission-critical goals and is vital for successful data governance.

A data governance program needs continued interest and participation from the business. Data owners should come from the business side, not from IT, and they should be able to demonstrate the need for the program, the business value of the data, and the ROI achieved by data governance.


A glance at SAP data migration methods….

What are the various methods available for SAP data migration? I studied a few ongoing, prominent SAP data migration projects and had a discussion with our Data Migration team. As I understand it, there are three popular methods for migrating data from legacy systems and/or an old SAP R/3 system to a new SAP ECC system.

  • SAP Best Practices – pre-built content based on SAP Data Services (ETL) that primarily uses IDOCs to load data into SAP.
  • LSMW – a utility from SAP that uses flat files to load data into SAP.
  • Custom Developed Programs – use SAP BDC programs and flat files.

Each method has its advantages and disadvantages. I will discuss what I know about these methods, the advantages and disadvantages of one method versus another, the challenges clients have faced with each of them, and so on. In this blog, I will talk about SAP Best Practices. In subsequent posts, I will cover LSMW and Custom Developed Programs, along with their advantages, disadvantages, and challenges.

SAP Best Practices Method

Let’s talk about data migration from legacy (non-SAP) systems to an SAP system. This includes new SAP customers as well as current customers who are bringing in new plants, new business units, etc., and need to convert data into an SAP ECC system. SAP Information Lifecycle Management (ILM) is used for system decommissioning or for data retention and archival; it is beyond the scope of this discussion.

This method loads data into SAP primarily via IDOCs. SAP acquired the Business Objects tools Business Objects Data Integrator (ETL) and Data Quality (Firstlogic) and bundled them together under a new name, “SAP Data Services”. The core strength of Business Objects Data Services, earlier known as Business Objects Data Integrator or Acta ETL, has been its tight integration with SAP. This ETL tool has been used primarily for SAP data extraction since its inception around 1998; I have seen the evolution of the tool from Acta 1.1 to SAP Data Services XI 4.x. Some other Business Objects software is also used in migration, such as Data Insight (a data profiling tool) and Metadata Manager (these two tools are now known as Information Steward), plus some reports, but SAP Data Services is where the bulk of the work takes place. For those who don’t know: Business Objects acquired a company called Acta Technology around 2002, and SAP acquired Business Objects in 2007. Business Objects renamed the Acta ETL to Business Objects Data Integrator after the Acta acquisition, and SAP later renamed it SAP Data Services.

Acta also offered SAP Rapid Marts. Rapid Marts are pre-packaged, out-of-the-box Acta ETL code and target database schemas, based on Oracle or SQL Server, for extracting data from various SAP modules such as SD, IM, FI, CO, GL, HR, and so on. The value proposition of Rapid Marts has been that they give SAP customers a jump start in getting data out of SAP quickly. Customers are generally able to leverage 65-70% of the out-of-the-box Rapid Mart content as is. The remaining content can be easily customized to the customer’s SAP configuration; this generally entails adding or deleting fields in Rapid Mart tables, extracting SAP custom tables if any, and so on. These Rapid Marts are now standard SAP data mart offerings based on SAP Data Services.

SAP has developed similar out-of-the-box SAP Data Services ETL content for data migration into SAP, based on standard SAP ECC master data structures. This is called the Best Practices (BP) content for Data Migration. It is also known as SAP AIO BP, which is simply short for “SAP Business All-in-One” Best Practices. It can be confusing to see so many new SAP terms, but don’t let that scare you; SAP is a pioneer in coming up with new buzzwords, yet the core content remains more or less the same behind the scenes.

The BP content for Data Migration can be found under Data Migration, Cross-Industry Packages in the Best Practices section of the SAP Help portal. This content has everything you need to get started on migrating non-SAP data into an SAP system. It includes guides for installing SAP Data Services and the other components required for the migration, the actual content to load (including jobs that load data into SAP via IDOCs), mapping tools to help you map the non-SAP data to the IDOC structures, and some reports. It covers IDOC mappings and structures for objects such as Material Master, Vendor Master, Customer Master, Pricing, BOM, Cost Element, and Payables and Receivables. There are detailed Word documents on each piece of content; for example, the document on Material is 39 pages and covers the IDOC structures, what you need to know, and how to map data to the structure.
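To make the mapping idea concrete, here is a minimal sketch, in plain Python, of the kind of field-by-field mapping the BP content performs when legacy data is shaped into an IDOC-style structure. The segment and field names are illustrative placeholders I have invented for this example; they are not the actual IDOC definitions delivered with the BP content.

    # Illustrative sketch only: map one legacy material record into a
    # simplified, IDOC-like nested structure. Segment and field names are
    # placeholders, not the delivered SAP material master IDOC definition.
    def map_legacy_material(legacy_row: dict) -> dict:
        return {
            "HEADER_SEGMENT": {
                # pad to an 18-character material number (SAP-style convention)
                "MATERIAL": legacy_row["item_number"].upper().zfill(18),
                "MATERIAL_TYPE": legacy_row["material_type"],
                "BASE_UNIT": legacy_row["base_unit"],
            },
            "DESCRIPTION_SEGMENT": {
                "LANGUAGE": "EN",
                # SAP short texts are limited in length, so truncate defensively
                "DESCRIPTION": legacy_row["description"][:40],
            },
        }

    legacy = {"item_number": "ab-1001", "material_type": "FERT",
              "base_unit": "EA", "description": "Carbide insert, 12mm"}
    print(map_legacy_material(legacy))

In the real BP content this shaping happens inside SAP Data Services jobs and the result is posted to SAP as IDOCs; the point here is simply that most of the work is mapping the legacy layout field by field onto the target structure.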

SAP also provides a standard data migration methodology, framework, and templates based on SAP Best Practices and SAP Data Services. The methodology has phases to Analyze, Extract, Cleanse, Validate, Upload, and Reconcile legacy data into an SAP ERP environment.
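As a rough illustration of how those phases fit together, here is a minimal sketch in Python. This is not SAP-delivered code; each function merely stands in for work that would actually be done in SAP Data Services jobs and the BP content, and the sample field names and rule are invented for the example.

    # Hypothetical end-to-end flow: Analyze -> Extract -> Cleanse -> Validate
    # -> Upload -> Reconcile. Field names and rules are illustrative only.
    def analyze(rows):
        """Profile the legacy data (row counts, missing values, etc.)."""
        return {"rows": len(rows),
                "missing_type": sum(1 for r in rows if not r.get("material_type"))}

    def extract(rows):
        """Pull records from the legacy source (here: already in memory)."""
        return list(rows)

    def cleanse(rows):
        """Apply simple standardization, e.g. trimming stray whitespace."""
        return [{k: v.strip() if isinstance(v, str) else v for k, v in r.items()}
                for r in rows]

    def validate(rows):
        """Keep only records that pass business rules (one illustrative rule)."""
        return [r for r in rows if r.get("material_type")]

    def upload(rows):
        """Hand the records to the load mechanism (IDOCs in the BP content)."""
        print(f"Staging {len(rows)} record(s) for the IDOC load")

    def reconcile(loaded, source):
        """Compare what was loaded against what was extracted."""
        print(f"Loaded {len(loaded)} of {len(source)} source record(s)")

    source = [{"material_type": "FERT", "description": " Carbide insert "},
              {"material_type": "",     "description": "missing material type"}]
    print(analyze(source))
    good = validate(cleanse(extract(source)))
    upload(good)
    reconcile(good, source)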

This method of data migration using SAP Best Practices and IDOCs works very well when no customization is required. In other words, if a customer has a standard, vanilla SAP ECC implementation, this method works just GREAT. For example, the SAP Best Practices pre-built job for Material Master loads data according to the standard ECC Material Master IDOC structure. If the customer needs more fields, or a custom table is to be loaded as part of Material Master, it is easy to modify or extend the SAP Best Practices ETL code; however, modifying the BP code alone will not suffice. The corresponding SAP IDOCs need to be modified or extended as well, which may or may not be allowed by the customer’s SAP Basis team. The customer will also need SAP ABAP/IDOC expertise on the project to modify the IDOC structures, and many customers prefer not to modify standard IDOCs.

Another scenario where SAP Best Practices will not work is when there is no one-to-one mapping between the input and output data. In other words, if a master data element to be converted into SAP ECC depends on more than one dimension of the input data, SAP Best Practices will not handle it. For example, if sales org A in the legacy system is to be converted into sales org B in SAP ECC, SAP Best Practices will work great. However, if there are three sales orgs A, B, and C in the legacy systems and only one sales org D is needed in SAP ECC, with the value dependent on several dimensions of the source data such as sales org, plant, and country code, SAP Best Practices cannot handle this conversion scenario, at least as of today. In that case, a good amount of customization needs to be done to the SAP Best Practices code, tables, scripts, etc., which may not be worth the effort and may impact the integrity of the SAP Best Practices content that depends on the modified code.
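Here is a minimal sketch of that multi-dimensional mapping, again in plain Python with invented values. The one-to-one case is just a rename, while the second case needs a cross-reference table keyed on all three source dimensions; that lookup logic is what falls outside the delivered Best Practices jobs.

    # Simple case: legacy sales org A maps one-to-one to B in ECC.
    ONE_TO_ONE = {"A": "B"}

    # Harder case: the target sales org depends on (sales org, plant, country).
    # The keys and values below are hypothetical examples.
    CROSS_REF = {
        ("A", "P100", "US"): "D",
        ("B", "P200", "US"): "D",
        ("C", "P300", "DE"): "D",
    }

    def map_sales_org(sales_org: str, plant: str, country: str) -> str:
        try:
            return CROSS_REF[(sales_org, plant, country)]
        except KeyError:
            # In a real migration this record would go to an error/review
            # table instead of being loaded into ECC.
            raise ValueError(f"No mapping rule for ({sales_org}, {plant}, {country})")

    print(map_sales_org("A", "P100", "US"))   # -> "D"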

A similar approach is taken for data migration from one or many SAP systems and legacy systems into an SAP ECC system. In this option, you may have multiple SAP systems on different releases, say one on 4.6C and one on 4.7, and you want to consolidate them into a single ECC 6.0 system. You can use SAP Data Services to extract data from the old SAP and non-SAP systems and use the same methodology, framework, and SAP Best Practices to load data into SAP ECC, similar to what we discussed above.


Poor Data – Don’t Just Treat Symptoms, Treat The Cause.

The Data Warehousing Institute estimates that data quality problems currently cost U.S. businesses over $600 billion annually. Even with figures like these to guide us, it is still very difficult to use metrics to determine the cost of poor data quality and its effects on your organization, because the point where a mistake is made may be far removed from the point where it is recognized. Errors are very hard to repair, especially when systems extend far across the enterprise, and the final impact is very unpredictable.

Have you ever considered how much time and how many resources your organization spends on correcting, fixing, and analyzing corrupted or erroneous data? What about the cost of delayed information exchange or lost revenue due to misplaced data or incorrect input? Evaluating data and finding errors is a time-consuming process, not to mention the time needed to correct them. In a time of shrinking budgets, some organizations may not have the resources for such projects and may not even be aware of the problem. Others may be spending all their time fixing problems, leaving no time to work on preventing them.

According to several leading data quality managers, the cost of poor data quality can be expressed as a simple formula:

Cost of Poor Data Quality = Lost Business Value + Cost to Prevent Errors + Cost to Correct Errors + Cost of Validation

Loss of business value can be HUGE and can lead to business interruptions as well. Let’s use an example to illustrate the cost of fixing an element of poor data:

  1. A staff person spends about 40% of their time each day on this task.
  2. There are five people performing this operation (5 x 3.2 hours = 16 staff hours per day).
  3. Accounting tells you that these people earn $45 per hour (payroll + benefits).
  4. Total annual cleanup effort is 4,000 hours (16 staff hours x 250 annual working days). This means the annualized cost to fix the known poor data is $180,000; see the short calculation below.
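The same arithmetic, written out as a short script so the assumptions are explicit and easy to adjust for your own organization (the numbers are the example figures above, not benchmarks):

    # Annualized cost of manually fixing known poor data (example figures).
    people = 5                      # staff doing manual cleanup
    hours_per_day = 8 * 0.40        # 40% of an 8-hour day = 3.2 hours each
    working_days = 250              # working days per year
    loaded_rate = 45                # $/hour, payroll + benefits

    staff_hours_per_day = people * hours_per_day         # 16 hours/day
    annual_hours = staff_hours_per_day * working_days    # 4,000 hours/year
    annual_cost = annual_hours * loaded_rate             # $180,000/year

    print(f"Annualized cost to fix known poor data: ${annual_cost:,.0f}")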

This cost of poor data quality extends far beyond the cost to fix it. It spreads through and across the business enterprise like a virus, affecting systems from shipping and receiving to accounting and customer service. Eventually, your customers may lose patience with you, and you may lose their business.

Let’s look at the traditional approach to cleansing data whenever a data quality issue is recognized in a business. The traditional approach corrects bad data that has already been created, using data quality or ETL tools, and it generally happens whenever there is an urgent need to fix the bad data, either because of a data migration effort or because of a business problem.

This approach suffers from three problems:

First, data cleansing and repository building are almost always carried out on a project-by-project basis. Even if the project is successful and bad data is transformed into good data, the repository starts to degrade in the absence of any ongoing data quality sustenance measures. More newly created bad data creeps into the system, and the data already cleansed starts to go stale. Data has a shelf life and needs constant care and feeding. Without addressing how bad data is created, these solutions are costly and unsustainable.

Second, it’s difficult to get the business side fully committed to and involved in these projects. Without a change of mindset, data continues to be seen as IT’s responsibility. To exacerbate the problem, the software tools used were meant for an IT user base, which leaves the business without a way to participate directly in the process. Without full and sustained business engagement, these projects often do not yield the anticipated benefits.

Third, it is very, very hard to fix bad data using technical tools alone. A computer algorithm for data cleansing, no matter how cleverly constructed, can only address a very small subset of data problems. If a value looks plausible but is simply wrong, a data cleansing package would not even be able to detect that there is a problem, let alone fix it.

By and large, though, these efforts treat the symptoms of the disease that surfaced rather than addressing the root cause. Strictly speaking, these projects represent a cost of bad data on top of the degradation of business performance. An organization can instead take the opportunity to find the root causes of bad data and identify the people, process, or technology issues behind them. Once the root causes are identified, there MUST be a data governance strategy sponsored, implemented, and owned by the business.

The bottom line is that data ownership and data content shouldn’t be IT’s responsibility. And with data volume and complexity exploding, the treadmill is spinning faster than the traditional approach can keep up.

HANA – Next Generation Data Management?

HANA stands for High-Performance Analytic Appliance. HANA is what is called an in-memory appliance: it loads substantial amounts of data from traditional disk storage into main memory, which allows data retrieval and logical processing to happen in memory at very high speed. In-memory computing holds data in RAM instead of reading it from disk, providing a performance boost. HANA, which SAP launched last year, can tap data from both SAP and non-SAP sources, and the company has also started rolling out a series of specialized applications aimed at specific business problems. HANA is now available in appliance form; hardware from Hewlett-Packard, Dell, and IBM, among others, has been certified to run it.

This appliance is usually made up of blade processing systems with large amounts of memory. The technology is not new and has in fact been around for years in various forms; SAP has been using in-memory appliances for years with its BW Accelerator offering.

The main difference with HANA is that SAP is building the HANA appliances with specific logic based on the SAP solution each one supports. Once HANA is released for certain business processes, the appliance can literally be plugged into the network, identified to the corresponding SAP solution, and turned loose! The appliance runs side by side with the traditional SAP solution, allowing amazing speed increases for typical business processes. We are talking hours to seconds here.

HANA entered ramp-up in the fourth quarter of 2010 and remains there today. According to SAP, General Availability (GA) is targeted for the first half of 2011. SAP has also listed the HANA solutions expected to come out first, most likely in Q3 or Q4 of this year, per the following schedule:

  • Strategic workforce planning (out now)
  • Sales and operations planning (Q3, 2011)
  • Cash and liquidity management (Q3, 2011)
  • Trade promo management (Q4, 2011)
  • Smart meter analysis (Q4, 2011)
  • Profitability engine (Q4, 2011)
  • Customer revenue performance management (Q4, 2011)
  • Merchandising and assortment management (Q4, 2011)
  • Energy management for utility customers (Q4, 2011)
  • Customer-specific pricing (Q4, 2011)
  • Intelligent Payment Broker (Q4, 2011)
  • SAP BW on HANA (Service Pack release in 2011)

Back to the technology: the benchmark numbers reported so far are very promising. For example, SAP ran benchmarks with IBM on real-world ERP business scenarios and was able to run more than 10,000 queries an hour against 1.3 terabytes of data made up of SAP ERP sales data tables; the query response time was literally seconds. Other numbers now being published include reading and processing 460 billion records in less than 60 seconds on a 3-terabyte machine, and tests on massive reports show latency being reduced from 2 hours to 5 seconds. These are very impressive results. Also remember that each HANA appliance release has targeted algorithms based on the solution it is supporting; in other words, the HANA appliance is somewhat intelligent in that, when matched with the SAP solution, it targets the correct database tables and views for in-memory placement so as to provide that massive performance lift.

The benefits of HANA to the SAP customer base are significant. HANA will allow real-time decision making in areas where it was never possible before. One example given is that the CEO of a big company like SAP can get information about any SAP sales pursuit, for any product in any part of the globe, almost in real time (in seconds) on his iPad.

Even with the additional hardware needed to support the HANA solution, there should be a TCO case based on the business value that HANA will bring. There are also some challenges to address around overall landscape design, sizing, tuning, and disaster recovery, but these should be worked through quickly and should not in any way impede the value of moving quickly to this new technology platform as your business demands.