Unlocking Data Insights: SAP ECC to Snowflake Integration with Qlik on Azure Cloud

Introduction

In today’s data-driven world, businesses are constantly seeking ways to harness the power of their data for strategic decision-making. As organizations evolve and embrace modern data warehousing solutions, integrating data from diverse sources becomes crucial. In this blog post, we will explore how to unlock valuable data insights by seamlessly transferring data from SAP ECC to Snowflake using Qlik, a leading data integration and analytics platform.

Why Integrate SAP ECC with Snowflake?

SAP ECC is a widely used enterprise resource planning (ERP) system that holds a wealth of critical business data. Snowflake, on the other hand, is a highly scalable and cloud-native data warehousing solution known for its exceptional performance and flexibility. By integrating SAP ECC with Snowflake, businesses can unlock several benefits:

  1. Unified Data Platform: Snowflake provides a centralized platform for storing and analyzing data from SAP ECC alongside other data sources, enabling a holistic view of the organization’s operations.
  2. Scalability and Performance: Snowflake’s architecture allows for effortless scalability, accommodating large volumes of data while delivering exceptional query performance, even with complex analytical workloads.
  3. Cost Efficiency: With Snowflake’s pay-as-you-go pricing model, businesses can optimize costs by scaling resources based on demand, eliminating the need for upfront infrastructure investments.
  4. Advanced Analytics and Insights: By leveraging Qlik’s powerful data integration and analytics capabilities, businesses can combine data from SAP ECC with other sources, enabling advanced analytics, data visualization, and real-time insights.

Guide to SAP ECC to Snowflake Integration using Qlik on Azure Cloud

Proposed Reference Architecture

The following outlines the basic reference architecture for the Enterprise Data Platform (EDP) on the Azure cloud platform.

Critical Points of the Architecture

  • Qlik Replicate is used to move raw data to Snowflake.
  • Qlik Compose is the tool to model relationships, build data warehouses and generate data marts for each SAP module. Qlik Compose also has pre-built SAP data models.
  • Easy to adopt additional services, such as machine learning or OLAP models, to work on the data down the line (a minimal consumption sketch follows this list).
  • Can be integrated with partner or customer API endpoints, as either a consumer or a producer.
  • Can easily connect to popular end-user tools such as Excel, Qlik Sense or Power BI.
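
As an illustration of the last three points, the short sketch below pulls curated warehouse data from Snowflake into a pandas DataFrame, a typical starting point for a downstream machine learning or ad hoc analysis workload. It is a minimal sketch using the snowflake-connector-python package; the connection details, schema and table names are placeholders chosen for this example, not part of the reference architecture itself.

```python
# Minimal sketch: consume curated warehouse data from Snowflake for downstream
# analysis or ML. Connection details and the data mart table name are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",      # placeholder Snowflake account identifier
    user="analyst_user",       # placeholder user
    password="********",
    warehouse="EDP_WH",        # placeholder virtual warehouse
    database="EDP_DW",         # placeholder data warehouse database
    schema="DATA_MART",        # placeholder data mart schema
)
try:
    cur = conn.cursor()
    cur.execute("SELECT * FROM SALES_BY_MONTH LIMIT 1000")  # placeholder mart table
    # Requires the pandas extra: pip install "snowflake-connector-python[pandas]"
    df = cur.fetch_pandas_all()
    print(df.head())
finally:
    conn.close()
```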

Components of the Reference Architecture

  • Data Warehouse: Snowflake Enterprise Edition hosted on Azure
  • Data Integration: Qlik Replicate
  • Data Modeling and ETLs: Qlik Compose for modeling relationships and defining the ETLs that build the warehouse and data marts for reporting
  • Networking: Hub-and-spoke topology. The main hub network has connectivity to on-premises systems and is peered with the EDP spoke networks.
  • Monitoring: Azure Monitor and Qlik monitoring apps
  • Deployment: Terraform
  • Code Repository: Azure DevOps Repos
  • Analytics: Power BI, Qlik Sense, etc.

Supported SAP Packages for Integration Tooling

  • Primarily SAP ERP / ECC 6.0 + all EhP levels
  • All modules are supported.
  • Also supports the SAP CRM, SRM, GTS and MDG applications

Limitations for SAP Integration Tooling

When using SAP ECC as a source endpoint in a Qlik Replicate task, the following limitations apply:

  • A task with a SAP Application source and a File Channel target may replicate some tables twice – the requested table and the underlying table. To prevent this from happening, exclude the underlying table from the list of tables to be replicated.
  • When a task is defined with a SAP Application source, the Applied Changes Details monitoring metrics in the Change Processing tab may be incorrect for clustered and pooled tables.
  • During Full Load of clustered and pooled tables, the Estimated Count value in Monitor view may be incorrect. Additionally, as the Progress Bar is based on the Estimated Count value, it may not represent an accurate portrayal of the actual task progress.
  • Replication of views is not supported.
  • Changes applied to cluster tables are in transactional mode only. If you need the Batch Optimized Apply functionality, consider using the SAP Application (DB) endpoint instead.
  • Unlimited LOB mode is not supported.
  • The source_lookup Data Enrichment function is not supported. For more information on this function, see Data Enrichment functions.
  • When replicating from the HR module, only transparent tables are supported.

Data Zones

Data Zones are critical logical and physical constructs for data management. They form the building blocks of the EDP by providing working zones for different types of users. In practical terms, they are separate containers inside your data warehouse storage.

  1. The SAP ECC data sources are loaded into the landing area using Qlik Replicate. While the landing area is defined as the target in the replication task, it must be defined as the data source in a Compose for Data Warehouses task. The landing area can either be a schema or a database in the Snowflake data warehouse (depending on how it is defined in the replication task). When using Qlik Replicate to move the source tables to the landing area, you can define either a Full Load replication task or a Full Load and Store Changes replication task. Defining a Full Load and Store Changes replication task is recommended if you want to constantly propagate the source table changes to the data warehouse (a minimal landing-area sketch follows this list).
  2. After the source tables are present in the landing area, Compose for Data Warehouses auto-generates a model based on the data source(s). Several methods are available for generating the model.
  3. After the model has been created, you can extend the model with derived attributes, if needed.
  4. Once the model has been finalized, instruct Compose for Data Warehouses to create the data warehouse tables (i.e. the staging tables and the actual data warehouse tables).
  5. You can fine-tune the mappings between the tables in the landing area and the staging tables. You can also create custom ETLs and expressions as required. The Mappings ETL and the Custom ETLs are collectively referred to as the “ETL Set”.
  6. After the proper mappings are in place and the ETL Set has been generated, a single click is all it takes to extract the source data and load it into the data warehouse.
  7. Once the data warehouse has been loaded, you can then select which tables to use as the fact, dimension and aggregation tables in your data mart(s).
  8. When the data mart design is complete, Compose for Data Warehouses creates the data mart tables, generates the data mart ETLs and loads the data mart with the selected data.
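
To make the landing area from step 1 concrete, here is a minimal sketch that inspects a hypothetical landing schema in Snowflake after a Qlik Replicate full load. It uses the snowflake-connector-python package; the account, warehouse, database, schema and table names (for example the SAP material master table MARA) are placeholders for illustration, not something generated by the Qlik tooling itself.

```python
# Minimal sketch: inspect a hypothetical landing schema in Snowflake after a
# Qlik Replicate full load. Connection details and object names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",          # placeholder Snowflake account identifier
    user="etl_user",               # placeholder user
    password="********",
    warehouse="EDP_WH",            # placeholder virtual warehouse
    database="EDP_DB",             # placeholder database
    schema="LANDING",              # landing area defined as a schema (see step 1)
)
try:
    cur = conn.cursor()
    # List the tables Replicate has landed, e.g. SAP tables such as MARA or BKPF.
    cur.execute("SHOW TABLES IN SCHEMA EDP_DB.LANDING")
    for row in cur.fetchall():
        print(row[1])              # second column of SHOW TABLES is the table name

    # Sanity-check a row count before Compose picks this schema up as its data source.
    cur.execute("SELECT COUNT(*) FROM EDP_DB.LANDING.MARA")
    print("MARA rows:", cur.fetchone()[0])
finally:
    conn.close()
```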

Note that even though setting up the initial project involves both manual and automatic operations, once the project is set up, you can automate the ETL tasks using the Compose for Data Warehouses scheduler.

Data Integrations, Modeling and ETL Process

The following diagram illustrates the integration from SAP ECC to Snowflake.

The following steps are involved in creating an end-to-end SAP data pipeline for Snowflake powered by Qlik Data Integration (QDI):

  1. Use a Qlik Replicate task to move raw data from SAP ECC to Snowflake. The task replicates the SAP source tables and creates materialized tables in Snowflake. Tasks can be created for a full load only, or with the option to apply changes after the full load for continuous integration.
  2. Now that the raw data has been moved to Snowflake, the next step is to turn it into usable content using Qlik Compose. The first part is to take the raw shorthand German field codes and apply SAP metadata so the fields can be given friendly names that business users can easily understand (illustrated in the first sketch following this list).
  3. Once we start to understand the data, we need to model relationships. This is where Qlik Compose is used to map the flow and transformations of SAP data from the different extractors, which are now fact and dimension tables, and how they relate to one another.
  4. Then we create the underlying warehouse. All SQL and DDL is pushed down to run directly on Snowflake using Snowflake virtual warehouses. Creating the warehouse with Qlik Compose means that surrogate keys and slowly changing dimensions are managed by Compose (a simplified example appears in the second sketch following this list).
  5. Once we have the warehouse, we can get the data ready for analytics by creating analytical data marts.
    1. These data marts can be created as transactional entities, or
    2. These data marts can be created as aggregated data marts.
  6. The type of mart is governed by the reporting grain, which can be managed in Qlik Compose.
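
To illustrate step 2, the sketch below shows the kind of friendly-name mapping being described. Qlik Compose applies SAP metadata for this automatically; here the idea is shown with a hand-written dictionary and a pandas DataFrame standing in for a table landed from SAP. The field codes come from the SAP material master table MARA; the friendly names chosen are examples.

```python
# Illustrative only: mapping SAP technical (German shorthand) field codes to
# business-friendly names, as described in step 2. Qlik Compose does this using
# SAP metadata; this hand-written dictionary just demonstrates the concept.
import pandas as pd

# A few well-known fields from the SAP material master table MARA.
FRIENDLY_NAMES = {
    "MATNR": "material_number",
    "MTART": "material_type",
    "MATKL": "material_group",
    "ERSDA": "created_on",
}

def apply_friendly_names(df: pd.DataFrame) -> pd.DataFrame:
    """Rename SAP technical columns to business-friendly names."""
    return df.rename(columns=FRIENDLY_NAMES)

# Tiny in-memory sample standing in for the landed MARA table.
raw = pd.DataFrame(
    {"MATNR": ["100-100"], "MTART": ["FERT"], "MATKL": ["01"], "ERSDA": ["20240101"]}
)
print(apply_friendly_names(raw).columns.tolist())
# ['material_number', 'material_type', 'material_group', 'created_on']
```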

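Step 4 notes that surrogate keys and slowly changing dimensions are managed by Compose and that all SQL is pushed down to Snowflake. Qlik Compose generates that SQL itself; the snippet below is only a simplified, hand-written sketch of what a slowly changing dimension (Type 2) load pushed down to Snowflake can look like. The database, schema, table, sequence and column names are placeholders, and a production pattern would also insert a new current row for changed records in a second step.

```python
# Simplified sketch of the push-down pattern from step 4: an SCD Type 2 style load
# run directly on Snowflake. Qlik Compose generates equivalent (far more complete)
# SQL on its own; all object names here are placeholders.
import snowflake.connector

SCD2_SQL = """
MERGE INTO EDP_DW.DIM_MATERIAL AS dim
USING EDP_DB.LANDING.MARA AS src
  ON dim.material_number = src.MATNR AND dim.is_current = TRUE
WHEN MATCHED AND dim.material_group <> src.MATKL THEN UPDATE SET
    -- Close out the current version when a tracked attribute changes.
    is_current = FALSE,
    valid_to   = CURRENT_TIMESTAMP()
WHEN NOT MATCHED THEN INSERT
    -- New materials get a fresh surrogate key from a sequence.
    (material_sk, material_number, material_group, valid_from, valid_to, is_current)
  VALUES
    (EDP_DW.SEQ_MATERIAL_SK.NEXTVAL, src.MATNR, src.MATKL,
     CURRENT_TIMESTAMP(), NULL, TRUE)
"""

conn = snowflake.connector.connect(
    account="my_account", user="etl_user", password="********",
    warehouse="EDP_WH", database="EDP_DB",   # placeholder connection details
)
try:
    conn.cursor().execute(SCD2_SQL)
finally:
    conn.close()
```
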
Conclusion

Integrating SAP ECC with Snowflake using Qlik unlocks a world of data insights and empowers organizations to make informed decisions. By following the step-by-step guide outlined in this blog post, businesses can seamlessly transfer data from SAP ECC to Snowflake, leveraging Qlik’s robust data integration and analytics capabilities. With a unified data platform, scalability, cost efficiency, and advanced analytics at their disposal, organizations can drive innovation, enhance operational efficiency, and gain a competitive edge in today’s data-driven landscape.
