The SAP CDS View Extraction Pipeline is an 8-phase Java pipeline that runs on
sapidess4 and calls SAP through JCo RFC. It extracts CDS view dependencies, metadata, and SQL definitions
from SAP S/4HANA and writes them to CSV files. A Fivetran SDK connector then loads those files
as Parquet into Google Cloud Storage buckets.
Data flow: CSV files in /usr/sap/cds_sql_only/ → connector.py reads them → Fivetran SDK pushes them as Parquet files to gs://sap_cds_dbt/sap_cds_views/ in Google Cloud Storage.

Production directory on sapidess4:

    /usr/sap/cds_sql_only/
      connector.py                            # Fivetran SDK connector (reads CSV, pushes to GCS)
      configuration.json                      # Root CDS views to extract
      drivers/
        installation.sh                       # Setup script
        configuration.json                    # JCo connection parameters
        sapjco3.jar                           # SAP JCo library
        libsapjco3.so                         # Native JCo library (Linux)
        gson.jar                              # JSON parsing
        SimpleDependencyTable.java/.class     # Phase 1
        Phase2Recursive.java/.class           # Phase 2
        Phase3Descriptions.java/.class        # Phase 3
        Phase4FieldMetadataFixed.java/.class  # Phase 4
        Phase5SqlDefinitions.java/.class      # Phase 5
        Phase6SqlOnlyViews.java/.class        # Phase 6
        Phase7ViewMetadata.java/.class        # Phase 7
        Phase8ChainResolution.java/.class     # Phase 8
        DependencyResolver.java/.class        # Shared: recursive dependency resolver
        DependencyNode.java/.class            # Shared: tree node model
        DependencyGraph.java/.class           # Shared: graph data structure
        TestZCdsSqlViews.java/.class          # RFC test harness
        SQL_NAME_MAPPING.csv                  # CDS entity → HANA SQL name mapping

Phase 1 performs a breadth-first search (BFS) over the SAP DDLDEPENDENCY table, starting from
the root CDS views specified in configuration.json. It discovers the full dependency tree
of all objects referenced by the root views.

| Java class | RFC | Output |
|---|---|---|
| SimpleDependencyTable | RFC_READ_TABLE on DDLDEPENDENCY | SIMPLE_DEPENDENCY_TABLE.csv |

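
The traversal described above can be sketched in a few lines. This is an illustration only, not the code in SimpleDependencyTable: the `children_of` callback and the `EDGES` sample data are hypothetical stand-ins for rows read from DDLDEPENDENCY.

```python
from collections import deque


def bfs_dependencies(roots, children_of):
    """Return every object reachable from the roots, mapped to its BFS depth."""
    depth = {root: 0 for root in roots}
    queue = deque(roots)
    while queue:
        current = queue.popleft()
        for child in children_of(current):
            if child not in depth:  # visit each object only once
                depth[child] = depth[current] + 1
                queue.append(child)
    return depth


# Hypothetical edges standing in for DDLDEPENDENCY rows (parent -> children).
EDGES = {
    "ROOT_VIEW": ["VIEW_A", "VIEW_B"],
    "VIEW_A": ["TABLE_X"],
    "VIEW_B": ["TABLE_X", "VIEW_A"],
}

result = bfs_dependencies(["ROOT_VIEW"], lambda name: EDGES.get(name, []))
```

Tracking visited objects in `depth` keeps shared dependencies such as `TABLE_X` from being expanded twice and guards against cycles.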
Phase 2 uses the custom Z_CDS_DEPENDENCIES RFC plus RFC_READ_TABLE to
recursively resolve all dependencies and classify each object as a CDS view, database table,
table function, or other type.

| Java class | RFC | Output |
|---|---|---|
| Phase2Recursive | Z_CDS_DEPENDENCIES, RFC_READ_TABLE | PHASE2_DEPENDENCIES.csv |

Phase 3 extracts human-readable descriptions for all discovered objects from the SAP DD02T
table (Data Dictionary texts). It runs twice in the pipeline: once after Phase 2, and again after
Phase 8 to pick up newly discovered objects.

| Java class | RFC | CSV output | Parquet file |
|---|---|---|---|
| Phase3Descriptions | RFC_READ_TABLE on DD02T | OBJECT_DESCRIPTIONS.csv | all_descriptions |

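
As a rough illustration of what an RFC_READ_TABLE round trip looks like, the sketch below builds the call parameters and splits the returned `WA` strings into fields. `QUERY_TABLE`, `DELIMITER`, `FIELDS`, `OPTIONS`, and `DATA`/`WA` are the standard RFC_READ_TABLE interface; the helper names and the sample result are hypothetical, and the pipeline itself does this in Java through JCo.

```python
def read_table_params(table, fields, where="", delimiter="|"):
    """Build keyword arguments for an RFC_READ_TABLE call."""
    params = {
        "QUERY_TABLE": table,
        "DELIMITER": delimiter,
        "FIELDS": [{"FIELDNAME": name} for name in fields],
    }
    if where:
        # Each OPTIONS row holds one fragment of the WHERE clause (72 chars max).
        params["OPTIONS"] = [{"TEXT": where}]
    return params


def parse_rows(result, delimiter="|"):
    """Turn the DATA/WA strings of an RFC_READ_TABLE result into dicts."""
    names = [field["FIELDNAME"] for field in result["FIELDS"]]
    rows = []
    for entry in result["DATA"]:
        # Limit the split so a delimiter inside the last field is preserved.
        values = entry["WA"].split(delimiter, len(names) - 1)
        rows.append(dict(zip(names, (value.strip() for value in values))))
    return rows


params = read_table_params("DD02T", ["TABNAME", "DDLANGUAGE", "DDTEXT"],
                           where="DDLANGUAGE = 'E'")

# Hypothetical result in the shape RFC_READ_TABLE returns.
fake_result = {
    "FIELDS": params["FIELDS"],
    "DATA": [{"WA": "MARA                          |E|General Material Data"}],
}
descriptions = parse_rows(fake_result)
```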
Phase 8 is the core engine of the pipeline. It uses the custom Z_CDS_SQL_VIEWS RFC to perform
BFS chain resolution, parsing BASEINFO structures to determine which dependencies are real
data sources (FROM clause) and which are association pointers. It implements the
ignore_association logic, which can cut the dependency tree roughly in half.

| Java class | RFC | Output |
|---|---|---|
| Phase8ChainResolution | Z_CDS_SQL_VIEWS | sql_only_resolution, sql_only_objects |

Each BASEINFO structure contains a FROM array (real data dependencies) and an ASSOCIATED array
(association pointers). Objects in the ASSOCIATED array get ignore_association = YES
and can be safely excluded from SQL translation.
Phase 4 extracts detailed field-level metadata for every object using RFC_READ_TABLE
on the SAP Data Dictionary tables DD03L (field definitions) and
DD03VT (field descriptions).

| Java class | RFC | CSV output | Parquet file |
|---|---|---|---|
| Phase4FieldMetadataFixed | RFC_READ_TABLE on DD03L, DD03VT | {OBJECT_NAME}_METADATA.csv | metadata_all |

Phase 5 retrieves the actual HANA CREATE VIEW SQL statement for each CDS view using
the custom Z_VIEW_DDL RFC. These are the native HANA SQL definitions that
can be translated to ANSI SQL for various target platforms (Snowflake, Databricks, etc.).

| Java class | RFC | Output |
|---|---|---|
| Phase5SqlDefinitions | Z_VIEW_DDL | {view}_sql_definition |

Phase 6 extends Phase 5 by retrieving both the HANA SQL and the ABAP SQL definition for
"SQL-only" views, that is, views identified by Phase 8 as needing only real data dependencies (no associations).
It uses both Z_VIEW_DDL and Z_CDS_DEPENDENCIES.

| Java class | RFC | Output |
|---|---|---|
| Phase6SqlOnlyViews | Z_VIEW_DDL, Z_CDS_DEPENDENCIES | {view}_abap_sql_definition, {view}_hana_sql_definition |

Phase 7 retrieves comprehensive Data Dictionary metadata using the custom Z_VIEW_GET_META
RFC, which returns data from four SAP tables in a single call: DD02L (table header),
DD02T (table texts), DD26S (view structure/joins), and
DD03L (field definitions).

| Java class | RFC | Output |
|---|---|---|
| Phase7ViewMetadata | Z_VIEW_GET_META | dd02l_all, dd02t_all, dd26s_all, dd03l_all |

Four custom Z_* function modules were developed in ABAP and deployed to the SAP system to support the pipeline.

Z_CDS_SQL_VIEWS is the BFS chain resolution engine and the most complex of the four RFCs. It walks the CDS dependency tree and returns BASEINFO structures with FROM/ASSOCIATED arrays.
| Direction | Parameter | Type | Description |
|---|---|---|---|
| IMPORTING | IV_VIEWNAME | CHAR(30) | CDS view name to resolve |
| IMPORTING | IV_MAX_DEPTH | INT4 | Maximum recursion depth |
| EXPORTING | ET_RESOLUTION | TABLE | Resolved dependency chain with BASEINFO |
| EXPORTING | ET_OBJECTS | TABLE | All discovered objects with ignore_association flag |
Z_VIEW_DDL returns the HANA native CREATE VIEW SQL definition for a given CDS view.
| Direction | Parameter | Type | Description |
|---|---|---|---|
| IMPORTING | IV_VIEWNAME | CHAR(30) | View name (ABAP Dictionary name) |
| EXPORTING | EV_SQL | STRING | Full HANA CREATE VIEW SQL statement |
| EXPORTING | EV_ABAP_SQL | STRING | ABAP CDS SQL definition (if available) |
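
To make the interface above concrete, here is a small sketch of a caller that maps the exporting parameters to a record. It assumes an `rfc_call(function_name, **params)` callable that returns a dict of exporting parameters (the shape used by, for example, pyrfc's `Connection.call`); `fetch_view_sql` and `fake_rfc_call` are hypothetical names, and the production pipeline makes this call from Java through JCo.

```python
def fetch_view_sql(rfc_call, view_name):
    """Call Z_VIEW_DDL for one view and return its SQL definitions."""
    result = rfc_call("Z_VIEW_DDL", IV_VIEWNAME=view_name)
    return {
        "view": view_name,
        "hana_sql": result.get("EV_SQL", ""),
        # EV_ABAP_SQL is only filled "if available", so default to empty.
        "abap_sql": result.get("EV_ABAP_SQL", ""),
    }


def fake_rfc_call(function_name, **params):
    """Hypothetical stand-in for a real RFC connection, used for illustration."""
    assert function_name == "Z_VIEW_DDL"
    return {"EV_SQL": 'CREATE VIEW "' + params["IV_VIEWNAME"] + '" AS SELECT ...'}


record = fetch_view_sql(fake_rfc_call, "IPRODUCT")
```

Passing the RFC callable in as an argument keeps the mapping logic testable without an SAP connection.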
Z_CDS_DEPENDENCIES walks the CDS dependency tree and returns parent-child relationships with object classification.
| Direction | Parameter | Type | Description |
|---|---|---|---|
| IMPORTING | IV_VIEWNAME | CHAR(30) | Root view name |
| IMPORTING | IV_DEPTH | INT4 | Max traversal depth |
| EXPORTING | ET_DEPENDENCIES | TABLE | Dependency tree with classification |
Z_VIEW_GET_META returns comprehensive Data Dictionary metadata (DD02L, DD02T, DD26S, DD03L) for a view in a single call.
| Direction | Parameter | Type | Description |
|---|---|---|---|
| IMPORTING | IV_VIEWNAME | CHAR(30) | View name |
| EXPORTING | ET_DD02L | TABLE | Table/view header records |
| EXPORTING | ET_DD02T | TABLE | Table/view description texts |
| EXPORTING | ET_DD26S | TABLE | View structure (join conditions) |
| EXPORTING | ET_DD03L | TABLE | Field definitions |
SAP HANA SQL extracted from CDS views requires mechanical translation before it can run on target platforms like Snowflake or Databricks. The translation falls into three categories:
| HANA SQL Pattern | ANSI / Target SQL | Notes |
|---|---|---|
| `"SAPHANADB"."TABLENAME"` | `TABLENAME` | Remove schema prefix entirely |
| `N'text'` | `'text'` | Remove N prefix (national character literal) |
| `/*..CARDINALITY..*/` | (remove) | HANA cardinality hints, strip completely |
| `SESSION_CONTEXT('CDS_CLIENT')` | `'100'` | Replace with hardcoded client value |
| `"alias"."field"` | `alias.field` | Adjust quoting per target dialect |
| `IFNULL(x, y)` | `COALESCE(x, y)` | Standard ANSI equivalent |
| `TO_NVARCHAR(x)` | `CAST(x AS STRING)` | Standard string cast |
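
Most of the rewrites in this table are mechanical enough to sketch with regular expressions. The function below is a minimal illustration under stated assumptions, not the pipeline's translator: `normalize_hana_sql` is a hypothetical name, the schema is assumed to be `SAPHANADB`, `TO_NVARCHAR` is only handled when its argument has no nested parentheses, and identifier quoting is left for a dialect-specific step. Regexes do not understand string literals, so a real translator would need a SQL parser.

```python
import re


def normalize_hana_sql(sql, client="100"):
    """Apply the mechanical HANA -> ANSI rewrites from the table above."""
    # Strip comments that mention CARDINALITY; the tempered pattern keeps
    # a match from running across the end of an unrelated earlier comment.
    sql = re.sub(r"/\*(?:(?!\*/).)*?CARDINALITY(?:(?!\*/).)*?\*/", "", sql,
                 flags=re.DOTALL)
    # "SAPHANADB"."TABLENAME" -> TABLENAME
    sql = re.sub(r'"SAPHANADB"\."([^"]+)"', r"\1", sql)
    # SESSION_CONTEXT('CDS_CLIENT') -> hardcoded client literal
    sql = re.sub(r"SESSION_CONTEXT\(\s*N?'CDS_CLIENT'\s*\)",
                 lambda match: "'" + client + "'", sql, flags=re.IGNORECASE)
    # N'text' -> 'text'
    sql = re.sub(r"\bN'", "'", sql)
    # IFNULL(x, y) -> COALESCE(x, y)
    sql = re.sub(r"\bIFNULL\s*\(", "COALESCE(", sql, flags=re.IGNORECASE)
    # TO_NVARCHAR(x) -> CAST(x AS STRING), simple arguments only
    sql = re.sub(r"\bTO_NVARCHAR\s*\(([^()]*)\)", r"CAST(\1 AS STRING)", sql,
                 flags=re.IGNORECASE)
    return sql


hana = """SELECT IFNULL("A"."MATNR", N'') AS MATNR, TO_NVARCHAR("A"."BRGEW") AS WEIGHT
FROM "SAPHANADB"."MARA" "A" /* CARDINALITY N:1 */
WHERE "A"."MANDT" = SESSION_CONTEXT('CDS_CLIENT')"""

translated = normalize_hana_sql(hana)
```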
| SAP HANA Function | Google Standard SQL | Snowflake | Databricks |
|---|---|---|---|
| DATS_TIMS_TO_TSTMP | PARSE_TIMESTAMP | TO_TIMESTAMP_NTZ | TO_TIMESTAMP |
| TSTMP_TO_DATS | FORMAT_TIMESTAMP | TO_VARCHAR | DATE_FORMAT |
| FLTP_TO_DEC | CAST(x AS NUMERIC) | TO_DECIMAL | CAST(x AS DECIMAL) |
| TSTMP_CURRENT_UTCTIMESTAMP | CURRENT_TIMESTAMP() | CURRENT_TIMESTAMP() | CURRENT_TIMESTAMP() |
| ABAP_SYSTEM_TIMEZONE | 'UTC' | 'UTC' | 'UTC' |
| TSTMP_TO_DST | FORMAT_TIMESTAMP | TO_VARCHAR | DATE_FORMAT |
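
One way to use this table in a translator is as a lookup keyed by function and dialect, plus a scan that reports which SAP functions a statement contains. The sketch below is an assumption about how such a lookup could be organised; the dialect keys (`bigquery`, `snowflake`, `databricks`) and helper names are hypothetical. The target functions take different arguments than the SAP ones, so the lookup only identifies the replacement and does not rewrite the call.

```python
import re

# Copied from the table above; "bigquery" stands for Google Standard SQL.
FUNCTION_MAP = {
    "DATS_TIMS_TO_TSTMP": {"bigquery": "PARSE_TIMESTAMP", "snowflake": "TO_TIMESTAMP_NTZ", "databricks": "TO_TIMESTAMP"},
    "TSTMP_TO_DATS": {"bigquery": "FORMAT_TIMESTAMP", "snowflake": "TO_VARCHAR", "databricks": "DATE_FORMAT"},
    "FLTP_TO_DEC": {"bigquery": "CAST(x AS NUMERIC)", "snowflake": "TO_DECIMAL", "databricks": "CAST(x AS DECIMAL)"},
    "TSTMP_CURRENT_UTCTIMESTAMP": {"bigquery": "CURRENT_TIMESTAMP()", "snowflake": "CURRENT_TIMESTAMP()", "databricks": "CURRENT_TIMESTAMP()"},
    "ABAP_SYSTEM_TIMEZONE": {"bigquery": "'UTC'", "snowflake": "'UTC'", "databricks": "'UTC'"},
    "TSTMP_TO_DST": {"bigquery": "FORMAT_TIMESTAMP", "snowflake": "TO_VARCHAR", "databricks": "DATE_FORMAT"},
}


def target_function(hana_function, dialect):
    """Look up the replacement for one SAP function in one dialect."""
    return FUNCTION_MAP[hana_function.upper()][dialect]


def functions_needing_translation(sql):
    """List the mapped SAP functions that occur in a SQL statement."""
    return sorted(
        name for name in FUNCTION_MAP
        if re.search(r"\b" + name + r"\b", sql, flags=re.IGNORECASE)
    )


found = functions_needing_translation(
    "SELECT DATS_TIMS_TO_TSTMP(erdat, erzet, ABAP_SYSTEM_TIMEZONE(mandt)) FROM vbak"
)
```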
| Feature | Google Standard SQL | Snowflake | Databricks |
|---|---|---|---|
| Identifier Quoting | Backticks `name` | Double quotes "name" | Backticks `name` |
| Schema Prefix | dataset.table | schema.table | catalog.schema.table |
| String Type | STRING | VARCHAR | STRING |
| Timestamp Type | TIMESTAMP | TIMESTAMP_NTZ | TIMESTAMP |
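| Integer Division (/) | Returns FLOAT64 (use DIV for an integer result) | Returns a decimal NUMBER, not truncated | Returns DOUBLE (use div for an integer result) |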
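
The quoting and schema-prefix rows can be captured in a small helper. This is a sketch under assumptions: the dialect keys and function names are hypothetical, and names that already contain the quote character are rejected rather than escaped, because escaping rules differ per platform.

```python
# Quote character per dialect, from the "Identifier Quoting" row above.
QUOTE_CHAR = {"bigquery": "`", "snowflake": '"', "databricks": "`"}


def quote_identifier(name, dialect):
    """Wrap one identifier in the target dialect's quote character."""
    quote = QUOTE_CHAR[dialect]
    if quote in name:
        raise ValueError("identifier contains the quote character: " + name)
    return quote + name + quote


def qualified_name(parts, dialect):
    """Join quoted parts, e.g. dataset.table or catalog.schema.table."""
    return ".".join(quote_identifier(part, dialect) for part in parts)


bigquery_name = qualified_name(["sap", "IPRODUCT"], "bigquery")
snowflake_name = qualified_name(["SAP", "IPRODUCT"], "snowflake")
databricks_name = qualified_name(["main", "sap", "IPRODUCT"], "databricks")
```

Snowflake treats double-quoted identifiers as case-sensitive, so the case used here has to match the case used when the object was created.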
Every CDS view has three distinct names. Understanding the mapping is critical for SQL translation.
| Layer | Example | Where Used |
|---|---|---|
| CDS Entity Name | I_Product | ABAP CDS source code, annotations |
| ABAP Dictionary Name | IPRODUCT | SAP Data Dictionary (SE11), DD03L, RFC_READ_TABLE |
| HANA SQL Name | "IPRODUCT" | HANA CREATE VIEW statements, HANA catalog |
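
For the example in the table, the three names can be derived from one another as sketched below. The uppercase-and-strip-underscores rule is only a heuristic taken from the `I_Product` example; where it does not hold, an explicit mapping such as the one in SQL_NAME_MAPPING.csv should take precedence. The function names and the `mapping` argument (a dict of CDS entity name to ABAP Dictionary name) are hypothetical.

```python
def dictionary_name(entity_name, mapping=None):
    """Return the ABAP Dictionary name for a CDS entity name."""
    if mapping and entity_name in mapping:
        return mapping[entity_name]  # an explicit mapping always wins
    # Heuristic fallback: I_Product -> IPRODUCT
    return entity_name.replace("_", "").upper()


def hana_sql_name(entity_name, mapping=None):
    """Return the quoted name as it appears in HANA CREATE VIEW statements."""
    return '"' + dictionary_name(entity_name, mapping) + '"'


# Hypothetical override, as it might be loaded from SQL_NAME_MAPPING.csv.
overrides = {"I_SomeLongEntityName": "ISOMELONGNAME"}

abap_name = dictionary_name("I_Product")
hana_name = hana_sql_name("I_Product")
overridden = dictionary_name("I_SomeLongEntityName", overrides)
```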
The ignore_association flag is the key optimization in Phase 8. It classifies each dependency as either a real data source or an association pointer.
ignore_association = NO: the object appears in the BASEINFO FROM array. This is a real data dependency: the view's SQL SELECT actually reads data from this object, so it must be included in any SQL translation.
ignore_association = YES: the object appears in the BASEINFO ASSOCIATED array. This is an association pointer: it defines a navigation path, but the view does not read data from it directly, so it can be safely excluded.
Excluding ignore_association = YES objects can cut the
dependency tree roughly in half, dramatically reducing the number of views
that need SQL translation.

    -- Example: BASEINFO structure returned by Z_CDS_SQL_VIEWS
    BASEINFO: {
      "FROM": [
        "MARA",           -- real table dependency (ignore_association = NO)
        "IPRODUCT"        -- real view dependency (ignore_association = NO)
      ],
      "ASSOCIATED": [
        "I_PRODUCTTEXT",  -- association pointer (ignore_association = YES)
        "I_PLANT"         -- association pointer (ignore_association = YES)
      ]
    }

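
The classification rule can be expressed directly against a BASEINFO structure like the one in the example. This is a sketch, not the Phase 8 code: the function names are hypothetical, and letting FROM win when an object appears in both arrays is an assumption the source does not confirm.

```python
def classify(baseinfo):
    """Map each referenced object to its ignore_association flag."""
    flags = {}
    for name in baseinfo.get("ASSOCIATED", []):
        flags[name] = "YES"
    for name in baseinfo.get("FROM", []):
        # Assumption: an object that is read in the FROM clause is kept
        # even if it is also listed as an association.
        flags[name] = "NO"
    return flags


def objects_to_translate(flags):
    """Return the objects that still need SQL translation."""
    return sorted(name for name, flag in flags.items() if flag == "NO")


baseinfo = {
    "FROM": ["MARA", "IPRODUCT"],
    "ASSOCIATED": ["I_PRODUCTTEXT", "I_PLANT"],
}
flags = classify(baseinfo)
needed = objects_to_translate(flags)
```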
All Parquet files are stored in the gs://sap_cds_dbt/sap_cds_views/ GCS Bucket, pushed via the Fivetran SDK connector.
| Parquet File | Source Phase | Key Columns | Description |
|---|---|---|---|
| all_descriptions | Phase 3 | TABNAME, DDLANGUAGE | DD02T descriptions for all objects |
| metadata_all | Phase 4 | TABNAME, FIELDNAME | Field-level metadata (DD03L + DD03VT) |
| {view}_sql_definition | Phase 5 | VIEW_NAME | HANA CREATE VIEW SQL per view |
| {view}_abap_sql_definition | Phase 6 | VIEW_NAME | ABAP CDS SQL definition per view |
| {view}_hana_sql_definition | Phase 6 | VIEW_NAME | HANA native SQL definition per view |
| sql_only_resolution | Phase 8 | VIEW_NAME, DEPTH | Full BFS chain resolution results |
| sql_only_objects | Phase 8 | OBJECT_NAME | All objects with ignore_association flag |
| dd02l_all | Phase 7 | TABNAME | Table/view header (DD02L) |
| dd02t_all | Phase 7 | TABNAME, DDLANGUAGE | Table/view texts (DD02T) |
| dd26s_all | Phase 7 | VIEWNAME, TABNAME | View structure/joins (DD26S) |
| dd03l_all | Phase 7 | TABNAME, FIELDNAME | Field definitions (DD03L) |
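
The key columns listed above give a simple way to check extracted rows for duplicates before they are loaded, for example after an accidental second pipeline run. The sketch below assumes the listed key columns identify a row uniquely, which the table implies but does not state; `KEY_COLUMNS` covers only a few of the files and `duplicate_keys` is a hypothetical helper.

```python
from collections import Counter

# Key columns for a few of the Parquet files, copied from the table above.
KEY_COLUMNS = {
    "all_descriptions": ["TABNAME", "DDLANGUAGE"],
    "metadata_all": ["TABNAME", "FIELDNAME"],
    "sql_only_objects": ["OBJECT_NAME"],
    "dd26s_all": ["VIEWNAME", "TABNAME"],
}


def duplicate_keys(rows, table):
    """Return the key tuples that occur more than once in the rows."""
    key_columns = KEY_COLUMNS[table]
    counts = Counter(tuple(row[column] for column in key_columns) for row in rows)
    return sorted(key for key, count in counts.items() if count > 1)


rows = [
    {"TABNAME": "MARA", "DDLANGUAGE": "E", "DDTEXT": "General Material Data"},
    {"TABNAME": "MARA", "DDLANGUAGE": "D", "DDTEXT": "Allgemeine Materialdaten"},
    {"TABNAME": "MARA", "DDLANGUAGE": "E", "DDTEXT": "General Material Data"},
]
duplicates = duplicate_keys(rows, "all_descriptions")
```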
Each phase has a configured timeout. Exceeding the timeout usually indicates an RFC connectivity issue or an unexpectedly large dependency tree.
| Phase | Class | Timeout | Typical Duration |
|---|---|---|---|
| Phase 1 | SimpleDependencyTable | 10 min | 2-5 min |
| Phase 2 | Phase2Recursive | 10 min | 3-7 min |
| Phase 3 | Phase3Descriptions | 10 min | 1-3 min |
| Phase 4 | Phase4FieldMetadataFixed | 120 min | 60-90 min |
| Phase 5 | Phase5SqlDefinitions | 15 min | 5-10 min |
| Phase 6 | Phase6SqlOnlyViews | 15 min | 5-10 min |
| Phase 7 | Phase7ViewMetadata | 30 min | 15-20 min |
| Phase 8 | Phase8ChainResolution | 60 min | 30-45 min |
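
If the phases are launched from a wrapper script, the timeouts in this table can be enforced from outside the JVM. The sketch below assumes such a wrapper exists, which the source does not say (the troubleshooting notes suggest the timeouts live in the Java code); `run_phase` and `build_command` are hypothetical, and the classpath is the one used for compilation elsewhere in this document.

```python
import subprocess

# Timeouts in minutes, copied from the table above.
PHASE_TIMEOUT_MIN = {
    "SimpleDependencyTable": 10,
    "Phase2Recursive": 10,
    "Phase3Descriptions": 10,
    "Phase4FieldMetadataFixed": 120,
    "Phase5SqlDefinitions": 15,
    "Phase6SqlOnlyViews": 15,
    "Phase7ViewMetadata": 30,
    "Phase8ChainResolution": 60,
}

CLASSPATH = ".:sapjco3.jar:gson.jar"


def build_command(class_name):
    """Build the java command line for one phase class."""
    return ["java", "-cp", CLASSPATH, class_name]


def run_phase(class_name, workdir):
    """Run one phase and report 'ok', 'failed', or 'timeout'."""
    timeout_seconds = PHASE_TIMEOUT_MIN[class_name] * 60
    try:
        completed = subprocess.run(build_command(class_name), cwd=workdir,
                                   timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return "timeout"
    return "ok" if completed.returncode == 0 else "failed"
```

A call such as `run_phase("Phase3Descriptions", "/usr/sap/cds_sql_only/drivers")` would then return `"timeout"` instead of hanging when SAP stops responding.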
| Problem | Cause | Fix |
|---|---|---|
| Java not found on sapidess4 | PATH not set for Java runtime | Run export PATH=$PATH:/usr/lib/jvm/java-17-openjdk/bin or check installation.sh |
| Empty GCS Bucket | connector.py did not find CSV files | Check CSV output in /usr/sap/cds_sql_only/drivers/; verify phases completed successfully |
| Duplicate rows in GCS Parquet | Pipeline ran twice without clearing previous output | Delete existing CSV files before re-running; use WRITE_DISPOSITION = WRITE_TRUNCATE |
| Compilation error on .java file | Missing classpath entries (sapjco3.jar, gson.jar) | Compile with: javac -cp .:sapjco3.jar:gson.jar ClassName.java |
| RFC_SYSTEM_FAILURE | SAP work process crash or function module error | Check SAP SM21 system log; restart the pipeline phase |
| RFC_COMMUNICATION_FAILURE | Network timeout or SAP gateway down | Verify SAP is running (sapcontrol -nr 03 -function GetProcessList); check JCo config |
| Phase timeout exceeded | Too many objects or slow RFC responses | Reduce root views in configuration.json; increase timeout in Java code; check SAP work process availability |
| Z_CDS_SQL_VIEWS returns empty | View name mismatch (CDS entity vs ABAP Dictionary name) | Use the ABAP Dictionary name (uppercase, no underscores); check SQL_NAME_MAPPING.csv |
| connector.py Fivetran SDK error | SDK version mismatch or missing Python dependencies | Check `pip list \| grep fivetran`; ensure SDK is installed and connector.py uses correct API version |
| GCS upload fails | Service account permissions or bucket path wrong | Verify gs://sap_cds_dbt/sap_cds_views/ exists; check service account has Storage Object Creator role |