SAP Datasphere: Automation of modeling


SAP Datasphere offers a very easy-to-use interface for developing data models. However, especially in implementation projects with a large number of data models, the interface quickly reaches its limits, because manual development takes a lot of time.

There is an alternative way to configure Spaces, including data models: the Command Line Interface (CLI) of SAP Datasphere. Using our Automation Framework as an example, we show what opportunities the CLI opens up and automate parts of data modeling in SAP Datasphere.

Problem definition

SAP Datasphere is to be introduced as a data warehouse. Initially, tables from the source systems are to be provided in central ingest spaces, one per source system. Of course, the necessary remote tables etc. can be developed manually, but as the number of tables grows, so does the amount of repetitive manual work. Added to this is the requirement that changes to the structures of the source tables should also be adopted in Datasphere. Today, this is handled through development processes and manual activities.

Fig. 1: Problem definition | isr.de

However, creating all the necessary spaces, including remote tables, manually would take a long time, depending on their number. Departments would have to wait longer for results and would only benefit from Datasphere later. This raises the question of whether there is a better way than performing recurring tasks manually.

Excursus Command Line Interface (CLI)

The SAP Datasphere interfaces are easy and intuitive to use. This is very good for smaller work packages and for business users in particular. For larger, repetitive developments, however, an alternative programmatic approach would be ideal.

This is where the Command Line Interface (CLI) comes in. The CLI is a standalone Node.js module and is available on npmjs.com. It allows access to Datasphere tenants via the command line, so that "bulk work" or recurring tasks that would otherwise require a lot of manual effort can be accelerated. Examples include:

  • Reading the space definition (tables, members, etc.)
  • Creation of spaces, tables, views, etc.
  • Assignment of users to Spaces

Jascha Kanngiesser has also described the possibilities in detail. JSON is used as the format, i.e. the object definitions are exchanged in this format. Below we present a few example use cases:

Integration in Git

In SAP Datasphere, there is currently no version management of development objects as developers know it from software development, for example with Git. However, the definition of spaces can be exported to a JSON file via the CLI. If these JSON files are stored in Git, you gain a proven version management system. Automating the versioning (exporting the JSON and committing it to Git) is conceivable using third-party solutions (e.g. the ISR Automation Framework), but is not available out of the box. Git integration in the Datasphere interface itself is currently not foreseeable.
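Such a versioning step can be sketched in a few lines of Node.js. The CLI call below is based on the documented `datasphere spaces read` command, but the exact flags differ between CLI versions, and the tenant URL and space name are purely illustrative; treat this as a sketch, not a finished implementation.

```javascript
// Sketch: version a space definition in Git via the Datasphere CLI.
// The CLI call is an assumption based on `datasphere spaces read` -
// check `datasphere spaces read --help` for the flags of your version.

// Build the CLI call that exports a space definition to a JSON file.
function exportCommand(host, space, outFile) {
  return `datasphere spaces read --host ${host} --space ${space} --output ${outFile}`;
}

// Build the Git commands that put the exported file under version control.
function gitCommands(outFile, space) {
  return [`git add ${outFile}`, `git commit -m "Snapshot of space ${space}"`];
}

// Usage (requires a logged-in CLI session and a Git working copy):
//   const { execSync } = require("child_process");
//   execSync(exportCommand("https://mytenant.eu10.hcs.cloud.sap", "SALES", "SALES.json"));
//   gitCommands("SALES.json", "SALES").forEach((cmd) => execSync(cmd));
```

Run regularly (e.g. via a scheduler), this would give you a change history of every space definition without any manual export.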

Fig. 2: Command Line Interface (CLI) | isr.de

Developer Spaces / One Tenant Strategy

This scenario is based on a working Git integration. Currently, nothing is locked in Datasphere when a developer edits, for example, a view. This means that a second developer can edit the same view at the same time, which entails some restrictions and requires clear development processes.

The following scenario would be interesting:

Fig. 3: Developer sandbox | isr.de

Developers pull the definition of the productive space (JSON) from Git, and a space is created under a different name (e.g. DEV1_<SPACE_NAME>) for development. The development is carried out and accepted. A merge is then performed in Git with any other developments of a release. Finally, the agreed space definition is imported into the productive space via the CLI. Some steps would have to be automated here to work efficiently (e.g. renaming the developer space, creating the developer space, etc.).
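The renaming step could look roughly like this. We assume, for illustration only, that the exported JSON uses the space's technical name as its top-level key; real definitions contain considerably more nesting.

```javascript
// Sketch: derive a developer space definition from the productive one by
// renaming the top-level key (assumed, simplified structure: the space's
// technical name is the top-level key of the exported JSON).
function renameSpace(definition, prefix) {
  const [spaceName] = Object.keys(definition);
  return { [`${prefix}_${spaceName}`]: definition[spaceName] };
}

// Example: turn the productive SALES definition into DEV1_SALES.
const prodDef = { SALES: { spaceDefinition: { label: "Sales" } } };
const devDef = renameSpace(prodDef, "DEV1");
// devDef now has the single top-level key "DEV1_SALES".
```

The resulting file could then be imported via the CLI to create the developer's sandbox space.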

You can pursue a "one tenant" strategy or work with several tenants. We find this scenario very exciting because it would allow large development teams to work in parallel without descending into chaos.

Unfortunately, the scenario currently fails because the JSON files cannot be merged. The procedure described therefore only shows the potential of the approach. It will be interesting to see whether merging becomes possible in the future; then even larger development teams could be coordinated well.

Alternative transport

The transport system in SAP Datasphere is very easy to use. Note, however, that the technical names on the source and target tenants must be identical: connections and spaces must have the same names as in production, and there is no renaming or mapping. There is often a requirement to use system abbreviations as names for connections, just as for ingest spaces. It can then be confusing that connections and spaces on the dev/QA tenant carry the productive system abbreviations in their names.

"Renaming" can be realized with the CLI: the definition is exported on the dev/QA tenant via the CLI, the objects are renamed accordingly in the JSON file(s), and the result is then imported into the productive Datasphere via the CLI. This is certainly a very simple example, but it shows that the CLI offers a high degree of flexibility for more complex scenarios.

In general, this process should also be automated in order to avoid both manual effort and errors. At the same time, it is conceivable to integrate the deployment into a CI/CD process in order to automate such adjustments and the deployment itself.
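The rename step of such a pipeline could be sketched as follows. The connection mapping and the JSON shape are illustrative assumptions; a plain text replacement is deliberately blunt and would need to be hardened against accidental substring matches in a real pipeline.

```javascript
// Sketch of the rename step in a CI/CD deployment: replace dev/QA connection
// names with their productive counterparts before importing. Mapping and JSON
// shape are assumptions; adapt them to your naming conventions.
const connectionMapping = { DEV_S4: "PRD_S4", DEV_BW: "PRD_BW" };

function mapConnections(definitionJson, mapping) {
  // Blunt text replacement over the raw JSON - sufficient as a sketch,
  // but beware of unintended substring matches in production use.
  let result = definitionJson;
  for (const [from, to] of Object.entries(mapping)) {
    result = result.split(from).join(to);
  }
  return result;
}

// After the rename, the import itself would be one CLI call, e.g.:
//   datasphere spaces create --host <host> --file-path <renamed.json>
// (check `--help`; command and flags differ between CLI versions).
const devJson = JSON.stringify({ SALES: { remoteTable: { connection: "DEV_S4" } } });
const prodJson = mapConnections(devJson, connectionMapping);
// prodJson now references "PRD_S4" instead of "DEV_S4".
```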

Fig. 4: Transport | isr.de

Automation of modeling and quality checks

Let's come back to the problem in which a large number of objects has to be created in Datasphere. Unsurprisingly, we also see this as a very good use case for the CLI. All that is needed are the appropriate JSON files for import into Datasphere. An excerpt from such a JSON file is shown below.

Fig. 5: Extract from a JSON file | isr.de

Manually creating a JSON file, including the names of the tables etc., does not speed anything up at first, especially if additional views based on the remote tables are to be created and all columns have to be defined. This only pays off if the creation of the JSON files and, ideally, the import into Datasphere are automated. The process can be automated with the ISR Automation Framework.

Fig. 6: ISR Automation Framework | isr.de

The Automation Framework is a Node.js application that runs locally or in a virtual environment. Using the framework and the CLI, data models can be created in SAP Datasphere in just a few minutes. In addition to the initial creation, a regular automatic synchronization of source and target structures can also be implemented.

Below we outline how the framework works together with SAP Datasphere and generates the data models in SAP Datasphere:

Fig. 7: Process in detail | isr.de

1) Admin Space

An Excel file is used to define which source tables are required from which system. This file is saved as control table(s) in a dedicated space. The control tables specify which remote tables / views are to be created in which spaces in SAP Datasphere.
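Conceptually, a control table holds one row per requested table, and the framework groups the rows by target space. The column names below are our own illustrative assumptions, not the framework's actual schema.

```javascript
// Sketch: control-table rows from the Admin Space (column names assumed),
// grouped by the space in which the objects are to be created.
const controlRows = [
  { sourceSystem: "S4", sourceTable: "MARA", targetSpace: "ING_S4" },
  { sourceSystem: "S4", sourceTable: "MAKT", targetSpace: "ING_S4" },
  { sourceSystem: "BW", sourceTable: "TCURR", targetSpace: "ING_BW" },
];

// Group the requested source tables by their target space.
function groupBySpace(rows) {
  return rows.reduce((acc, row) => {
    (acc[row.targetSpace] ??= []).push(row.sourceTable);
    return acc;
  }, {});
}

// groupBySpace(controlRows)
// → { ING_S4: ["MARA", "MAKT"], ING_BW: ["TCURR"] }
```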

2) Determine requirement

The control tables represent the requirements for the Automation Framework. Open SQL access to the Admin Space is used to query which objects are to be created in SAP Datasphere.

3) Determine metadata

The framework is also connected to the source systems, which makes it possible to query the metadata of the source tables. Among other things, this validates that a table exists at all and determines which columns it contains.

4) Creation of JSON file for the command line interface

With the help of the previous steps, the necessary JSON files can be generated automatically. To do this, a script is run that knows the JSON syntax Datasphere expects and generates the file accordingly.
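The generation step can be sketched as a simple mapping from source metadata to a definition file. The structure below is a deliberately simplified assumption; the JSON the CLI actually expects is much richer (CSN entity definitions, connection references, full data types, etc.).

```javascript
// Sketch: generate a simplified space-definition JSON from source metadata.
// The shape is illustrative only - the real Datasphere definition format
// is richer than this.
function buildSpaceDefinition(spaceName, tables) {
  const definitions = {};
  for (const table of tables) {
    definitions[table.name] = {
      kind: "entity",
      elements: Object.fromEntries(
        table.columns.map((col) => [col.name, { type: col.type }])
      ),
    };
  }
  return { [spaceName]: { spaceDefinition: { label: spaceName }, definitions } };
}

// Metadata as determined in step 3 (hypothetical example):
const metadata = [
  { name: "MARA", columns: [{ name: "MATNR", type: "cds.String" }] },
];
const json = JSON.stringify(buildSpaceDefinition("ING_S4", metadata), null, 2);
// `json` can now be written to disk and imported via the CLI.
```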

5) Creation of data models in SAP Datasphere

Finally, the JSON file is imported into SAP Datasphere via the command line interface and all objects are created. Dependencies between Spaces can also be created if, for example, views in a Consumption Space are to point to the remote tables in Ingest Spaces.

Below you can see a screenshot of the interface in which the necessary information is stored.

Fig. 8: Screenshot of the DWC interface | isr.de

Regardless of this, automated processes should be monitored. The CLI can be used to query the space definition after the import into Datasphere and compare it with the expected result (see screenshot below). For replications, the quality checks can be extended to include a comparison of the source with the target table and similar tests.
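Such a check boils down to a diff between the expected and the exported definition. The sketch below uses the same simplified definition shape as above and only compares object names; a real check would also compare columns and data types.

```javascript
// Sketch of a quality check: compare the object names of the expected
// definition with those exported from Datasphere after the import and
// report anything missing (simplified, assumed definition shape).
function missingObjects(expected, actual) {
  const expectedNames = Object.keys(expected.definitions ?? {});
  const actualNames = new Set(Object.keys(actual.definitions ?? {}));
  return expectedNames.filter((name) => !actualNames.has(name));
}

const expected = { definitions: { MARA: {}, MAKT: {} } };
const actual = { definitions: { MARA: {} } };
// missingObjects(expected, actual) → ["MAKT"]
```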

Fig. 9: Screenshot of the ISR Automation Framework | isr.de

Have we piqued your interest? Read our reference report on a large Datasphere project in which considerable acceleration was achieved by automating the modeling.


Conclusion

The Command Line Interface offers great potential for complex scenarios and for automating activities, and it will be exciting to see how its range of functions develops. The example of the ISR Automation Framework clearly shows that using the CLI can yield considerable savings, for example in the following areas:

  • Generation of Spaces
  • Generation of data models
  • Generation of users
  • Comparison of the source systems with the Datasphere data models
  • Carrying out quality checks

We hope that our article has given you a good overview of the potential of the Command Line Interface.


Christopher Kampmann
Head of Business Unit
Data & Analytics
christopher.kampmann@isr.de
+49 (0) 151 422 05 448

About ISR

We have been operating as IT consultants for data analytics and document logistics since 1993 and focus on data management and the automation of processes.
We provide holistic support within the framework of comprehensive Enterprise Information Management (EIM), from strategic IT consulting to specific implementations and solutions through to IT operations.
ISR is part of the CENIT EIM Group.
