NTT DATA Business Solutions | March 4, 2024 | 10 mins

Navigating SAP Datasphere: Unveiling the Good, the Not-So-Good, and the Future

SAP Datasphere, formerly known as Data Warehouse Cloud, has evolved into a robust cloud data platform, showcasing continuous growth and adaptability. With seamless updates and improvements over the past two years, it offers a streamlined experience, although some areas may still require refinement in the future.

SAP Datasphere isn’t just a data platform; it’s a modern SaaS solution that weaves together Connectivity, Storage, ETL (Extract, Transform, Load), and Analytical capabilities. But let’s look beyond the technical jargon. After nearly two years of hands-on experience, I’ve come to appreciate both its strengths and areas that warrant a closer look.
In this blog post, I’ll unveil my top five favourites (the features that make working with Datasphere a joy) and highlight the five aspects that still need a gentle nudge in the right direction. So buckle up, fellow data enthusiasts, as we explore the multifaceted world of SAP Datasphere! (Warning: this next bit might get a little technical 😊)

What is SAP Datasphere?

SAP Datasphere – previously known as SAP Data Warehouse Cloud – is a cloud tool which is in some ways analogous to Microsoft Fabric.
You can connect to sources of data; build views graphically or via SQL directly; store instances of the source data or the evaluated data; and provide that data for storage or publication in front-end tools.
It’s not just an SAP tool, but has some specific capabilities for connection to SAP systems.

My favourite bits:

1 – As a cloud-based (SaaS) application, Datasphere receives updates and new features automatically – we get continuous improvement.

The last two years have yielded significant improvements in functionality. Here are a few examples:

  • A very recent update (Sep 23) added capabilities to determine delta changes within SAP Datasphere. This means you can use the new Transformation Flow to update subsequent data targets with data changes.
  • Different instances of a table with different filter criteria were introduced. In the SAP environment, you could have a view called Contracts linked to SAP table EKKO (Purchase Order Header), which selects order type ‘K’ (contract), and a separate view called Purchase Orders selecting order type ‘F’. These selections are passed to the query retrieving data from the source system. This is very handy for large tables that serve different uses.
  • Enhanced Data Viewer features: originally without a lot of functionality, the Data Viewer has improved. The filtering options available when viewing data are now much more comprehensive. You can also adjust the filters, columns and sorting in one step; previously, each action required a refresh.
  • View functions are gradually being improved, with ranking and ROW_NUMBER among the more recent additions.
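To make the filtered-instances idea concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Datasphere itself. It assumes the K/F selection is held in a column named BSTYP and the document number in EBELN; the sample rows, the LIFNR values and the view name PurchaseOrders are made up for illustration.

```python
import sqlite3

# Views that this sketch knows about (illustrative names, not Datasphere objects).
KNOWN_VIEWS = {"Contracts", "PurchaseOrders"}


def build_demo_db():
    """Create an in-memory stand-in for EKKO with two filtered views on top."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE EKKO (EBELN TEXT PRIMARY KEY, BSTYP TEXT, LIFNR TEXT)"
    )
    conn.executemany(
        "INSERT INTO EKKO VALUES (?, ?, ?)",
        [
            ("4500000001", "F", "V100"),  # purchase order
            ("4500000002", "F", "V200"),  # purchase order
            ("4600000001", "K", "V100"),  # contract
        ],
    )
    # Each view carries its own filter, so consumers never see the other type.
    conn.execute("CREATE VIEW Contracts AS SELECT * FROM EKKO WHERE BSTYP = 'K'")
    conn.execute("CREATE VIEW PurchaseOrders AS SELECT * FROM EKKO WHERE BSTYP = 'F'")
    return conn


def document_numbers(conn, view_name):
    """Return the document numbers visible through one of the filtered views."""
    if view_name not in KNOWN_VIEWS:
        raise ValueError(f"unknown view: {view_name}")
    rows = conn.execute(f"SELECT EBELN FROM {view_name} ORDER BY EBELN").fetchall()
    return [row[0] for row in rows]
```

Calling `document_numbers(build_demo_db(), "Contracts")` returns only the contract row, which is the behaviour the two Datasphere views give you on top of one large source table.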

 

2 – View definition and general functionality deliver everything they should, i.e. I’m not experiencing functionality gaps or a lack of reliability.

Maybe that doesn’t sound like much, but really, we need to get the job done without stumbling over product bugs and massive functionality holes. Datasphere is relatively new, so this is a real plus in a world of rapidly changing cloud software.

By way of example:

  • Graphical views work as expected, and SQL views allow me to do pretty much everything else that I need;
  • Dataflows and Task Chain functionality are simple and reliable;
  • Connections to sources are really pretty simple to configure;
  • The browser-based interface requires no additional components to be installed or updated – gone are pesky client tool updates;
  • Connecting to get data into reporting tools like Power BI or Power Automate is easy, and performance is as I would expect, depending upon dataset size.

 

3 – Datasphere supports Table Replication and View Persistence. It’s possible to achieve a balance between Real-Time reading of data and Replication of data, within reason.

  • Replication of source tables very simply gives a significant improvement in performance. While smaller datasets including some texts and smaller master data tables can be read in real-time from the source, big tables can be replicated into Datasphere.
  • You can also Persist Views in Datasphere. This gives a layer of transformed data. Persisted views can push the selection criteria back into the source database, which is particularly useful when dealing with large source tables (for example BSEG or MSEG in the SAP environment).
  • When you have a very complex view, perhaps relying on a mix of Real-Time and Replicated sources, View Persistence can also create an instance of this transformed view of data, that can then be used to support reporting or further analysis.
  • So you can use a combination of Real-Time, Table Replication and View Persistence to achieve performance and reliability.
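The trade-off between reading live and persisting can be sketched outside Datasphere. The example below is only an analogy built on Python's sqlite3 module: `totals_live` stands in for a real-time view, `totals_persisted` for a persisted one, and all table names and figures are hypothetical.

```python
import sqlite3

# Names used only in this sketch; they are not Datasphere objects.
READABLE = {"totals_live", "totals_persisted"}


def setup():
    """Create a small source table and a live (non-persisted) view over it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE source_items (doc TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO source_items VALUES (?, ?)",
        [("A", 10.0), ("A", 5.0), ("B", 7.5)],
    )
    conn.execute(
        "CREATE VIEW totals_live AS "
        "SELECT doc, SUM(amount) AS total FROM source_items GROUP BY doc"
    )
    return conn


def persist_view(conn):
    """Store a snapshot of the view's result, replacing any earlier snapshot."""
    conn.execute("DROP TABLE IF EXISTS totals_persisted")
    conn.execute("CREATE TABLE totals_persisted AS SELECT * FROM totals_live")


def totals(conn, name):
    """Read totals per document from the live view or from the snapshot."""
    if name not in READABLE:
        raise ValueError(f"unknown object: {name}")
    return dict(conn.execute(f"SELECT doc, total FROM {name}").fetchall())
```

After `persist_view(conn)`, new rows in `source_items` show up in `totals_live` immediately but only reach `totals_persisted` on the next refresh. That is the same balance described above: the live view is always current but pays the query cost every time, while the snapshot is fast but only as fresh as its last run.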

 

4 – Data Flows and Task Chains are easy to use and reliable.

Well, reliable as long as the source system is not undergoing maintenance 😊

I find Data Flows and Task Chains pretty simple to set up. They may not offer as many options as pipelines in Microsoft Azure Data Factory (ADF), but they are reliable and easy to configure. A handy recent addition is the ability to pause schedules, for example, to allow for source system maintenance.

 

5 – Data Viewer is an essential tool which has been improving.

Data Viewer is a visualisation tool I tend to use heavily in my development work. It can now show an entire dataset, not just the first 1000 records, although it would be easier if it were possible to select a record limit. Still, you can use Data Viewer in the SQL view definition for that.

The functionality of the Data Viewer has been improving, as I mentioned in my first Like.

 

My not-so-favourite bits:

1 – The web-based interface tends to become ‘forgetful’ after a few hours of use: edits can be lost unless saved/activated immediately, and modifying the layout of graphical views becomes a problem.

Although the browser-based interface seems perfectly functional to me, after using it for several hours, it starts forgetting your edits, so you need to save frequently.

For example, when editing a graphical view, you change a formula and leave the formula editor without saving; when you go back in, your change has not been remembered. Or, if you change the layout of a graphical view, it starts dragging objects you didn’t select. Annoying.

But the fix is quite simple: save your work frequently (probably good practice in any content-creating software 😊), and simply close your browser session and start a new one when these problems occur.

 

2 – The formatting of views changes when inserting a new object, and there’s no automatic way to readjust the layout.

From my perspective, the default layout in the graphic view leaves room for improvement. Our team tends to favour a more square and orderly arrangement. However, there’s a hiccup: whenever we add or delete an object, the layout stubbornly reverts to its default state. Unfortunately, there’s no automatic adjustment feature available.

For views of any size, this translates to extra manual work in reorganizing the layout, which is a less-than-ideal situation.

 

3 – No Transporting between Spaces

Change Control is something that all SAP products do really well, and whilst there is a formal Transport between Datasphere tenants, there is no way to transport between Spaces within one tenant. Why would you need to do this?

Many customers will only purchase a 2-tier Datasphere landscape, i.e. Production and Non-Production, yet still want to emulate a 3-tier development lifecycle, i.e. Dev, Test and Prod. This is simply achieved in the Non-Production tenant by having Spaces dedicated to Development and Spaces (where objects are duplicated) dedicated to Testing – hence the need for a transport mechanism from a Development Space to a Test Space.

The only way to do this at present is to:

  • export a JSON file;
  • adjust it manually for any views or tables that should be pointed at different Spaces;
  • import it into the target Space.

A bit error-prone and time-consuming. Not ideal.
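The manual adjustment step could, in principle, be scripted. The following is a rough sketch only: the JSON structure shown is invented for illustration and is not the real Datasphere export format, and the assumption that references look like `SPACE.OBJECT` is mine. A real export should be inspected before anything like this is attempted.

```python
import json

# Hypothetical export content; the real file layout will differ.
SAMPLE_EXPORT = """
{
  "space": "DEV",
  "views": [
    {"name": "Contracts", "source": "DEV.EKKO"},
    {"name": "Shared", "source": "MASTERDATA.SUPPLIERS"}
  ]
}
"""


def retarget_space(node, old_space, new_space):
    """Return a copy of the parsed JSON with old_space references repointed.

    Only string values are touched: either the bare space name, or values
    that start with "<old_space>." (an assumed qualified-name convention).
    """
    if isinstance(node, dict):
        return {k: retarget_space(v, old_space, new_space) for k, v in node.items()}
    if isinstance(node, list):
        return [retarget_space(v, old_space, new_space) for v in node]
    if isinstance(node, str):
        if node == old_space:
            return new_space
        prefix = old_space + "."
        if node.startswith(prefix):
            return new_space + "." + node[len(prefix):]
    return node


def retarget_export(text, old_space, new_space):
    """Parse an export, repoint its space references and serialise it again."""
    return json.dumps(retarget_space(json.loads(text), old_space, new_space), indent=2)
```

References to other Spaces (such as `MASTERDATA.SUPPLIERS` in the sample) are left alone, which is the part that is easiest to get wrong when editing the file by hand.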

 

4 – Error handling in task chains

The ability to continue after an error has occurred or add an action like email would be very handy and should be introduced.
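To illustrate the behaviour I have in mind, here is a small Python sketch of a task chain runner. It is not a Datasphere API; `run_chain`, `continue_on_error` and `notify` are hypothetical names, and the notifier is just a callable standing in for something like an email action.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class ChainResult:
    succeeded: List[str] = field(default_factory=list)
    failed: List[Tuple[str, str]] = field(default_factory=list)  # (task name, error text)


def run_chain(
    tasks: List[Tuple[str, Callable[[], None]]],
    continue_on_error: bool = False,
    notify: Optional[Callable[[str], None]] = None,
) -> ChainResult:
    """Run tasks in order, optionally carrying on after a failure."""
    result = ChainResult()
    for name, task in tasks:
        try:
            task()
        except Exception as exc:
            result.failed.append((name, str(exc)))
            if notify is not None:
                # Stand-in for an "on error" action such as sending an email.
                notify(f"Task {name} failed: {exc}")
            if not continue_on_error:
                break  # stop the chain at the first failure
        else:
            result.succeeded.append(name)
    return result
```

With `continue_on_error=True`, a failing load in the middle of the chain is recorded and reported, and the remaining steps still run. With the default, the chain stops at the first failure.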

 

5 – Some standard SAP ERP data types don’t appear to map to a useful data type in Datasphere, and there is no function for conversion

Long Text data types spring to mind here. Raw data fields such as STXB-CLUSTD don’t appear to map to a datatype or have an appropriate function for conversion.

SAP Datasphere: A Journey of Evolution and Promise

SAP Datasphere stands tall as a well-rounded, fully functional cloud data platform that has come a long way since its first versions as Data Warehouse Cloud. Its journey has been marked by continuous growth and adaptability. Notably, the benefits of regular updates to this SaaS product are crystal clear. Over the past two years, I’ve witnessed firsthand how improvements seamlessly propagate across the landscape without the need for individual upgrades, patches, or regression testing cycles. However, like any evolving technology, there remain areas that require refinement. These minor grievances, I believe, will find their solutions in the not-too-distant future.