It’s ETL Jim, but not as you know it. A brief journey into SAP’s cloud based Data Intelligence
Author – Mike Coe, Senior Analytics Consultant, NTT DATA Business Solutions UK&I
It does not seem that long ago to me that companies would install a database and then purchase an ETL tool to populate it with data from various systems. It all worked well and everyone was happy. Well maybe not happy but accepting that this was the best way to consume data and produce reports. Keep everything in-house, employ software developers and DBAs and “own it”.
Cloud computing has challenged that approach and providers now offer tenanted solutions that you can sign up to as easily as Netflix. You no longer own, you subscribe and you can pick and choose your services to suit.
The one big advantage of using tenanted solutions is that you can always be on the latest version with the latest functionality available. You will be at the cutting edge of technology. Great, but that sounds painful – why do I want to be there? Is everyone moving to these cloud based solutions for Fear of Missing Out? Or is there more to it?
Over the last year, I have been getting to grips with SAP Data Intelligence. This blog is my personal journey into why I now recommend the move from traditional ETL tools to SAP’s cloud based data management solutions.
Day One… and a few more days after that
I sat down for my Data Intelligence (DI) training. I am told that it’s still ETL but now they are called Pipelines. OK. Well actually, they are not pipelines, they are Graphs. They’re what now?
Don’t worry, inside these Graphs are just the same kinds of transformations you’re used to. They all join up in the same way, just make sure your input and output ports are the same… or at least compatible. Ports? I feel like I am being reassured and tormented at the same time.
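For anyone else puzzled by ports, here is a minimal sketch of the idea in plain Python. It is not the Data Intelligence API: the `Operator` and `Graph` classes, the port type names and the compatibility rule (identical types match, an `any` input accepts everything) are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Operator:
    """Hypothetical graph node: port name -> data type for inputs and outputs."""
    name: str
    in_ports: Dict[str, str] = field(default_factory=dict)
    out_ports: Dict[str, str] = field(default_factory=dict)


def ports_compatible(out_type: str, in_type: str) -> bool:
    """Assumed rule: identical types match, and an 'any' input accepts everything."""
    return out_type == in_type or in_type == "any"


@dataclass
class Graph:
    """Hypothetical pipeline: operators plus the connections between their ports."""
    operators: Dict[str, Operator] = field(default_factory=dict)
    connections: List[Tuple[str, str, str, str]] = field(default_factory=list)

    def add(self, op: Operator) -> None:
        self.operators[op.name] = op

    def connect(self, src: str, out_port: str, dst: str, in_port: str) -> None:
        # Look up the declared types and refuse the connection if they clash.
        out_type = self.operators[src].out_ports[out_port]
        in_type = self.operators[dst].in_ports[in_port]
        if not ports_compatible(out_type, in_type):
            raise TypeError(
                f"{src}.{out_port} ({out_type}) cannot feed {dst}.{in_port} ({in_type})"
            )
        self.connections.append((src, out_port, dst, in_port))


graph = Graph()
graph.add(Operator("reader", out_ports={"output": "message"}))
graph.add(Operator("transform", in_ports={"input": "message"}, out_ports={"output": "string"}))
graph.add(Operator("writer", in_ports={"input": "any"}))

graph.connect("reader", "output", "transform", "input")   # same type on both ports
graph.connect("transform", "output", "writer", "input")   # 'any' accepts a string
```

Trying to wire the transform's `string` output straight back into its own `message` input would raise a `TypeError` in this sketch, which is roughly the "same or at least compatible" rule from the training.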
I try to keep up with the new terminology and different ways of doing things whilst at the same time wondering “What are the benefits to clients in making this transition?”
Data Intelligence has a lot of functionality and there is a lot to learn, but I get there; I can now use these new tools. I was still missing something though – why do I need to use these new tools? The mistake I was making was trying to implement the same old ETL routines I have always done but using a new and unfamiliar interface.
Captain’s Log, some point during 2021
I do not remember exactly when but it starts to dawn on me – I need to think differently about how data is consumed and managed.
Historically, ETL was viewed as a link between systems but not a system in itself. Very much a second class citizen within an organisation’s library of technologies, ETL has never benefitted from the same level of focus as the systems it interfaces with.
A paradigm shift is required to consider data management as a system in itself that straddles the system landscape rather than a conduit between systems.
SAP DI is part of a new breed of ETL solutions that goes beyond the task of moving data from A to B, combining data integration with data governance and even machine learning.
Features of Data Intelligence
As with any cloud solution, you have the ability to flex and scale the SAP DI tenant as required. Having a tenanted solution that is always updated means that feature updates and software patches are handled for you in frequent releases. The list of connectors and adaptors will be updated regularly ensuring that new types of data sources can be consumed.
The available connectors and transformation abilities of an ETL tool are of primary importance. As more and more data sources move to the cloud, the configurable operators within DI become invaluable. Google Pub/Sub, Kafka, AWS SNS and MQTT are all supported now and every release so far has added more supported connections. Although other ETL tools such as SAP’s own Data Services can support messaging data sources, they do so through custom transforms that utilise Python and require a bigger development effort than configuring an operator within DI.
Of course, SAP DI is not limited to cloud based sources and targets; it can also connect to on-premise systems using a cloud connector or site-to-site VPN.
Not surprisingly, SAP DI has inbuilt operators for SAP Solutions and also supports all major databases. Depending on the operator, there are different configurations available. SAP DI also offers Python, R, Go and other open source operators.
When it comes to the routines you build with the tool, re-usability and resilience are key. Why does Data Intelligence bundle the ETL routine into a container? So it’s re-usable and resilient. The DI pipeline is now an application in its own right with control over its dependencies and it can be re-deployed much more easily.
I won’t digress into the importance of data governance here as that is well covered elsewhere but SAP DI has been designed with data governance in mind. DI’s Metadata Explorer allows you to profile and publish datasets and apply validation rules. DI has a built in monitor for the status of publishing, profiling, lineage, rule and data preparation tasks.
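To make "profile" and "validation rules" a little more concrete, here is a small sketch in plain Python. It does not use Data Intelligence at all: the `profile` and `failures` functions, the sample customer rows and the two rules are hypothetical, and simply illustrate the kind of checks a governance tool runs against a dataset.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[str]]
Rule = Callable[[Row], bool]


def profile(rows: List[Row], column: str) -> Dict[str, int]:
    """Basic column profile: row count, null count and distinct non-null values."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(set(present)),
    }


def failures(rows: List[Row], rules: Dict[str, Rule]) -> Dict[str, int]:
    """Count how many rows fail each named validation rule."""
    return {
        name: sum(1 for row in rows if not rule(row))
        for name, rule in rules.items()
    }


# Made-up sample data, for illustration only.
customers = [
    {"customer_id": "C001", "country": "GB"},
    {"customer_id": "C002", "country": None},
    {"customer_id": "C002", "country": "DE"},
]

# Hypothetical rules: every row needs a country and an ID starting with "C".
rules = {
    "country_present": lambda row: row.get("country") is not None,
    "id_has_prefix": lambda row: str(row.get("customer_id", "")).startswith("C"),
}

country_profile = profile(customers, "country")
rule_failures = failures(customers, rules)
```

In a governed landscape these counts would be tracked over time rather than computed once, which is where a built in monitor earns its keep.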
The Data Catalog can be used to build a business glossary with business rules and data ratings. As you can imagine, the more you put into this, the more you get out. This isn’t simply a repository of text descriptions for business terms; the relationships between terms and data assets can also be visualised.
Image source: SAP® Integration Solution Advisory Methodology (Published by SAP)
Moving to a cloud based data management solution
SAP are investing heavily in Data Intelligence and it will become an integral part of their data management solution. Data Intelligence is part of SAP’s Business Technology Platform portfolio and independent analyst firm Gartner Inc. has named SAP a leader in its Magic Quadrant for Data Integration Tools.
NTT DATA Business Solutions can help you understand why there is a need for a data management solution and why SAP’s cloud based data management solution becomes the obvious choice for your organisation as applications and data sources are migrated to the cloud.
Where are you on your data management journey?
NTT DATA Business Solutions has drawn on its considerable analytics expertise to create a free of charge ‘Analytics Pulse’ to measure how effectively your business uses data and analytics. We will assess your data culture, data usage and your data maturity in managing the modern day challenges around data. Our analytics experts will then analyse the findings and identify actions to help you improve the value and return on investment in data and analytics.
To take advantage of this free service, please register your interest here.
Meet our experts at Transformation NOW! 2022
Our annual conference is back live and in-person. Make sure you register for the chance to find out more about what is happening in the world of SAP in relation to data and analytics.