Exploring the SAP Data Lake – Architecture and Features

SAP launched the HANA Data Lake in April 2020 to strengthen its cloud-based ecosystem and give customers a cost-effective, highly optimized storage system. The release included a native storage extension and a relational SAP data lake.

Soon the data lake was considered to be in the same league as the leaders in the niche, namely Amazon S3 and Microsoft Azure, because of its powerful data processing capabilities.

The architecture of the SAP data lake resembles a three-tier pyramid, with storage cost and access speed both falling from top to bottom.

The top of the pyramid holds business-critical data that is used constantly. Storing data in this tier therefore costs the most.

The middle of the pyramid stores data that is not used regularly but is important enough not to be deleted. Access to this tier is slower than to the top tier, which matches its low access requirements, and its storage cost is significantly lower.

The bottom of the pyramid holds data that is hardly ever used and would have been deleted in traditional databases. In the SAP data lake, you can keep this data at very low cost in case you ever need it. The tradeoff is that access to this data is very slow.

The advantage is that businesses decide where each kind of data is stored and can optimize storage costs accordingly.
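
To make the pyramid concrete, here is a minimal Python sketch of how a business might sort its own datasets into the three tiers by how often they are read. It is only an illustration: the tier names, the access thresholds, and the relative cost figures are hypothetical assumptions, not SAP pricing or an SAP API.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    TOP = "top"        # constantly used, fastest, most expensive
    MIDDLE = "middle"  # rarely used but worth keeping, cheaper
    BOTTOM = "bottom"  # hardly ever used, slowest, cheapest


# Hypothetical relative cost per GB per month (NOT real SAP prices).
RELATIVE_COST_PER_GB = {Tier.TOP: 10.0, Tier.MIDDLE: 3.0, Tier.BOTTOM: 1.0}


@dataclass
class Dataset:
    name: str
    size_gb: float
    reads_per_month: int


def choose_tier(ds, top_threshold=100, middle_threshold=5):
    """Pick a tier from access frequency; thresholds are illustrative."""
    if ds.reads_per_month >= top_threshold:
        return Tier.TOP
    if ds.reads_per_month >= middle_threshold:
        return Tier.MIDDLE
    return Tier.BOTTOM


def monthly_cost(datasets):
    """Sum the relative storage cost of all datasets in their chosen tiers."""
    return sum(ds.size_gb * RELATIVE_COST_PER_GB[choose_tier(ds)] for ds in datasets)


if __name__ == "__main__":
    data = [
        Dataset("open_orders", 10, 500),
        Dataset("last_year_invoices", 10, 10),
        Dataset("old_sensor_logs", 10, 0),
    ]
    for ds in data:
        print(ds.name, "->", choose_tier(ds).value)
    print("relative monthly cost:", monthly_cost(data))
```

In this sketch, moving a dataset down a tier lowers its cost but, as described above, also slows access to it, so the thresholds are the knob a business would tune.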

The SAP data lake also offers several notable capabilities.

Because it runs in the cloud, users can quickly connect to storage and services from other leading cloud providers, such as Amazon S3 and Google Cloud Platform. The SAP data lake is also flexible and can scale up to petabytes of data if required.


