Exploring the SAP Data Lake – Architecture and Features
SAP launched the HANA Data Lake in April 2020 to strengthen its cloud-based ecosystem and give customers a cost-effective, highly optimized storage system. The offering included a native storage extension and a relational SAP data lake. Because of its powerful data processing capabilities, the data lake was soon considered to be in the same league as the leaders in the niche, namely Amazon S3 and Microsoft Azure.

The architecture of the SAP data lake is like a pyramid. The top tier stores data that is critical for the business and used constantly. The cost of storing this data is, therefore, the highest in the SAP data lake. The middle tier stores data that is not used regularly but is important enough not to be deleted. This tier offers lower performance than the top tier, its access requirements are low, and its storage cost is significantly lower. The bottom of the pyramid holds data that is hardly e