Although data lakes are still a relatively new phenomenon, they have recently gained acceptance among IT departments as data increasingly becomes the backbone of modern business. Lakes are seen as a way to reduce data sprawl and isolation. They grew out of data warehouses, which were meant to help IT departments build organized repositories of strategically important datasets for key business decisions. This data can be used to solve a wide variety of problems, from analytics and a better understanding of customer needs to real-time decision-making with artificial intelligence.
Data lakes represent the next stage in the evolution of data warehouses. Many warehouse projects have failed: they turned out to be too expensive, took too long and achieved only a few of their goals. Data is changing and growing so rapidly that the need to benefit from it immediately has become even more pressing. No business can afford to spend months or years analyzing and modelling its data. By the time the data in the warehouse becomes available for use, the needs of the business have already changed.
Data marts, like data warehouses, were created to hold data intended for specific purposes or with specific properties (for example, data from a marketing department). They gained popularity because the purpose of the data is clearer and results can be delivered faster. However, they split the data into separate stores, which makes data marts less useful for companies that have huge amounts of data and need many employees to use it for multiple purposes.
For this reason, data lakes were developed to speed up work with data and make it easier to use for needs that were not identified in advance. The advent of the cloud, which provides cheap computing power and virtually unlimited storage, made data lakes possible.
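As a rough illustration of what serving needs that were not identified in advance can look like, the following minimal Python sketch keeps records in their raw form and decides how to interpret them only when a question comes up. The sample records, field names and both helper functions are hypothetical and invented for this illustration. A real data lake would typically sit on cloud storage and use dedicated query tools rather than an in-memory list.

```python
import json

# Raw events kept as they arrived (hypothetical sample data).
# Note that the fields differ from record to record.
RAW_EVENTS = [
    '{"user": "a1", "action": "view", "product": "lamp"}',
    '{"user": "b2", "action": "buy", "product": "lamp", "price": 30.0}',
    '{"user": "a1", "action": "buy", "product": "desk", "price": 120.0}',
]


def read_events(lines):
    """Parse raw JSON lines at read time; no schema was enforced on write."""
    return [json.loads(line) for line in lines]


def revenue_by_product(events):
    """A question chosen after the data was stored: purchase value per product."""
    totals = {}
    for event in events:
        if event.get("action") == "buy":
            product = event["product"]
            totals[product] = totals.get(product, 0.0) + event.get("price", 0.0)
    return totals


def actions_by_user(events):
    """A second, unrelated question answered from the same raw records."""
    counts = {}
    for event in events:
        counts[event["user"]] = counts.get(event["user"], 0) + 1
    return counts


events = read_events(RAW_EVENTS)
print(revenue_by_product(events))
print(actions_by_user(events))
```

Both questions are answered from the same untouched records, so a new question only requires a new reader function, not a redesign of how the data is stored.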
The State University of Telecommunications offers students the opportunity to study the nuances of creating and maintaining a data lake using modern equipment provided by its partners.