Lake Effects – In a Data Lake

Jul 20, 2022 | Articles, Featured

With our Earth becoming increasingly hot, we see a steep worldwide decline in the availability of water, especially in the lakes that have been the lifeline of citizens. Bengaluru is no exception: a recent study by IISc has found that lakes in Bengaluru are drying up even during monsoons.

However, there is one lake that continues to grow bigger, not just in Bengaluru but the world over: the “Data Lake”.

A Data Lake is a storage repository that can store large amounts of structured, semi-structured and unstructured data. It is an architectural pattern. Just as a lake gets filled with water (clean, dirty or polluted) coming from different sources / tributaries, a Data Lake allows storage of data coming from different sources and in different formats.

Unlike traditional methods of data storage such as Data Warehousing, where data is stored in a hierarchical format (files / folders), a Data Lake has a fairly flat structure, and so does not impose constraints on the analyst who is trying to draw inferences from the data.
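
To make the flat structure concrete, here is a minimal in-memory sketch. It assumes an approach often described as schema-on-read: raw data is stored untouched under a flat key, and is interpreted only when an analyst reads it. The `DataLake` class, its method names, the keys and all sample values are hypothetical and for illustration only; they do not describe any particular product or the pilot discussed later.

```python
import csv
import io
import json
from dataclasses import dataclass, field


@dataclass
class DataLake:
    """Toy lake: a flat mapping from object key to raw bytes plus metadata."""
    objects: dict = field(default_factory=dict)

    def put(self, key: str, payload: bytes, fmt: str, source: str) -> None:
        # Store the payload untouched; only lightweight metadata is recorded.
        self.objects[key] = {"payload": payload, "format": fmt, "source": source}

    def read(self, key: str):
        # Interpret the bytes only when someone asks for them.
        obj = self.objects[key]
        if obj["format"] == "csv":
            text = obj["payload"].decode("utf-8")
            return list(csv.DictReader(io.StringIO(text)))
        if obj["format"] == "json":
            return json.loads(obj["payload"])
        return obj["payload"]  # unstructured data comes back as raw bytes


# Made-up sample objects from three hypothetical sources.
lake = DataLake()
lake.put("rainfall-sample", b"district,mm\nExampleDistrict,100\n", "csv", "dept-a")
lake.put("sensor-event-17", b'{"lake_id": 4, "level_m": 1.8}', "json", "dept-b")
lake.put("site-photo-3", b"\x89PNG...", "binary", "dept-c")
```

All three objects sit side by side under flat keys, whatever their format. The analyst decides how to interpret each one at read time, for example with `lake.read("rainfall-sample")`.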

“Real-world lakes” lead to a meteorological phenomenon called the “Lake Effect”, wherein warm, moist air from the lake surface rises and mixes with cold, dry air to produce precipitation / snowfall.

Similarly, Data Lakes also produce unintended consequences. Some of them are:

  • If not used effectively, Data Lakes could soon get flooded with data, thereby losing their relevance
  • Additionally, if enough oversight is not put in place, Private / Regulated data could end up in the Data Lake
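
One way to guard against both consequences is to review each dataset before it enters the lake. The sketch below is only an illustration under stated assumptions: the `RESTRICTED_FIELDS` list, the `review_dataset` function and the rule that every dataset needs a stated purpose are all hypothetical, and real oversight would rest on policy and regulation rather than a column-name check.

```python
from dataclasses import dataclass

# Hypothetical column names treated as private / regulated.
RESTRICTED_FIELDS = {"national_id", "phone", "email", "bank_account"}


@dataclass
class IngestDecision:
    accepted: bool
    reasons: list


def review_dataset(columns, purpose):
    """Decide whether a dataset may enter the lake, and explain why not."""
    reasons = []
    # Oversight check: flag columns whose names match the restricted list.
    flagged = sorted(c for c in columns if c.lower() in RESTRICTED_FIELDS)
    if flagged:
        reasons.append("restricted fields present: " + ", ".join(flagged))
    # Relevance check: data with no stated purpose only adds to the flood.
    if not purpose.strip():
        reasons.append("no stated purpose")
    return IngestDecision(accepted=not reasons, reasons=reasons)


ok = review_dataset(["district", "rainfall_mm"], "monsoon analysis")
bad = review_dataset(["name", "Phone", "email"], "")
```

Here `ok.accepted` is true, while `bad` is rejected with two reasons, one for each of the consequences listed above.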

IIITB is planning to set up a Data Lake pilot with funding from the Government of Karnataka (GoK). This Data Lake will be under the custody of GoK’s Planning Department and will hold data from different GoK departments. The pilot will be implemented by IIITB jointly with the Centre for Open Data Research (CODR) of the Public Affairs Centre (PAC), another non-profit organization.

For questions on:

  • Data Lake
  • Data Lake Pilot
  • Open Data Research Initiative of GoK

Please contact the “Office of Research, IIITB” at ora@iiitb.ac.in.