Businesses are increasingly looking to accelerate the data-driven insights they can use to differentiate products, improve customer experience, and grow the business. But sometimes this insatiable appetite creates a mad rush for more, and faster, data lakes – and that rush can cause more harm than good.
That’s the view of executives at Informatica, who point to one major downside: the ‘data swamp,’ full of murky, under-managed, and unreliable data. Questionable data drawn from a data swamp, if ingested or relied upon too heavily, can prove harmful to the business, they contend.
To cope with this risk, Informatica is releasing technology to better manage Hadoop-driven distributed data lakes. Informatica Data Lake Management “integrates, governs, and secures the data an organization needs to power its business with a data lake,” according to the company’s website. Informatica’s chief product officer, Amit Walia, explained the company’s approach comes as many customers are struggling to keep pristine and reliable data lakes from turning into ‘data swamps’ that contain “inconsistent, incomplete and stale data.”
The idea behind Informatica Data Lake Management is to deliver several key benefits.
With Data Lake Management, Informatica helps organizations quickly and repeatably turn big data into trusted information assets that deliver sustainable business value. The solution is also architected to let data analysts more easily find, prepare, govern, and protect data of any size, and to derive business value from Hadoop-based data lakes.
Walia added that the hand-coding and code-generation tools sometimes used to manage data lakes struggle to discover relationships between datasets.
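To make the "discovering relationships between datasets" problem concrete, here is a minimal illustrative sketch – not Informatica's actual method, and all names in it are hypothetical – of one naive heuristic a data catalog might use: measuring how much the distinct values in a pair of columns overlap, to flag candidate join keys between two tables.

```python
# Illustrative only: a naive value-overlap heuristic for guessing which
# columns in two datasets might be join keys. Real data lake management
# tools use far more sophisticated profiling, but the core idea is similar.

def column_overlap(col_a, col_b):
    """Return the fraction of distinct values in col_a also present in col_b."""
    set_a, set_b = set(col_a), set(col_b)
    if not set_a:
        return 0.0
    return len(set_a & set_b) / len(set_a)

def candidate_join_keys(table_a, table_b, threshold=0.8):
    """Return (col_a, col_b, overlap) tuples whose overlap meets the threshold.

    Each table is modeled as a dict mapping column name -> list of values.
    """
    matches = []
    for name_a, values_a in table_a.items():
        for name_b, values_b in table_b.items():
            overlap = column_overlap(values_a, values_b)
            if overlap >= threshold:
                matches.append((name_a, name_b, overlap))
    return matches

# Hypothetical sample datasets: an orders table and a customers table.
orders = {"customer_id": [1, 2, 2, 3], "amount": [10, 20, 15, 30]}
customers = {"id": [1, 2, 3, 4], "region": ["N", "S", "E", "W"]}

print(candidate_join_keys(orders, customers))
# → [('customer_id', 'id', 1.0)]
```

Hand-coded pipelines rarely carry this kind of cross-dataset profiling, which is the gap Walia's comment points at: relationships like `customer_id` ↔ `id` have to be discovered from the data itself, not from any one script.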