Having herself held senior roles in IT at Wall Street companies including Deutsche Bank and Morgan Stanley Smith Barney, Oksana Sokolovsky is quite familiar with the challenge of Data Management and data discovery. As co-founder and CEO of ROKITT, her goal was “to build a product that solves that challenge,” she says.
The challenge exists across large enterprises in multiple industries, but is often especially acute in those dealing with regulatory pressures and compliance requirements – healthcare, for instance, and of course, the financial sector. Basel Committee on Banking Supervision (BCBS) 239 compliance for effective risk data aggregation and reporting, for example, is a big driver of improved Data Management for global systemically important banks.
In fact, a McKinsey & Company and Institute of International Finance survey showed that more than half of the world’s biggest banks faced significant challenges meeting the January 1, 2016 deadline for compliance, with the Global Association of Risk Professionals commenting that “many institutions continue to struggle to fully implement the requirements across the business under the most demanding interpretation of those requirements.”
ROKITT’s Astra solution, Sokolovsky believes, can help banks adhere to both internal and external regulations and policies, like BCBS 239, across complex data landscapes. It can also support other use cases for a variety of enterprises, such as data asset governance that puts data to better use to enhance business value.
Astra debuted in March after a year of development. What sets the technology apart, according to Sokolovsky, is its ability “to let customers discover data and information about data and its relationships, in order to manage data better or meet regulations more efficiently, using our custom-built machine learning concepts and other advanced algorithms.” Astra’s algorithms automatically discover and self-learn data relationships with up to 90% accuracy, the company says.
It can recognize, for example, connections or dependencies between values within a database that may not be obvious – perhaps that a column in one table contains data that refers to a column in another table, and why that relationship exists. It will learn that two columns in different tables that both carry customer information, but are called “Customer” in one instance and “Company” in another, are the same. It will then establish the relationship that ‘customer’ is the primary key and ‘company’ is the secondary key, she says, and apply that knowledge from that point on across databases, XML documents, and flat files. Columns don’t even need to have reasonable names like these for the solution to figure out the relationship – if one were called ‘xyz’ and another ‘qwerty17,’ it could still figure out that both hold customer information.
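To make the idea of name-independent relationship discovery concrete, here is a minimal sketch in Python. It is not ROKITT’s algorithm, which the company has not published; it swaps in a simple value-containment check, and the function names, the 0.9 threshold, and the sample tables are all assumptions made for illustration. A column is treated as a candidate key when its values are unique, and another column is said to refer to it when nearly all of its distinct values appear in that key column.

```python
from itertools import permutations


def containment(child_values, parent_values):
    """Fraction of distinct child values that also appear in the parent column."""
    child, parent = set(child_values), set(parent_values)
    if not child:
        return 0.0
    return len(child & parent) / len(child)


def discover_relationships(tables, threshold=0.9):
    """Return (child, parent, score) triples for column pairs in different tables.

    `tables` maps table name -> {column name -> list of values}.
    A pair qualifies when the parent column's values are unique (key-like)
    and at least `threshold` of the child's distinct values occur in it.
    Column and table names are never inspected, only the values.
    The threshold of 0.9 is an illustrative assumption.
    """
    columns = [
        (table, column, values)
        for table, cols in tables.items()
        for column, values in cols.items()
    ]
    found = []
    for (ct, cc, cv), (pt, pc, pv) in permutations(columns, 2):
        if ct == pt:
            continue  # only compare columns across tables
        if len(set(pv)) != len(pv):
            continue  # parent must look like a key: no duplicate values
        score = containment(cv, pv)
        if score >= threshold:
            found.append((f"{ct}.{cc}", f"{pt}.{pc}", score))
    return found


# Hypothetical sample data echoing the article's example names.
tables = {
    "xyz": {
        "Customer": ["C001", "C002", "C003", "C004"],
        "Region": ["East", "West", "East", "North"],
    },
    "qwerty17": {
        "Company": ["C001", "C001", "C003", "C004", "C002"],
        "Amount": [250, 120, 75, 310, 90],
    },
}

relationships = discover_relationships(tables)
for child, parent, score in relationships:
    print(f"{child} -> {parent} (containment {score:.2f})")
```

On this sample data the sketch reports that qwerty17.Company refers to xyz.Customer, even though neither the table names nor the column names hint at it. A production system would need far more than this, including type inference, fuzzy matching, and sampling to cope with large tables.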
The system reads data in repositories and learns its true Metadata. In fact, applying its Machine Learning algorithms to the data itself, rather than just to Metadata, is critical: As companies and systems grow, Metadata only holds so much information about data, Sokolovsky says, limiting possibilities. Applying Machine Learning algorithms to the data itself expands the ability to discover data relationships beyond the 10 to 20 percent possible when such algorithms are applied to Metadata only. The data discovery process, she adds, is also fast because of Astra’s next-generation asynchronous processing architecture. “It’s measured in hours and minutes,” says Sokolovsky.