How big data is changing the game for backup and recovery

It's a well-known fact in the IT world: Change one part of the software stack, and there's a good chance you'll have to change another. For a shining example, look no further than Big Data.

First, Big Data shook up the database arena, ushering in a new class of "scale out" technologies. That's the model exemplified by products like Hadoop, MongoDB, and Cassandra, where data is distributed across multiple commodity servers rather than packed into one massive one. The beauty there, of course, is the flexibility: To accommodate more petabytes, you just add another inexpensive machine or two rather than "scaling up" and paying big bucks for a bigger mammoth.

That's all been great, but now there's a new sticking point: backup and recovery.

"Traditional backup products have challenges with very large amounts of data," said Dave Russell, a vice president with Gartner. "The scale-out nature of the architecture can also be difficult for traditional backup applications to handle."

Today's horizontally scalable databases do include some capabilities for availability and recovery, but typically they're not as robust as those IT users have become accustomed to, Russell added.

It's a problem that can leave large enterprises vulnerable when outages strike. But it's also where a new class of data-protection products is beginning to enter the picture.

Datos IO's RecoverX is one of those.

"If you have a traditional database like Oracle or MySQL, it's scale-up, and there's always the notion of a durable log," said Tarun Thakur, Datos IO's co-founder and CEO.

In such scenarios, a copy of that log is what constitutes a backup when problems arise.
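As a rough illustration of why a durable log can serve as a backup, the sketch below rebuilds a small key-value state by replaying a copied log in order. The log format, the `replay` function, and the `up_to` parameter are assumptions made for this example; they do not reflect how Oracle, MySQL, or any product named here actually stores or replays its log.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical log entry: (sequence_number, key, value).
# A value of None stands for a delete. This format is an assumption.
LogEntry = Tuple[int, str, Optional[str]]


def replay(log: List[LogEntry], up_to: Optional[int] = None) -> Dict[str, str]:
    """Rebuild state by applying log entries in sequence order.

    If up_to is given, entries with a higher sequence number are skipped,
    which recovers the state as of that point in the log.
    """
    state: Dict[str, str] = {}
    for seq, key, value in sorted(log, key=lambda entry: entry[0]):
        if up_to is not None and seq > up_to:
            break
        if value is None:
            state.pop(key, None)  # delete
        else:
            state[key] = value    # insert or update
    return state


# One ordered log on a single master; copying it is the "backup".
log: List[LogEntry] = [(1, "a", "1"), (2, "b", "2"), (3, "a", "3"), (4, "b", None)]
backup = list(log)

full_state = replay(backup)           # state after every logged operation
earlier_state = replay(backup, up_to=2)  # state as of sequence number 2
```

The sketch only works because there is one log with one agreed ordering of operations, which is the property the scale-up model provides.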

In the world of today's next-generation databases -- where data is distributed across small machines -- it's not quite so simple.

"There is no concept of a durable log because there is no master -- each node is working on its own stuff," Thakur explained. "Different nodes could get different writes, and every node has a different view of an operation."

That's in part because of a trade-off that's been required to accommodate what's commonly referred to as the "three V's" of big data -- volume, velocity, and variety. Specifically, to offer scalability while accommodating the crazy amounts of diverse data flying at us at ever-more-alarming speeds, today's distributed databases have departed from the "ACID" criteria (atomicity, consistency, isolation, and durability) generally promised by traditional relational databases. Instead, they've adopted what are known as "BASE" principles: basically available, soft state, and eventually consistent.

It's a critical distinction.
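To make the backup problem concrete, here is a minimal sketch of three nodes that have each seen different writes, reconciled with a simple last-write-wins rule. The node contents, the timestamps, and the `merge_views` function are all hypothetical, and last-write-wins is just one common reconciliation strategy chosen for illustration; it is not presented as how Cassandra, MongoDB, or RecoverX resolve conflicts.

```python
from typing import Dict, List, Tuple

# A node's view maps key -> (timestamp, value). This layout is an assumption.
View = Dict[str, Tuple[int, str]]


def merge_views(views: List[View]) -> View:
    """Reconcile per-node views with last-write-wins.

    For each key, keep the value carrying the highest timestamp.
    """
    merged: View = {}
    for view in views:
        for key, (ts, value) in view.items():
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    return merged


# Each node has received a different subset of the writes.
node_a: View = {"cart": (1, "book")}
node_b: View = {"cart": (2, "book,pen"), "user": (1, "ann")}
node_c: View = {"user": (1, "ann")}

merged = merge_views([node_a, node_b, node_c])

# Copying any single node would miss data that the merged view contains.
stale_nodes = [name for name, view in
               [("a", node_a), ("b", node_b), ("c", node_c)]
               if view != merged]
```

In this toy case node_b happens to match the merged view while the other two are stale, so a backup taken from an arbitrary node may not capture a consistent picture of the data.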

 


