Transient Clusters in the Cloud for Big Data

Cheaper, faster. Faster, cheaper. When it comes to getting value from Big Data, paying less and processing data faster to reduce time-to-insight are always top-of-mind goals. To achieve them, many enterprises are turning to the cloud to augment or replace their on-premises Hadoop infrastructure.

One key reason for the shift is that Hadoop in the cloud decouples storage from compute, so enterprises can pay for storage at a lower rate than for compute and pay for compute only while jobs are running. The cloud also provides virtually unlimited scalability that on-premises architecture can’t match. With cloud services like Amazon EMR or Microsoft Azure HDInsight, enterprises can spin up and scale Hadoop clusters on demand. Have a job that isn’t processing fast enough? Add more nodes, then scale back down when it’s done. Have several jobs of various sizes? Run multiple clusters, each exactly the size needed, so that no resources are wasted. Add transient clusters to the mix, and the cloud becomes an extremely customizable Big Data solution.
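To make the "no resources are wasted" point concrete, here is a rough back-of-the-envelope sketch in Python. The hourly rate, the job sizes, and both function names are hypothetical values chosen for illustration, not real provider pricing; storage cost is left out because it is billed separately in a decoupled setup.

```python
# Hypothetical cost comparison: one always-on cluster sized for the peak job
# versus separate clusters sized per job and billed only while they run.

def persistent_cost(nodes, rate_per_node_hour, hours_in_period):
    """Cost of keeping a fixed-size cluster running for the whole period."""
    return nodes * rate_per_node_hour * hours_in_period

def on_demand_cost(jobs, rate_per_node_hour):
    """Cost of running each job on its own right-sized cluster.

    `jobs` is a list of (nodes, hours) pairs.
    """
    return sum(nodes * hours * rate_per_node_hour for nodes, hours in jobs)

RATE = 0.5  # assumed price per node-hour, for illustration only

# Three daily jobs of different sizes: (nodes needed, hours to finish).
jobs = [(20, 2.0), (5, 1.0), (10, 4.0)]

# An always-on cluster must be sized for the largest job and runs 24 hours.
peak_nodes = max(nodes for nodes, _ in jobs)
always_on = persistent_cost(peak_nodes, RATE, 24)

# Right-sized clusters consume 40 + 5 + 40 = 85 node-hours in total.
right_sized = on_demand_cost(jobs, RATE)

print(f"always-on:   {always_on:.2f}")    # 240.00
print(f"right-sized: {right_sized:.2f}")  # 42.50
```

Under these assumed numbers the always-on cluster pays for 480 node-hours while the jobs use only 85; the real gap depends entirely on actual pricing and utilization.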

Transient clusters are compute clusters that automatically shut down and stop billing when processing is finished. In the past, however, this cost-effective approach has been difficult to use, because the cloud provider automatically deletes a transient cluster’s metadata when the cluster shuts down.
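As an illustration of what "automatically shut down" can look like in practice, the following Python sketch builds a request for a transient Amazon EMR cluster. The helper name, cluster name, script location, instance types, and release label are all assumptions made for this example; the dictionary keys follow the parameters of boto3's `run_job_flow` call, where setting `KeepJobFlowAliveWhenNoSteps` to `False` asks EMR to terminate the cluster once its steps have finished.

```python
# A minimal sketch of a transient EMR cluster request (illustrative values).
# The function only builds the request; nothing is sent to AWS here.

def transient_cluster_request(name, script_uri, worker_count):
    """Build keyword arguments for boto3's EMR `run_job_flow` call."""
    return {
        "Name": name,
        "ReleaseLabel": "emr-6.15.0",  # assumed release, adjust as needed
        "Applications": [{"Name": "Hadoop"}, {"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {
                    "Name": "Primary",
                    "InstanceRole": "MASTER",
                    "InstanceType": "m5.xlarge",  # assumed instance type
                    "InstanceCount": 1,
                },
                {
                    "Name": "Workers",
                    "InstanceRole": "CORE",
                    "InstanceType": "m5.xlarge",  # assumed instance type
                    "InstanceCount": worker_count,
                },
            ],
            # This is what makes the cluster transient: it terminates
            # (and stops billing) when the last step completes.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [
            {
                "Name": "process-data",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": ["spark-submit", script_uri],
                },
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",  # default role names, assumed to exist
        "ServiceRole": "EMR_DefaultRole",
    }

request = transient_cluster_request(
    "nightly-transient", "s3://example-bucket/jobs/process.py", worker_count=4
)
# To launch it (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("emr").run_job_flow(**request)
```

Because everything on the cluster disappears at termination, input data, output data, and any metadata that must outlive the job would need to be kept in storage outside the cluster.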

 


