Industry Watch: Microservices and scaling out Big Data

This month’s column is a transcript of a fascinating conversation I had with two executives at Hadoop solution provider MapR, Jim Scott, director of enterprise strategy and architecture, and Jack Norris, senior VP of data and applications, on the subject of microservices and scaling Big Data.

SD Times: So, we know that scaling data can be a hassle. What is the impact of microservices on this issue?

Jack Norris: There are some complementary technologies that really are game-changing in terms of how to take advantage of [microservices]. The underlying data layer is an incredible enabler of microservices. If you’re doing microservices that are ephemeral and don’t require a lot of stateful data, then I think it’s pretty well understood and people can be quite successful with it. But the data issues drive a lot of complexity for the developers and for the administrators, and that’s an area that Jim has championed for quite a while, and his experience as an architect and a developer allowed him to grasp this and see it early on.

Jim Scott: There are two different ways to look at it when you look at the more ephemeral services. If you were to take just kind of a general front-end service that’s handling the primary load of a consumer-facing application, it’s probably not going to be doing a lot of work. It’s probably going to be handing off the workload to other services that are sitting behind it. Those services sitting behind it are the ones that are more likely to fall into this model. So, if you were to imagine companies building websites like Amazon, where it consists of 100-plus different service calls to a bunch of different back-end services, there’s the need to compile all the different information to bring back and build a user experience.

When you start looking at those services, being able to have a linearly scalable back-end data flow is pretty important. As you scale out your services, which are going to be doing some of the work, they need to figure out who the user is and what information is relevant to them, and then they need to give that information back to the front end so it can render the page for the user. The compilation of those different data sets is pretty important. Being able to scale out that tier that is intelligent, where it’s clearly doing some level of computational work, is one thing. But in the same vein, without the data that it depends on, it can’t really do anything.
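As a rough illustration of the fan-out pattern described above, the following sketch has a front-end function call several back-end services concurrently and compile the answers into one page model. The service names, the `call_backend` stub and the `build_page` helper are hypothetical and stand in for real network calls; nothing here reflects how any particular site is built.

```python
import asyncio


async def call_backend(name: str, user_id: str) -> dict:
    """Hypothetical back-end call; a real service would do network I/O here."""
    await asyncio.sleep(0)  # yield control, as a network request would
    return {"service": name, "user": user_id}


async def build_page(user_id: str, services: list[str]) -> dict:
    """Fan out to every back-end service at once, then compile the results."""
    results = await asyncio.gather(
        *(call_backend(name, user_id) for name in services)
    )
    # Key each response by the service that produced it.
    return {result["service"]: result for result in results}


if __name__ == "__main__":
    page = asyncio.run(build_page("u42", ["profile", "recommendations", "cart"]))
    print(sorted(page))
```

Because the calls run concurrently, the front end waits roughly as long as the slowest back-end service rather than the sum of all of them, which is why the data tier behind those services has to keep up as they scale out.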

So, as you scale that service up, you will see how much work each instance of that microservice can perform. You know your scaling factors, and then you know based off of how many different services you have what your workloads are on your back-end data platform, and so when you exceed the total capabilities, you just add another server to that cluster. The same goes for whether it’s a streaming capability, a database capability or a file system capability.
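The capacity arithmetic described here can be sketched in a few lines. This is a minimal example under stated assumptions: the function names and every number below are invented for illustration, and it assumes the linear scaling the speaker describes, where each added instance or server contributes the same capacity.

```python
import math


def service_instances(peak_rps: float, rps_per_instance: float) -> int:
    """Instances needed at peak, given the measured rate one instance sustains."""
    return math.ceil(peak_rps / rps_per_instance)


def backend_nodes(peak_rps: float, backend_ops_per_request: float,
                  ops_per_node: float) -> int:
    """Data-platform servers needed for the load those services generate."""
    total_ops = peak_rps * backend_ops_per_request
    return math.ceil(total_ops / ops_per_node)


if __name__ == "__main__":
    # Illustrative numbers only: 12,000 requests/s at peak, 500 requests/s
    # per instance, 3 back-end operations per request, 10,000 ops/s per node.
    print(service_instances(12_000, 500))    # 24
    print(backend_nodes(12_000, 3, 10_000))  # 4
```

When the computed node count exceeds the current cluster size, the difference is the number of servers to add.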

When you start deploying those microservices, imagine for just a moment that you are the software engineer: you need to have visibility into your services. And that is to say, how fast are they performing? Are there bottlenecks? Are there certain types of requests that are coming in that are causing errors? So, when you look at performance and application monitoring, you must be able to emit data from these instances of microservices so that you can troubleshoot. In the old troubleshooting model, we typically did that by doing complete isolation for different servers, and then each server had its own logs, and you could just trace it that way. The trouble is, that doesn’t scale very well from a cost perspective.
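One way an instance might emit that kind of data is to write a small structured event per request, recording latency, request type and any error. The sketch below is an assumption-laden illustration: the `traced` helper and its field names are hypothetical, and the sink is a plain text stream standing in for whatever shared stream or log pipeline a real deployment would use.

```python
import json
import sys
import time
from contextlib import contextmanager


@contextmanager
def traced(service: str, instance_id: str, request_type: str, sink=sys.stdout):
    """Time one request and write a JSON event describing it to `sink`."""
    start = time.perf_counter()
    event = {
        "service": service,
        "instance": instance_id,
        "request_type": request_type,
        "status": "ok",
    }
    try:
        yield event
    except Exception as exc:
        # Record which kind of error occurred, then let it propagate.
        event["status"] = "error"
        event["error"] = type(exc).__name__
        raise
    finally:
        event["latency_ms"] = round((time.perf_counter() - start) * 1000, 3)
        sink.write(json.dumps(event) + "\n")


if __name__ == "__main__":
    with traced("recommendations", "instance-1", "lookup"):
        time.sleep(0.01)  # stand-in for real request handling
```

Because every instance writes the same event shape to a common sink, questions such as which request types cause errors can be answered across all instances without tracing per-server logs.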

The great thing is, if you imagine how it was done last year, or five years ago, or 10 years ago, however far back you want to go, they were the equivalent of multipurpose applications, monolithic if you will, and those applications had a long life cycle to be able to get updates into them. And the scaling factors for them were all or nothing.
