The Hardest Part About Microservices: Your Data

Continuing my series on microservices implementations (see "Why Microservices Should Be Event Driven," "Three things to make your microservices more resilient," and "Carving the Java EE Monolith: Prefer Verticals, not Layers" for background), we're going to explore what is probably the hardest problem when creating and developing microservices: your data. Using Spring Boot, Dropwizard, or Docker doesn't mean you're doing microservices. Taking a hard look at your domain and your data will help you get to microservices.

Follow along for the rest of the series (Twitter: @christianposta, RSS/blog: blog.christianposta.com).

Of the reasons we attempt a microservices architecture, chief among them is allowing teams to work on different parts of the system at different speeds with minimal impact across teams. So we want teams to be autonomous, capable of making decisions about how best to implement and operate their services, and free to make changes as quickly as the business may desire. If our teams are organized this way, our systems architecture will begin to reflect that organization and evolve into something that looks like microservices.

To gain this autonomy, we need to "shed our dependencies," but that's a lot easier said than done. I've seen folks refer to this idea in part, trivially, as "each microservice should own and control its own database and no two services should share a database." The idea is sound: don't share a single database across services, because then you run into problems like competing read/write patterns, data-model conflicts, and coordination challenges. But a single database does afford us a lot of safeties and conveniences: ACID transactions, a single place to look, well-understood behavior (kinda?), and one place to manage. So when building microservices, how do we reconcile these safeties with splitting up our database into multiple smaller databases?

Let's see. First, for an "enterprise" building microservices, we need to make the following things clear:

This seems to be ignored at a lot of places, but it is a huge difference between how Internet companies practice microservices and how a traditional enterprise may implement them (or fail to, by neglecting it).

Before we can build a microservice and reason about the data it uses (produces, consumes, etc.), we need a reasonably good, crisp understanding of what that data represents. For example, before we can store information in a database about "bookings" for our TicketMonster and its migration to microservices, we need to understand "what is a booking." Likewise, in your domain, you may need to understand what an Account is, or an Employee, or a Claim.

To do that, we need to dig into what "it" is in reality. For example, "What is a book?" Stop and think about that, as it's a fairly simple example. Try to think what a book is. How would we express this in a data model?

Is a book something with pages? Is a newspaper a book? It has pages. So maybe a book has a hard cover? Or is it something that's not released/published every day? If I write a book (which I did: Microservices for Java Developers), the publisher may have an entry for me with a single row representing my book. But a bookstore may have 5 of my books. Is each one a book? Or are they copies? How would we represent this? What if a book is so long it has to be broken down into volumes? Is each volume a book? Or are all of them combined a book? What if many small compositions are combined together? Is the combination the book? Or each individual one? So basically I can publish a book and have many copies of it in a bookstore, each one with multiple volumes. So what is a book then?

The reality is there is no reality. There is no objective definition of "what is a book" with respect to reality, so to answer any question like that, we have to know who's asking the question and what the context is. Context is king.
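To make the book example concrete, here is a minimal sketch of how two contexts might each model "a book" differently. Everything in it is an assumption made for illustration: the class names `PublishedTitle` and `ShelfCopy`, the fields, the placeholder ISBN, and the helper `copies_in_stock` are hypothetical and not taken from any real system.

```python
from dataclasses import dataclass
from typing import Iterable


# Publisher context (hypothetical model): a "book" is one work,
# represented by a single entry no matter how many copies are printed.
@dataclass(frozen=True)
class PublishedTitle:
    isbn: str
    title: str


# Bookstore context (hypothetical model): a "book" is a physical copy
# on a shelf. It refers to the publisher's title only by ISBN.
@dataclass(frozen=True)
class ShelfCopy:
    copy_id: str
    isbn: str
    volume: int


def copies_in_stock(copies: Iterable[ShelfCopy], isbn: str) -> int:
    """Answer the bookstore's question: how many copies of this title do we hold?"""
    return sum(1 for copy in copies if copy.isbn == isbn)


# One row for the publisher, five "books" for the bookstore.
title = PublishedTitle(
    isbn="placeholder-isbn",  # made-up identifier for illustration
    title="Microservices for Java Developers",
)
stock = [
    ShelfCopy(copy_id=f"copy-{n}", isbn=title.isbn, volume=1)
    for n in range(5)
]

print(copies_in_stock(stock, title.isbn))
```

Neither model is "the" book. Each one answers the question its own context asks, which is why the two can live in separate services without sharing a table.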
