Continuing my series on microservices implementations (see "Why Microservices Should Be Event Driven," "Three things to make your microservices more resilient," and "Carving the Java EE Monolith: Prefer Verticals, not Layers" for background), we're going to explore what is probably the hardest problem in creating and developing microservices: your data. Using Spring Boot/Dropwizard/Docker doesn't mean you're doing microservices. Taking a hard look at your domain and your data will help you get to microservices.
Follow along for the rest of the series (Twitter: @christianposta, RSS/blog: blog.christianposta.com).
Chief among the reasons we attempt a microservices architecture is letting teams work on different parts of the system at different speeds with minimal impact across teams. So we want teams to be autonomous, capable of making decisions about how best to implement and operate their services, and free to make changes as quickly as the business may desire. If we organize our teams this way, our systems architecture will begin to reflect that organization and evolve into something that looks like microservices.
To gain this autonomy, we need to "shed our dependencies," but that's a lot easier said than done. I've seen folks refer to this idea in part, trivially, as "each microservice should own and control its own database and no two services should share a database." The idea is sound: don't share a single database across services, because then you run into conflicts like competing read/write patterns, data-model conflicts, coordination challenges, etc. But a single database does afford us a lot of safeties and conveniences: ACID transactions, a single place to look, well-understood (kinda?), one place to manage, etc. So when building microservices, how do we reconcile these safeties with splitting up our database into multiple smaller databases?
Let's see. First, for an "enterprise" building microservices, we need to make the following things clear:
This seems to be ignored at a lot of places, but it is a huge difference between how Internet companies practice microservices and how a traditional enterprise may implement microservices (or may fail to, because it neglects this).
Before we can build a microservice and reason about the data it uses (produces/consumes, etc.), we need a reasonably good, crisp understanding of what that data represents. For example, before we can store information in a database about "bookings" for our TicketMonster and its migration to microservices, we need to understand "what is a booking." Just like in your domain, you may need to understand what is an Account, or an Employee, or a Claim, etc.
To do that, we need to dig into what "it" is in reality. For example, "What is a book?" Stop and think about that, as it's a fairly simple example. Try to think what a book is. How would we express this in a data model?
Is a book something with pages? Is a newspaper a book? It has pages. So maybe a book has a hard cover? Or is not something that's released/published every day? If I write a book (which I did: Microservices for Java Developers), the publisher may have an entry for me with a single row representing my book. But a bookstore may have 5 of my books. Is each one a book? Or are they copies? How would we represent this? What if a book is so long it has to be broken down into volumes? Is each volume a book? Or are all of them combined a book? What if many small compositions are combined together? Is the combination the book? Or each individual one? So basically I can publish a book, have many copies of it in a bookstore, each one with multiple volumes. So what is a book then?
The reality is there is no reality. There is no objective definition of "what is a book" with respect to reality, so to answer any question like that, we have to know who's asking the question and in what context. Context is king.
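To make the "who's asking" point concrete, here is a minimal Java sketch of the publisher and bookstore views of a "book" described above. Every name in it (`BookContexts`, `PublishedTitle`, `ShelfCopy`, `copiesInStock`, and the placeholder identifier `example-id-1`) is an assumption made up for illustration, not a model taken from TicketMonster or any real system. It assumes Java 16 or later for records.

```java
import java.util.List;

public class BookContexts {

    // Publisher's context (hypothetical model): a "book" is a work.
    // One entry per title, no matter how many copies get printed.
    record PublishedTitle(String titleId, String title) {}

    // Bookstore's context (hypothetical model): a "book" is a physical
    // copy on a shelf. Many copies can refer to the same title.
    record ShelfCopy(String copyId, String titleId) {}

    // Answers the bookstore's question: how many copies of this title do we hold?
    static long copiesInStock(List<ShelfCopy> shelf, String titleId) {
        return shelf.stream()
                .filter(copy -> copy.titleId().equals(titleId))
                .count();
    }

    public static void main(String[] args) {
        // "example-id-1" is a placeholder identifier, not a real ISBN.
        PublishedTitle title =
                new PublishedTitle("example-id-1", "Microservices for Java Developers");

        List<ShelfCopy> shelf = List.of(
                new ShelfCopy("copy-1", title.titleId()),
                new ShelfCopy("copy-2", title.titleId()),
                new ShelfCopy("copy-3", title.titleId()),
                new ShelfCopy("copy-4", title.titleId()),
                new ShelfCopy("copy-5", title.titleId()));

        // Same word, two different answers depending on who is asking.
        System.out.println("Publisher sees 1 book: " + title.title());
        System.out.println("Bookstore sees " + copiesInStock(shelf, title.titleId()) + " books");
    }
}
```

Neither type is "the" book. Each is only a correct model inside its own context, which is why the two are kept as separate types here rather than forced into one shared `Book` class.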